Client Background

Client: A Leading Tech Firm in the USA

Industry Type: IT & Consulting

Services: Software, Business Solutions, Consulting

Organization Size: 200+

Project Objective

Migrate the existing databases from Postgres to Elasticsearch, since Elasticsearch performs better in search operations. In addition, all of the backend JavaScript had to be changed to query the new Elasticsearch database.

Project Description

The client’s website was a visualization tool with a GUI for adding filters. To build the visualizations, at least 50,000 records had to be pulled from the Postgres database, whose size was around 200 MB. This took nearly 20-30 seconds, and adding filters took additional time. Our task was to move the entire database from Postgres to Elasticsearch, which is much faster at searching and filtering data. Because the database changed, we also had to write new backend code to query Elasticsearch.

Our Solution

  1. Set up the ELK stack (Elasticsearch, Logstash, Kibana) on an AWS EC2 instance.
  2. Write a pipeline file (.conf file) to ingest data from Postgres into Elasticsearch. The data types of columns, unique constraints, datetime formats, etc. are all defined in this file, which is executed by Logstash.
  3. Once the data is inserted, query it in Kibana’s built-in query console to verify that it is correct.
  4. Identify the backend code that needs to be changed.
  5. Replace this code with new code that queries Elasticsearch. We used the elastic_query_builder module for this.
  6. Test and compare Postgres and Elasticsearch performance.
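As an illustration of step 5, the sketch below shows how GUI filters could be translated into an Elasticsearch Query DSL request body. It is a minimal sketch and not the project's backend code or the API of the elastic_query_builder module: the function name build_search_body and the field names (region, year, status) are hypothetical. Note that a single search returns at most 10,000 hits by default (the index.max_result_window setting), so pulling 50,000 records would also need pagination such as search_after or scroll.

```python
import json
from typing import Any


def build_search_body(filters: dict[str, Any], size: int = 10000) -> dict[str, Any]:
    """Translate simple GUI filters into an Elasticsearch Query DSL body.

    Hypothetical helper: scalars become `term` clauses, lists become
    `terms` clauses, and (low, high) tuples become `range` clauses.
    All clauses go into `bool.filter`, which matches without scoring.
    """
    clauses: list[dict[str, Any]] = []
    for field, value in filters.items():
        if isinstance(value, tuple):
            low, high = value
            clauses.append({"range": {field: {"gte": low, "lte": high}}})
        elif isinstance(value, list):
            clauses.append({"terms": {field: value}})
        else:
            clauses.append({"term": {field: value}})
    # With no filters selected, fall back to matching every document.
    query = {"bool": {"filter": clauses}} if clauses else {"match_all": {}}
    return {"size": size, "query": query}


if __name__ == "__main__":
    # Field names and values here are made up for illustration.
    body = build_search_body({"region": "west", "year": (2015, 2020)})
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized to JSON and sent as the body of a search request, or pasted into Kibana's console to try it out.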

Project Deliverables

  1. ELK stack (Elasticsearch, Logstash, Kibana) set up on an AWS EC2 instance
  2. Pipeline, i.e., the Logstash .conf file
  3. New working backend code for Elasticsearch
  4. Commands to check the Elasticsearch data
  5. Customizable Logstash pipeline
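One simple way to check the migrated data is to compare the Postgres row count with the document count reported by Elasticsearch's _count API, whose JSON response carries a "count" field. The sketch below assumes both numbers have already been fetched; the helper name check_migration and the sample figures are hypothetical and are not the project's actual check commands.

```python
def check_migration(postgres_row_count: int, es_count_response: dict) -> dict:
    """Compare a Postgres row count with an Elasticsearch _count response.

    Hypothetical helper: `es_count_response` is the parsed JSON returned
    by `GET /<index>/_count`, which includes a "count" field.
    """
    es_count = int(es_count_response["count"])
    return {
        "postgres_rows": postgres_row_count,
        "elasticsearch_docs": es_count,
        "missing": postgres_row_count - es_count,  # rows not yet indexed
        "ok": postgres_row_count == es_count,
    }


if __name__ == "__main__":
    # Made-up numbers: ten rows have not made it into the index.
    sample_response = {"count": 49990, "_shards": {"total": 1, "successful": 1}}
    print(check_migration(50000, sample_response))
```

A count match does not prove that field values were converted correctly, so spot checks of individual documents are still worthwhile.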

Tools used

Elasticsearch

Postman

Kibana

Logstash

Python

JavaScript

Amazon Web Services

Postgres

Docker

Git Bucket

GitHub

Language/techniques used

JavaScript

JSON

Elasticsearch Query DSL (Domain-Specific Language)

Bash

Skills used

Elasticsearch query knowledge

Postgres query knowledge

Networking

JavaScript

Backend web stack

Databases used

Postgres

Elasticsearch

Web Cloud Servers used

Amazon Web Services (AWS)

Technical Challenges Faced during Project Execution

  1. For large responses from Elasticsearch (above 500 MB in size), the response time was sometimes above 30 seconds.

How the Technical Challenges were Solved

To solve this problem, we enabled gzip compression through the HTTP request headers, so that Elasticsearch returned compressed responses. This significantly reduced execution times.
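The effect of gzip on a large JSON response can be sketched with the Python standard library alone. The example assumes the client asks for compression with the standard Accept-Encoding: gzip header and that HTTP compression is enabled on the Elasticsearch side. The helper decode_body and the sample payload are hypothetical, and the server is simulated with gzip.compress so the sketch runs without a cluster.

```python
import gzip
import json
from typing import Optional

# Header a client sends to request a gzip-compressed response.
GZIP_HEADERS = {"Accept-Encoding": "gzip"}


def decode_body(raw: bytes, content_encoding: Optional[str]) -> dict:
    """Decompress a response body if the server gzip-encoded it, then parse JSON."""
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)
    return json.loads(raw)


if __name__ == "__main__":
    # Made-up payload standing in for a large search response.
    payload = {"hits": [{"id": i, "region": "west", "value": i % 7} for i in range(5000)]}
    plain = json.dumps(payload).encode("utf-8")
    compressed = gzip.compress(plain)  # simulates what the server would send

    print(f"uncompressed: {len(plain)} bytes, gzip: {len(compressed)} bytes")
    assert decode_body(compressed, "gzip") == payload
```

Search results are usually repetitive JSON, which compresses well, so less data crosses the network at the cost of some CPU time on both ends. Many HTTP clients decompress gzip responses automatically, in which case only the request header needs to be set.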

Business Impact

The earlier Postgres infrastructure took around 20-30 seconds to perform filter and search operations. With Elasticsearch, these operations now consistently take less than 10 seconds, which contributes to a better user experience.
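A comparison like this could be measured with a small timing helper such as the one below. This is a sketch under assumptions, not the project's benchmark: median_seconds is a hypothetical name, and the two callables stand in for functions that would run the same filter against Postgres and Elasticsearch.

```python
import statistics
import time
from typing import Callable


def median_seconds(run_query: Callable[[], object], repeats: int = 5) -> float:
    """Run a query function several times and return the median wall-clock time."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # Placeholder callables; real ones would issue the same filter
    # against each database and fetch the full result set.
    def query_postgres() -> None:
        time.sleep(0.02)

    def query_elasticsearch() -> None:
        time.sleep(0.01)

    print("postgres median (s):", median_seconds(query_postgres))
    print("elasticsearch median (s):", median_seconds(query_elasticsearch))
```

The median is used instead of the mean so that a single slow run, for example a cold cache on the first request, does not dominate the result.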

Project Snapshots