Client Background

Client: A leading tech firm in the USA

Industry Type:  IT

Services: SaaS, Products

Organization Size: 100+

Project Description

Businesses now have more access to data than ever before in today’s digital economy. This information is utilised to make key business choices. Businesses should invest in data management systems that increase visibility, dependability, security, and scalability to ensure that workers have the required data for decision-making. The client wanted to get the data management process automated using a tool from Python. Multiple operations like merging,sorting, filtering had to be performed on data from various resources. The data resources were mainly csv files and data from SQL queries in PostgreSQL. 

Our Solution

The project solution contained two tools that would aid in automatic efficient data storage. The first tool will concatenate all of the CSV files before merging them with the data from the SQL file. The acquired Excel file will be used as input for the second tool. The second tool will sort, filter, and lookup the Excel file received in the first tool. This tool will add columns that will be useful for the client’s analysis. The major goal is to assist the client with data management while requiring as little manual labour as possible. The files obtain the needed data in an Excel file by giving the proper input files.

Project Deliverables

The project deliverables can be divided into two parts:

  1.  Excel Tool1: ExcelTool1 generates an Excel file that contains two sheets RSLTS IN and RSLTS OUT. The RSLTS IN is obtained by concatenating all the csv files in the Output folder. The RSLTS OUT is the result of merging the data from vwr egeas.sql query and RSLTS IN. 
  1. Excel Tool2: Excel Tool2 creates another Excel file with one sheet RSLTS and csv files like vwr_instructions_new table, vwr proto and INST_RTR. This tool performs excel operations like lookups, arithmetic calculations and merging of data from multiple sources.

Tools used

  • For the whole data management and automation, we have made our own tool by python scripts.
  • PostgreSQL was used to merge the csv files provided by the client with the python scripts. 
  • The automation tool will store data in the excel sheets.

Language/techniques used

  • PyCharm for compiling and running the code.
  • The scripts for the automation tool were written in the Python programming language.
  • OS, glob, pandas, numpy and psycopg2 were thePython libraries used in the project.

Skills used

  • Configuration and Data moving using PostgreSQL.
  • Automation of tools
  • Exception Handling from Python

Databases used

Two types of databases were used: Google excel sheets and PostgreSQL.

What are the technical Challenges Faced during Project Execution

Some minor challenges were faced such as data discrepancies generated during the automation process.  

How the Technical Challenges were Solved

The challenges were solved by reworking on the automation tool and consulting with the clients for their requirements. 

Business Impact

It is critical to use appropriate data management procedures to ensure the smooth running of a firm. Furthermore, data management must be very precise, cost-effective, and completed as soon as possible. The inability to handle data can result in costly consequences and a permanent stain on the company’s image. Every company is responsible for developing a robust data management plan. The following are some of the reasons why data management is critical to the success of the firm. Instant Availability of Information: Data management makes information easily available for quick access based on company needs. Data management is also essential for accounting procedures like auditing and other strategy-based operations like company planning. The more time you spend hunting for misplaced files and missing documents, the less productive you will be. And you are aware that time is money. Keeping all of your documents structured might therefore assist to make procedures run more smoothly and quickly. Compliance: The government passed legislation requiring businesses to maintain these data. There are also periodical checks to verify that there is no manipulation. Furthermore, if a corporation is involved in a dispute, they must maintain these records for years until a solid verdict on the matter is reached. Faster Transitions to New Technology: Because technology trends change so quickly, organizations must embrace whatever comes their way. Losing information due to obsolete or outdated systems is the last thing you want for your company. Every piece of data preserved in the firm records is essential for everyday operations, managing multiple divisions, completing computations, audits, and so on. Make Right Business Decisions: Businesses use a variety of information sources for company planning, trend research, and performance management. To execute the same activity, different departments’ teams employ different sources of information. Because the legitimacy and precision of information are highly dependent on the source, analyzing several sources may have a detrimental influence on the organization. Robust data management prevents this from happening.

Project Snapshots

Fig.1: Python code of Exceltool1

Fig.2: Python code of Exceltool1

Fig.3: Python code of Exceltool2

Fig.4: Python code of csv tables

Fig.5: RSLTS_OUT worksheet in output Exceltool1

Fig.6: RSLTS worksheet in output Exceltool2

Fig.7: RSLTS worksheet in output Exceltool2

Fig.8: INST_RTR table as output from Exceltool2

Contact Details

Here are my contact details:

Email: ajay@blackcoffer.com

Skype: asbidyarthy

WhatsApp: +91 9717367468

Telegram: @asbidyarthy 

For project discussions and daily updates, would you like to use Slack, Skype, Telegram, or Whatsapp? Please recommend, what would work best for you.