Client Background
Client: A leading tech firm in the USA
Industry Type: IT
Services: SaaS, Products
Organization Size: 100+
Project Description
Businesses now have more access to data than ever before in today’s digital economy. This information is utilised to make key business choices. Businesses should invest in data management systems that increase visibility, dependability, security, and scalability to ensure that workers have the required data for decision-making. The client wanted to get the data management process automated using a tool from Python. Multiple operations like merging,sorting, filtering had to be performed on data from various resources. The data resources were mainly csv files and data from SQL queries in PostgreSQL.
Our Solution
The project solution contained two tools that would aid in automatic efficient data storage. The first tool will concatenate all of the CSV files before merging them with the data from the SQL file. The acquired Excel file will be used as input for the second tool. The second tool will sort, filter, and lookup the Excel file received in the first tool. This tool will add columns that will be useful for the client’s analysis. The major goal is to assist the client with data management while requiring as little manual labour as possible. The files obtain the needed data in an Excel file by giving the proper input files.
Project Deliverables
The project deliverables can be divided into two parts:
- Excel Tool1: ExcelTool1 generates an Excel file that contains two sheets RSLTS IN and RSLTS OUT. The RSLTS IN is obtained by concatenating all the csv files in the Output folder. The RSLTS OUT is the result of merging the data from vwr egeas.sql query and RSLTS IN.
- Excel Tool2: Excel Tool2 creates another Excel file with one sheet RSLTS and csv files like vwr_instructions_new table, vwr proto and INST_RTR. This tool performs excel operations like lookups, arithmetic calculations and merging of data from multiple sources.
Tools used
- For the whole data management and automation, we have made our own tool by python scripts.
- PostgreSQL was used to merge the csv files provided by the client with the python scripts.
- The automation tool will store data in the excel sheets.
Language/techniques used
- PyCharm for compiling and running the code.
- The scripts for the automation tool were written in the Python programming language.
- OS, glob, pandas, numpy and psycopg2 were thePython libraries used in the project.
Skills used
- Configuration and Data moving using PostgreSQL.
- Automation of tools
- Exception Handling from Python
Databases used
Two types of databases were used: Google excel sheets and PostgreSQL.
What are the technical Challenges Faced during Project Execution
Some minor challenges were faced such as data discrepancies generated during the automation process.
How the Technical Challenges were Solved
The challenges were solved by reworking on the automation tool and consulting with the clients for their requirements.
Business Impact
It is critical to use appropriate data management procedures to ensure the smooth running of a firm. Furthermore, data management must be very precise, cost-effective, and completed as soon as possible. The inability to handle data can result in costly consequences and a permanent stain on the company’s image. Every company is responsible for developing a robust data management plan. The following are some of the reasons why data management is critical to the success of the firm. Instant Availability of Information: Data management makes information easily available for quick access based on company needs. Data management is also essential for accounting procedures like auditing and other strategy-based operations like company planning. The more time you spend hunting for misplaced files and missing documents, the less productive you will be. And you are aware that time is money. Keeping all of your documents structured might therefore assist to make procedures run more smoothly and quickly. Compliance: The government passed legislation requiring businesses to maintain these data. There are also periodical checks to verify that there is no manipulation. Furthermore, if a corporation is involved in a dispute, they must maintain these records for years until a solid verdict on the matter is reached. Faster Transitions to New Technology: Because technology trends change so quickly, organizations must embrace whatever comes their way. Losing information due to obsolete or outdated systems is the last thing you want for your company. Every piece of data preserved in the firm records is essential for everyday operations, managing multiple divisions, completing computations, audits, and so on. Make Right Business Decisions: Businesses use a variety of information sources for company planning, trend research, and performance management. To execute the same activity, different departments’ teams employ different sources of information. Because the legitimacy and precision of information are highly dependent on the source, analyzing several sources may have a detrimental influence on the organization. Robust data management prevents this from happening.
Project Snapshots
Fig.1: Python code of Exceltool1
Fig.2: Python code of Exceltool1
Fig.3: Python code of Exceltool2
Fig.4: Python code of csv tables
Fig.5: RSLTS_OUT worksheet in output Exceltool1
Fig.6: RSLTS worksheet in output Exceltool2
Fig.7: RSLTS worksheet in output Exceltool2
Fig.8: INST_RTR table as output from Exceltool2
Contact Details
Here are my contact details:
Email: ajay@blackcoffer.com
Skype: asbidyarthy
WhatsApp: +91 9717367468
Telegram: @asbidyarthy
For project discussions and daily updates, would you like to use Slack, Skype, Telegram, or Whatsapp? Please recommend, what would work best for you.