Client Background
Client: A leading tech firm in the USA
Industry Type: Retail
Services: Retail business
Organization Size: 100+
Project Objective
The objective of this project was to convert dirty JSON data stored in a CSV file into a readable CSV file. The file held JSON-formatted data that had been split across columns in Excel, which made it hard to read. The client wanted the data extracted and converted into a readable format for further analysis.
Project Description
Our client provided a CSV file containing JSON-formatted data that had been split across columns in Excel. In that state the data was hard to read and understand, so no analysis could be performed on it. Our task was to extract the data, restore it to readable text, and validate the resulting JSON to confirm it was well formed. Finally, we had to convert the JSON data into a CSV file that could be easily read and analyzed.
Our Solution
To extract the data, we used the Python programming language and the Pandas library. We read every piece of text in the Excel sheet with Pandas and converted it into readable text. We then checked the resulting JSON with a JSON validator website to confirm it was well formed. Finally, we used Pandas again to convert the JSON data into a CSV file that could be easily read and analyzed.
To perform the conversion, we used Jupyter Notebook, a JSON validator, and Microsoft Excel.
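The write-up does not show how the split fragments were put back together. As a minimal sketch, assuming each JSON record had been broken at its commas into adjacent cells of one row, the fragments can be rejoined and then checked with Python's built-in json module. The function name rejoin_cells and the sample row are hypothetical and used only for illustration.

```python
import json

def rejoin_cells(cells, delimiter=","):
    """Rebuild one JSON string from the cells of a single spreadsheet row.

    Assumption: the record was split at its commas, so joining the cells
    with "," restores the original text. Trailing empty cells (padding
    that appears when rows have different widths) are dropped.
    """
    cells = ["" if c is None else str(c) for c in cells]
    while cells and cells[-1] == "":
        cells.pop()
    return delimiter.join(cells)

# Hypothetical row: one JSON object split into three cells,
# followed by two empty padding cells.
row = ['{"order_id": 17', ' "item": "mug"', ' "price": 4.5}', "", ""]

text = rejoin_cells(row)
record = json.loads(text)  # raises json.JSONDecodeError if the text is invalid
print(record)
```

If the cells are read with Pandas, missing values arrive as NaN rather than empty strings, so they would need to be replaced first, for example with fillna("").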
Project Deliverables
The final deliverable was a readable CSV file that contained the converted data from the original JSON format.
Tools used
Jupyter Notebook, JSON Validator, and Microsoft Excel.
Language/techniques used
Python programming language and Pandas library.
Skills used
Python programming and Pandas data manipulation.
Technical Challenges Faced during Project Execution
The main technical challenge was the state of the input: dirty JSON data in a CSV file, split across columns in Excel. The fragments were hard to read and understand, and it took extra effort to extract them and convert them into a readable format.
How the Technical Challenges were Solved
We solved these challenges by using the Python programming language and the Pandas library to extract and manipulate the data. We checked the JSON data with a JSON validator website to confirm it was well formed. Finally, we used Pandas to convert the JSON data into a readable CSV file that could be easily analyzed.
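The validate-then-convert step can be sketched as follows. This is an illustration under stated assumptions, not the project's actual notebook: the project used a validator website, whereas here json.loads stands in as a programmatic check, and the sample records, the field names, and the helper parse_valid are hypothetical. pandas.json_normalize flattens nested objects into dotted column names before the frame is written out as CSV.

```python
import json
import pandas as pd

def parse_valid(lines):
    """Parse each string as JSON; return (records, errors).

    errors holds (index, message) pairs for strings that failed to parse,
    so bad rows can be inspected instead of silently dropped.
    """
    good, bad = [], []
    for i, line in enumerate(lines):
        try:
            good.append(json.loads(line))
        except json.JSONDecodeError as err:
            bad.append((i, str(err)))
    return good, bad

# Hypothetical input: the second record is missing its closing brace.
raw = [
    '{"order_id": 17, "customer": {"name": "Ann", "city": "Austin"}}',
    '{"order_id": 18, "customer": {"name": "Bo", "city": "Boise"}',
]

records, errors = parse_valid(raw)

# Nested keys become columns such as "customer.name".
df = pd.json_normalize(records)
csv_text = df.to_csv(index=False)
print(csv_text)
print("Rejected rows:", errors)
```

Collecting the failures alongside the parsed records keeps the clean-up auditable: the rows that could not be repaired are listed rather than lost.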
Business Impact
The client can now perform further analysis on the extracted data in a readable format. Previously the data was too hard to read and understand to be analyzed.
Project website url
https://colab.research.google.com/drive/1yWDj8_HXu6hOYatrzWQ3ezqBxsUON3JY
Here are my contact details:
Email: ajay@blackcoffer.com
Skype: asbidyarthy
WhatsApp: +91 9717367468
Telegram: @asbidyarthy