Client Background

Client: A leading marketing firm in the USA

Industry Type: Marketing services & Consulting

Services: Marketing consultant

Organization Size: 100+

Project Objective

For this project objective was to perform API Data Abstraction using Google LSA API in GCP, Automation of data fetching and storing in BigQuery on daily basis, Storing Historical data for all active companies, Fetching Customer Report then storing data on daily basis in BigQuery also storing Historical data for all companies, Perform Linear Regression Modelling on Historical data for all companies and storing the modeling Summary in google sheet in a structured manner, Basecamp Automation with LSA Daily Data, Creating 4 BI Dashboard in Data Studio for Live, Historical, Modelling and Customer Report data for all companies. 

Project Description

For this project task was to obtain an account report and detailed lead report for a specific dates and customer_id using Google Local Service Ads API Service in Google Cloud Platform. Further need to integrate with Google BigQuery database storing MCC data for all companies on a daily basis then storing Historical data for all active companies. Also notifying clients through email and passing messages containing daily account data in a message format to BaseCamp Message Board and Campfire of respective company projects through its API all with python programming, further deploying the script on Heroku Server for automating all this task. Then Creating BI Dashboard in Data Studio connecting with BigQuery and Creating Live Dashboard, Historical Dashboard for all companies. 

On historical data for all companies, Linear Regression Modelling needs to perform and to create Modelling Dashboard for all companies in Data Studio. Further needs to do  Exploratory Data Analysis for all companies on Historical Data. 

To Store Customer Account Report for message lead and phone lead on a daily basis, Script needs to be created and deployed in Heroku and also need to store Historical data for these companies and Finally Create Data Studio Dashboard on it.

Creating Sales Representation Dashboard for two Companies which involves multiple Reports and blending of multiple data sources from Big Query.

Our Solution

>> Module 1: API Data Abstraction

Which first includes generation of the access token and refresh token with the scope of Google AdWord API for the authentication and connecting with Google LSA API. Then fetching daily data in JSON format for particular account name based on customer_id assigned in API URL while fetching data. Likewise generating a script that would Handle data generation for all other active accounts based on their customer id.

>> Module 2: Data Imputation and Storing

Converting the JSON data to the pandas data frame forming a list of data frame for all the active accounts by looping them then deriving certain more attributes based on their handling the missing and inf values. Finally storing the data in Google Big Query database within the respective table for all accounts using Bigquery API.  

>> Module 3: Data Storing in BigQuery and Notification Automation

The task was to automate notifications sent to email and to Basecamp and the data transferred to the database on a daily basis by deploying the script to Heroku Server setting time parameters based on the New York time zone.

>> Module 4: Automation tools created till now:

i. LSA_AccountReport_daily_BigQuery tool: For Automation of Account Report for all companies on a daily basis. Scheduling it at 1:00 am in the Los Angeles Timezone.

ii. LSA_AccountReport_Historical_API tool:  For Storing Historical Data for companies for the last few Years till the end date which we set.

iii. Basecamp_lsa_automation: This is used to pass the lsa data in a message format to Campfire for respective companies groups and store lsa data combined for all companies to Messageboard and Campfire at one Automation Python Group in Basecamp.

iv. LSA_DateRange Tool: Used to store missed out data for all the companies for a few sets of days or months as per the need.

v. LSA_MainSheet_AutoUpdation tool: For Auto updation of main sheet  ‘LSA Client Lead’  Google Sheet. As Daily Data are fetched on the basis of this list so it is required to auto update this sheet for all the new companies entered would store information of those like company name, account id and database name.

vi. LSA_daily_CustomerReport tool: Created to Store LSA Customer Report for all companies in database ‘CustomerReport_PhoneLead’ & ‘CustomerReport_MessageLead’ on daily basis.

vii. Historical_LSA_CustomerReport tool:  Created to Store LSA Customer Report for all companies in database ‘CustomerReport_PhoneLead’ & ‘CustomerReport_MessageLead’ storing historical data for year 2021.

>> Module 5: Data Studio BI Dashboards Created:

i. Historical Dashboard

ii. Live Dashboard

ii. Customer Report Dashboard

iii. Modelling Report Dashboard

iv. Sales Representation Dashboard

Project Deliverables

  1. Data Studio Dashboard Main Sheet
  2. All Codes for the Deployed tools and for Modelling EDA and Test Purpose .

Tools used

● PyCharm

● Jupyter Notebook

● Anaconda

● Heroku

● Notepad++

● Google Sheet API

● Google LSA API on GCP

● Google BigQuery

● Sublime Text

● Brackets

● JsonViewer

Language/techniques used

● Python

● SQL

Models used

My project ‘Google Adword LSA API Reports automation into Google Big Query database and Basecamp’ developed with a software model which makes the project high quality, reliable and cost-effective.

● Software Model: RAD(Rapid Application Development model) Model

● This project follows a RAD Model as our model is not forming the loop from end to the start, also my project was based on prototyping without any specific planning. In the RAD model, there is less attention paid to the planning and more priority is given to the development tasks. It targets developing software in a short span of time.

● Advantages of RAD Model:

o Changing requirements can be accommodated.

o Progress can be measured.

o Iteration time can be short with the use of powerful RAD tools.

o Productivity with fewer people in a short time.

o Reduced development time.

o Increases reusability of components.

o Quick initial reviews occur.

o Encourages customer feedback.

o Integration from the very beginning solves a lot of integration issues

Skills used

● API Data Abstraction

● Data Mining and Statistical Modelling

● Data Wrangling

● Deployment for Automation

● Data Visualization

● SQL

● Machine Learning

● Python Programming including OOPs and Exception Handling

Databases used

● Google Firestore (Just for Testing Purpose)

● Google BigQuery

Web Cloud Servers used

Google BigQuery Cloud Database with up to 1 TB of free storage is being used. 

What are the technical Challenges Faced during Project Execution

● Scheduling Automation of Python Script.

● Data Exceptions and Duplication in BigQuery Tables.

● Refresh token Expiration After 7 Days.

● Data Exception due to Inactive companies or not Updation of LSA Main sheet. 

● Basecamp ProjectId Issue for transferring Data to multiple companies projects.

● Data Studio Time Series Plot data mismatch due to multiple account id.

How the Technical Challenges were Solved

● Scheduling Automation of Python Script.

Python Library BlockingScheduler were used and the Timezone variable ‘TZ’ was set to Los Angeles in Heroku

● Data Exceptions and Duplication in BigQuery Tables.

       Structuring SQL Query to deal with all the database issues which were being used in BigQuery to solve those issues.

● Refresh token Expiration after 7 Days.

Initially ‘Auth Playground’ was used for generating Refresh token which was getting expired after every 7 Days so to last it longer for more than a year we are now using the refresh token which was generated using Python script where proper token endpoints and many other headers were defined before generating the refresh token.

● Data Exception due to Inactive companies or not Updation of LSA Main sheet. 

Data Exception occurred while API data abstraction for few of the companies which were solved by adding more nested try and except statements after understanding issues also ‘LSA Clients Lead’ main sheet was not being updated by other members due to which we missed out data for few of the companies which were solved by creating script which will automatically update the mainsheet when an error occurred.

● Basecamp ProjectId Issue for transferring Data to multiple companies projects.

This issue was solved by creating Basecamp Main sheet where data was fetched now by mapping the account id of fetched data using LSA Main sheet and project id of all the basecamp companies.

● Data Studio Time Series Plot data mismatch due to multiple account id.

Solved by adding many parameters like setting the metrics which will do a summation of all the companies on a particular day for all the account id.