The banking industry is going through a transformation driven by the widespread use of advanced analytics algorithms in the day-to-day business of core banking. Customer acquisition through various channels, engagement of existing customers, and prediction of defaulters on credit card or loan applications are a few of the areas where analytics is doing a tremendous job. I will explain some scenarios from my past experience of working in the advanced analytics team of a leading multinational bank, where we used some interesting concepts from analytics and machine learning to solve complex business problems.
Customer is God (or is it so?)
It is impossible to track God, but you can track your potential customers with the help of advanced analytics algorithms. Banks hold a lot of information about their customers' buying patterns, demographics, transactions, service requests, and so on. This information is used to predict the propensity of a customer to buy a specific product. While working in the analytics department, I collaborated with many sales teams that needed a rank-ordered list of potential future customers for direct mailer or cold calling campaigns. The campaigns usually contain an attractive offer specific to the product, such as a lower interest rate on a credit card or a higher interest rate on a savings account. These offers are backed by the confidence that a significant share of the customers on the ranked list will buy the product, compared with customers contacted at random. Banks predict not only which customers are likely to buy a product, but also which customers are inclined to close their accounts (customer attrition).
Data science algorithms such as Logistic Regression work well for estimating the probability that a customer will buy a product or close an account. For example, consider the following real-life business problem:
A leading MNC bank wants to formulate a strategy to curb customer attrition, which is constantly on the rise, for its Savings Account product. The bank reaches out to the Advanced Analytics department to help retain its best customers by predicting which of them are likely to close their accounts, so that it can contact them with attractive offers to continue their invaluable relationship.
The Advanced Analytics team starts by narrowing down the problem and defining attrition, for example: customers withdrawing their parked money, closing their Savings Account, and not reinvesting the money in another product offered by the same bank. The business wants to predict attrition 3 months in advance so that it has sufficient time to design a retention strategy.
The team then moves on to collecting the required data, which usually takes up most of the project's time, because questions such as which data to collect, over what time period, and what additional data is needed are critical. Once the stage is set, the Data Scientists perform their final act: Machine Learning. The algorithm reads the data and finds patterns that lead to attrition behavior, based on previous account closures. It then applies this learning to newly available information on current customers and predicts each customer's probability of attrition.
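The learn-then-score workflow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the bank's actual model: the two features (share of balance withdrawn, number of service complaints), the customer IDs and all the numbers are hypothetical, and the logistic regression is fitted with simple gradient descent.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical history: [share of balance withdrawn last quarter,
#                        service complaints last quarter]
X_hist = [[0.9, 3], [0.8, 2], [0.7, 4], [0.85, 1],
          [0.1, 0], [0.2, 1], [0.05, 0], [0.15, 0]]
y_hist = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = account closed within 3 months

w, b = fit_logistic(X_hist, y_hist)

# Score current customers (hypothetical) and rank by attrition probability
current = {"C101": [0.75, 3], "C102": [0.1, 0], "C103": [0.5, 1]}
ranked = sorted(((predict_proba(w, b, x), cid) for cid, x in current.items()),
                reverse=True)
for prob, cid in ranked:
    print(cid, round(prob, 2))
```

The ranked list is what the retention team would work from: customers at the top are contacted first.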
I came across an astounding project that predicted customer propensity to buy retail banking products 10 years from today! Guess what the driving feature in these algorithms was? That’s right: current educational background and achievements, tracked on business social media websites, help to indicate where a person would want to park their money in the future.
One very interesting use of Natural Language Processing techniques, such as the Naïve Bayes classifier, in retail banking is the analysis of transactions. Each transaction is associated with a code and a short description. Data scientists use language parsing to extract keywords or tokens from the huge volume of description text and discover more about each transaction. It is possible to identify whether you have paid for a product at a retail store, transferred money to another bank account that you hold, or paid a credit card bill from another bank. The transactions can then be linked to how you value your current relationship with the bank, and the business can decide on various customer outreach strategies based on this critical information.
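As a rough illustration of the idea, here is a tiny multinomial Naïve Bayes classifier with add-one smoothing, written in plain Python. The transaction descriptions, the three category labels and the whitespace tokenizer are all assumptions made up for this sketch; real descriptions are messier and need more careful parsing.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Simplest possible parsing: lowercase and split on whitespace
    return text.lower().split()

def train_nb(samples):
    """samples: list of (description, label) pairs."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            token_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, token_counts, vocab

def classify(model, text):
    class_counts, token_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior of the class
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labelled transaction descriptions
training = [
    ("POS PURCHASE GROCERY STORE", "retail_purchase"),
    ("POS PURCHASE ELECTRONICS STORE", "retail_purchase"),
    ("ONLINE TRANSFER TO OWN ACCOUNT", "own_account_transfer"),
    ("ONLINE TRANSFER TO SELF SAVINGS ACCOUNT", "own_account_transfer"),
    ("CREDIT CARD BILL PAYMENT", "external_card_payment"),
    ("CARD BILL PAYMENT OTHER BANK", "external_card_payment"),
]
model = train_nb(training)
print(classify(model, "POS PURCHASE CLOTHING STORE"))
```

A customer whose transactions are mostly tagged as transfers to their own account elsewhere, or as payments to another bank's credit card, is a natural candidate for an outreach campaign.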
They buy what they see
According to the laws of economics, choices spoil us. The more options customers have in front of them, the more confused they become and the less likely they are to buy the product. A very niche field that makes use of hardcore machine learning algorithms is Targeted Digital Marketing. Retail banks constantly use it to identify and catch potential customers visiting the website by displaying customized web content that caters to the visitors' needs and offers the products they are looking for. How does the business know what the customer wants? Digital Footprints!
Millions of website visitors generate thousands of GBs of data containing information such as which channel they came from, which page of the bank’s website they land on most often, which pages they scan through, how much time they spend reading the content, and how many unique visits they make per day or week. This data is called the Digital Footprint of a website visitor, also known as Clickstream Data. Banking and financial institutions have been churning through this data to work out what exactly the visitor is looking for, and they display customized content on the home page featuring products the visitor is likely to be interested in. This prompts the visitor to click on the displayed link and learn more about the product; as the saying goes, they buy what they see. You should not display too much content at once, as it will confuse visitors and they might leave without making any deal or without leaving their details to be contacted later.
The information from digital footprints can be refined to the point of estimating the probability that a visitor will convert! Customized content is displayed based on which current visitors show the characteristics of past visitors who converted. Since the conversion rate is very low (only a few out of millions of visitors to the bank’s website actually buy a product or service online), we need powerful machine learning algorithms that can catch the pattern from the limited number of conversions available to learn from. At the same time, the algorithm has to read through a huge amount of data, as digital footprints are generated every second.
Machine Learning algorithms such as Random Forest (which uses the Bagging methodology) or Gradient Boosting (which uses the Boosting methodology) come to the rescue. These algorithms handle large amounts of data efficiently and can identify patterns with good accuracy.
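To give a feel for the bagging idea behind Random Forest, here is a deliberately small sketch in plain Python. It is not a real Random Forest: each "tree" is a one-split decision stump, trained on a bootstrap sample of past visitors using one randomly chosen feature, and the conversion score is the average over all stumps. The two clickstream features (pages viewed in the loan section, minutes on site) and every number below are hypothetical.

```python
import random

def fit_stump(X, y, feature):
    """One-split 'tree' on a single feature, chosen to minimise misclassifications."""
    base_rate = sum(y) / len(y)
    best = None  # (errors, threshold, left_rate, right_rate)
    values = sorted(set(row[feature] for row in X))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [yi for row, yi in zip(X, y) if row[feature] <= thr]
        right = [yi for row, yi in zip(X, y) if row[feature] > thr]
        errors = (min(sum(left), len(left) - sum(left))
                  + min(sum(right), len(right) - sum(right)))
        if best is None or errors < best[0]:
            best = (errors, thr, sum(left) / len(left), sum(right) / len(right))
    if best is None:  # only one distinct value sampled: fall back to the base rate
        return lambda row: base_rate
    _, thr, left_rate, right_rate = best
    return lambda row: left_rate if row[feature] <= thr else right_rate

def bagged_stumps(X, y, n_estimators=25, seed=0):
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        feature = rng.randrange(len(X[0]))          # random feature per stump
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return models

def conversion_score(models, row):
    # Average the stump outputs: the "aggregating" half of bagging
    return sum(m(row) for m in models) / len(models)

# Hypothetical past visitors: [pages viewed in loan section, minutes on site]
non_converters = [[i % 3, 1 + i % 4] for i in range(16)]
converters = [[4, 6], [5, 9], [6, 7], [7, 12]]
X = non_converters + converters
y = [0] * len(non_converters) + [1] * len(converters)

models = bagged_stumps(X, y)
high_score = conversion_score(models, [9, 15])  # heavy browsing, long visit
mid_score = conversion_score(models, [6, 2])    # many pages, short visit
low_score = conversion_score(models, [0, 1])    # barely looked around
print(high_score, mid_score, low_score)
```

A visitor with a high score would be shown the loan offer on the home page; a visitor with a low score would see something else.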
Data Scientists also use a technique called Stacking, in which the outputs of different algorithms are combined and fed into another model that calculates the final probability.
So, the next time you visit a bank’s website and a pop-up displays an attractive interest rate on a home loan, be assured that a heavily loaded machine learning algorithm running in the background has monitored your activity on the website and knows what you are looking for.
Clustering – the one stop shop for analytics solutions
The majority of decision making in banks and financial institutions is done by dividing things into groups that have similar characteristics and behave in a particular way, so that decisions can be applied in batches, which saves time, energy and, of course, money. Clustering algorithms to the rescue! I cannot count the number of times we proposed clusters as a solution to a business problem, but I will share two instances that I distinctly remember.
The first one was for a bank that wanted to distinguish its most loyal customers from the, well, not so loyal ones. Our approach was to define loyalty based on three metrics: recency of transactions, frequency of transactions and monetary value of transactions (better known as the RFM model in the analytics world). These three metrics combined, with a weight assigned to each based on what is more important to the business, enabled us to rank customers by how actively they interact with the bank compared to others (who are effectively dormant and fall into the not so loyal category). An important consideration is to study the transactions in depth, otherwise errors might creep in. For instance, and this is very interesting, when we developed the categories based on RFM score, we saw that every single savings account holder scored very high on recency. This was very intriguing, and we spent days trying to identify what was happening, only to realize later that every savings account earns interest, which is credited to the account at the end of each month. Every account therefore showed a recent transaction, whether or not the customer had done anything. The trade-offs of being too technical, I suppose.
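The interest-credit pitfall is easy to reproduce. The sketch below computes weighted RFM scores in plain Python, once naively and once after excluding system-generated transactions. The transaction codes (such as INT_CREDIT), the weights, the scoring date and the sample data are all hypothetical, and min-max scaling is just one simple way to put the three metrics on a common scale.

```python
from datetime import date

AS_OF = date(2024, 3, 31)                                      # hypothetical scoring date
WEIGHTS = {"recency": 0.5, "frequency": 0.3, "monetary": 0.2}  # hypothetical weights
SYSTEM_CODES = {"INT_CREDIT"}                                  # bank-initiated, not customer activity

# (customer_id, date, amount, transaction_code), all made up
transactions = [
    ("A", date(2024, 3, 28), 120.0, "POS"),
    ("A", date(2024, 3, 15), 300.0, "TRANSFER"),
    ("A", date(2024, 3, 31), 4.0, "INT_CREDIT"),
    ("B", date(2023, 9, 10), 50.0, "POS"),
    ("B", date(2024, 3, 31), 2.0, "INT_CREDIT"),
    ("C", date(2024, 2, 20), 80.0, "POS"),
    ("C", date(2024, 1, 5), 60.0, "POS"),
    ("C", date(2024, 3, 31), 1.0, "INT_CREDIT"),
]

def rfm_metrics(txns, as_of, exclude_codes=frozenset()):
    metrics = {}
    for cust, day, amount, code in txns:
        if code in exclude_codes:
            continue
        m = metrics.setdefault(cust, {"last": day, "frequency": 0, "monetary": 0.0})
        m["last"] = max(m["last"], day)
        m["frequency"] += 1
        m["monetary"] += amount
    for m in metrics.values():
        m["recency_days"] = (as_of - m.pop("last")).days
    return metrics

def scale(values, higher_is_better=True):
    # Min-max scale to [0, 1]; flip when a smaller raw value is better
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]

def rfm_scores(metrics, weights):
    custs = sorted(metrics)
    r = scale([metrics[c]["recency_days"] for c in custs], higher_is_better=False)
    f = scale([metrics[c]["frequency"] for c in custs])
    m = scale([metrics[c]["monetary"] for c in custs])
    return {c: weights["recency"] * ri + weights["frequency"] * fi + weights["monetary"] * mi
            for c, ri, fi, mi in zip(custs, r, f, m)}

naive = rfm_metrics(transactions, AS_OF)  # interest credits included
cleaned = rfm_metrics(transactions, AS_OF, exclude_codes=SYSTEM_CODES)
scores = rfm_scores(cleaned, WEIGHTS)

print({c: m["recency_days"] for c, m in naive.items()})    # everyone looks "recent"
print({c: m["recency_days"] for c, m in cleaned.items()})  # dormant customers show up
```

With the interest credits included, every customer has a recency of zero days; once they are excluded, customer B stands out as dormant and drops to the bottom of the ranking.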
The second instance was when the business wanted to identify bank branches with lower than average performance on metrics like new customers acquired, average balances maintained in accounts, quality of home loan applications sent to the head branch for approval, promoter score of customers, employee attrition, etc. Using k-means clustering, we separated the high performing branches from the lower performing ones on these metrics. This helped the bank take decisions such as reviewing the financial targets for lower performing branches, focusing on employee training, and incentivizing the high performing branches to inspire others.
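A minimal sketch of this kind of branch segmentation, in plain Python, might look as follows. The branch names, the three metrics and their values are hypothetical; the metrics are standardized first so that no single one dominates the distance, and the centroids are initialised with a simple farthest-first rule (an assumption made here to keep the example deterministic).

```python
import math

def standardize(rows):
    # Put every metric on a comparable scale (zero mean, unit variance)
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def farthest_first(points, k):
    # Deterministic initialisation: start from the first point, then keep
    # adding the point farthest from all centroids chosen so far
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points,
                             key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=50):
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            new_centroids.append([sum(col) / len(members) for col in zip(*members)]
                                 if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical branch metrics, all "higher is better":
# [new customers acquired, average balance (thousands), promoter score]
branches = ["B1", "B2", "B3", "B4", "B5", "B6"]
metrics = [[120, 85, 60], [110, 90, 55], [130, 80, 65],
           [40, 30, 20], [35, 25, 15], [50, 35, 25]]

labels, centroids = kmeans(standardize(metrics), k=2)
high = max(range(2), key=lambda c: sum(centroids[c]))  # cluster with the better centroid
for name, label in zip(branches, labels):
    print(name, "high performing" if label == high else "needs attention")
```

A metric where lower is better, such as employee attrition, would need its sign flipped before being added to a sum-based comparison like the one used here to name the clusters.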
The core idea behind all advanced analytics and data science algorithms is to help the business make better decisions faster. They say data is the new oil, and analytics is the process of converting oil to fuel.
Blackcoffer Insights 9.0, Chirag Soni, IIM, Bangalore