Data comes in many forms: primary data, secondary data, experimental data, sales data, transactional data, operational data, strategic data, log data, and much more. Data has the potential to transform a business and drive the creation of business value. It can be used for a range of simple tasks, such as powering operational, strategic, and analytical dashboards or visualizing relationships. However, the real power of data lies in the use of analytical tools (BI tools, R, Python, Excel, etc.) that allow the user to extract useful knowledge and quantify the factors that impact events. Examples include customer sentiment analysis, customer churn analytics, geospatial analysis of key operation centers, workforce planning, recruiting, smart planning, and risk sensing.
Analytical tools are not a discovery of the last decade; in fact, they have been around for centuries. Statistical regression and classification models have existed for the best part of the 20th century. It is, however, the explosive growth of data in our times, combined with advances in computational power, that makes data analytics a key tool across all businesses and industries.
In the financial industry, examples of using data analytics to create business value include fraud detection, customer segmentation, churn prevention, insurance premium modeling, risk modeling, credit score modeling, and employee or client retention.
For data analytics to reveal its potential to add value to the business, a certain number of ingredients need to be in place. This is particularly true in recent times with the explosion of big data ("big" referring to data volume, velocity, and variety). Some of these ingredients are listed below:
Distributed file systems
Analyzing data requires a sophisticated IT infrastructure to store, move, and process it. For large amounts of data, the market standard is a platform like Apache Hadoop, which consists of a storage component, the Hadoop Distributed File System (HDFS), and a processing component, MapReduce. Around this core there is an entire ecosystem of additional software packages such as Pig, Hive, and Spark. Cloud solutions such as Google Cloud, AWS, and Azure make the whole setup much easier than it used to be.
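To make the MapReduce idea concrete, here is a minimal single-machine sketch of its three stages (map, shuffle, reduce) applied to a word count. It is plain Python, not the Hadoop API; the function names and the sample lines are assumptions made up for illustration. In a real cluster the map and reduce steps would run in parallel on the nodes that hold the data blocks.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence on an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical log lines standing in for the blocks of a large file.
sample_lines = ["error disk full", "warning disk slow", "error network down"]
counts = word_count(sample_lines)
```

Because each line is mapped independently and each key is reduced independently, both stages can be spread over many machines, which is the property a platform like Hadoop exploits.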
Database management
An important aspect of data analysis is the management of the database. An entire ecosystem of database systems exists, including relational, object-oriented, NoSQL, and graph databases. Well-known database management systems include Oracle and Sybase (relational, queried with SQL), MongoDB (a document-oriented NoSQL store), and Neo4j (a graph database). Relational systems organize data in tables with fixed schemas and use a primary key to locate entries. Other databases do not require fixed table schemas and are designed to scale horizontally. Apache Cassandra, for example, is designed to handle big data with no single point of failure.
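The difference between a primary-key lookup in a fixed schema and a schema-free record can be sketched in a few lines. The example below uses Python's built-in sqlite3 module as a stand-in for a relational system and a plain dictionary as a stand-in for a document store; the table, column names, and sample customers are hypothetical.

```python
import sqlite3

# Relational side: a fixed schema with a primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "segment TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (customer_id, name, segment) VALUES (?, ?, ?)",
    [(1, "Alice", "retail"), (2, "Bob", "corporate")],
)


def find_customer(connection, customer_id):
    """Locate one row by its primary key; returns None if it does not exist."""
    cursor = connection.execute(
        "SELECT name, segment FROM customers WHERE customer_id = ?",
        (customer_id,),
    )
    return cursor.fetchone()


relational_row = find_customer(conn, 2)

# Schema-free side: each record may carry different fields.
# A dict is only an illustration here, not a real NoSQL client.
documents = {
    1: {"name": "Alice", "segment": "retail"},
    2: {"name": "Bob", "segment": "corporate", "tags": ["priority"]},
}
document_row = documents[2]
```

In the relational table every row must fit the declared columns, while the second document carries an extra `tags` field without any schema change. That flexibility is part of what lets schema-free systems spread records across many machines.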
Advanced analytics
Advanced analytics refers to a variety of statistical methods that are used to compute the likelihood of an event occurring. Machine learning, artificial intelligence, data integration, and statistical analysis can solve problems far beyond what we might expect. Popular software for building an analytics solution includes R, Python, Java, and SPSS. The zoo of analytics methods is extremely rich. However, because data does not come neatly prepared like an off-the-shelf industrial product, human judgment is crucial to understand the performance, possible pitfalls, and alternatives of a solution.
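As one illustration of computing the likelihood of an event, the sketch below fits a one-variable logistic regression with plain gradient descent and uses it to estimate a churn probability. It is only one of many possible methods; the data (months of inactivity versus whether a customer churned), the learning rate, and the number of steps are made-up assumptions, and a real project would normally rely on an established statistics or machine learning library.

```python
import math


def sigmoid(z):
    """Squash a real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit weight and bias by batch gradient descent on the log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        grad_weight = sum(e * x for e, x in zip(errors, xs)) / n
        grad_bias = sum(errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


def predict_probability(weight, bias, x):
    """Estimated probability that the event occurs for input x."""
    return sigmoid(weight * x + bias)


# Hypothetical data: months of inactivity and whether the customer churned.
months_inactive = [0, 1, 2, 3, 4, 5, 6, 7]
churned = [0, 0, 0, 1, 0, 1, 1, 1]

weight, bias = fit_logistic(months_inactive, churned)
low_risk = predict_probability(weight, bias, 1)
high_risk = predict_probability(weight, bias, 6)
```

The fitted model assigns a higher churn probability to customers who have been inactive for longer. Whether such a model is good enough is exactly where human judgment comes in: the analyst still has to check it on unseen data and compare it against alternatives.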