Organizations are inundated with big data describing their customers and prospects. The structured portion alone can run from hundreds to thousands of variables drawn from a myriad of databases covering transactions, billing history, financial activity, spending capacity, demographics, credit behavior, credit scores, customer sentiment, product usage, deceased records, and Office of Foreign Assets Control (OFAC) data. On top of these attributes, organizations take in unstructured data from call center logs, customer surveys, and even social media.
With organizations receiving gigabytes, terabytes, or petabytes of the above-mentioned data on a daily, weekly, or monthly basis, it has become an arduous task to sift through this vast amount of data to glean intelligence for making optimal business decisions. These business decisions span strategies for target marketing, fraud detection, account acquisitions, cross-sell or upsell, risk management, customer retention, payment behavior analysis, collection prioritization, debt recovery, and many other business functions.
Data mining is one solution that can help organizations optimize business strategies for making more profitable decisions. Let’s look at some of the critical preliminary steps, as well as the actual process of data mining.
Before any data mining or predictive analytics can begin, the disparate datasets have to be prepared. For example, it is common for an organization to receive hundreds of thousands of files daily, weekly, or monthly that need to be combined into a few. Before or after joining these files, there are several steps to consider: checking for duplicate records or payments, imputing missing values, transforming variables, binning variables, sampling large databases into smaller, more manageable ones, performing variable reduction, and splitting files into training and validation datasets. These and other data preparation functions are necessary to obtain the most value from your wealth of data.
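To make a few of these preparation steps concrete, here is a minimal sketch using only the Python standard library. The records, the field names (`customer_id`, `balance`), and the helper functions are all hypothetical and chosen for illustration. A production pipeline would more likely rely on a dedicated data preparation tool or library.

```python
import random
import statistics

# Hypothetical records; field names are illustrative, not from a real schema.
records = [
    {"customer_id": 1, "balance": 120.0},
    {"customer_id": 2, "balance": None},
    {"customer_id": 2, "balance": None},  # duplicate record
    {"customer_id": 3, "balance": 80.0},
    {"customer_id": 4, "balance": 300.0},
    {"customer_id": 5, "balance": 100.0},
]

def deduplicate(rows, key):
    """Keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique

def impute_median(rows, field):
    """Replace missing values in `field` with the median of observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    median = statistics.median(observed)
    return [{**r, field: median if r[field] is None else r[field]} for r in rows]

def split_train_validation(rows, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then split into training and validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

clean = impute_median(deduplicate(records, "customer_id"), "balance")
train, validation = split_train_validation(clean)
```

Median imputation is only one of several reasonable choices. Which choice fits depends on why the values are missing and how the variable will be used downstream.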
A critical step in the data mining process is learning more about your customers, good or bad. The intent is to accurately identify your customers in detail, by segment. The description of your customers should be crystal clear to your business strategists so that they can execute target marketing accurately. From selling the right bundle of products or services to minimizing delinquencies, properly segmenting your customers is critical.
For example, the following are 20+ key questions that should be considered for answering, by customer segment, while designing your data mining strategies:
Traditional statistical techniques are very useful for describing your data. Descriptive measures such as averages, medians, modes, ranges, variances, standard deviations, and percentiles summarize each variable. In addition, inferential and predictive techniques such as classification or segmentation, regression analysis, forecasting systems, or recommendation systems are well suited to testing hypotheses about your data or making predictions.
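As a small illustration of the descriptive measures listed above, the following sketch uses Python's built-in `statistics` module. The `monthly_spend` values are made up for the example, and `statistics.quantiles` requires Python 3.8 or later.

```python
import statistics

# Hypothetical monthly spend for seven customers in one segment.
monthly_spend = [120, 80, 300, 100, 100, 95, 410]

summary = {
    "mean": statistics.mean(monthly_spend),
    "median": statistics.median(monthly_spend),
    "mode": statistics.mode(monthly_spend),
    "range": max(monthly_spend) - min(monthly_spend),
    "variance": statistics.variance(monthly_spend),  # sample variance
    "stdev": statistics.stdev(monthly_spend),        # sample standard deviation
    "quartiles": statistics.quantiles(monthly_spend, n=4),  # three cut points
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mean is well above the median because of two large values, which is the kind of skew worth noticing before you segment customers on spend.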
Furthermore, some of the more advanced machine learning techniques would be essential here, too. Examples include decision tree learning, association rule learning, support vector machines, genetic algorithms, Bayesian networks, deep learning, clustering, or reinforcement learning.
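To give a feel for one of these techniques, the sketch below computes support and confidence, the two basic measures behind association rule learning. The product names and baskets are hypothetical, and this is not a full rule-mining algorithm such as Apriori. It only scores a single candidate rule.

```python
# Hypothetical product holdings, one set per customer.
baskets = [
    {"checking", "savings"},
    {"checking", "credit_card"},
    {"checking", "savings", "credit_card"},
    {"savings"},
    {"checking", "savings"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base

# Score the candidate rule "checking -> savings".
rule_support = support({"checking", "savings"}, baskets)
rule_confidence = confidence({"checking"}, {"savings"}, baskets)
print(rule_support, rule_confidence)
```

In this toy data, three of the four customers who hold a checking account also hold savings, so the rule's confidence comes out at roughly 0.75. A measure like this could feed a cross-sell strategy.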
For unstructured data, tools related to text mining, sentiment analysis, or content categorizations are vital. These tools provide your organization with the “softer” side of how customers or prospects think about your products or services that may not be reflected in the structured data. For example, customers could be posting positive or negative feedback on social media about their experiences with your products or services. This valuable information could be used to fix problems that will reduce attrition or lead to valuable product features that could significantly increase sales.
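As a deliberately simplified illustration of sentiment analysis, the sketch below scores text against two small word lists. The lists, function names, and sample comments are invented for the example. Real sentiment tools handle negation, sarcasm, and context, all of which this ignores.

```python
import re

# Tiny hypothetical lexicons; a real tool would use far richer resources.
POSITIVE = {"great", "love", "helpful", "fast"}
NEGATIVE = {"slow", "broken", "rude", "cancel"}

def sentiment_score(text):
    """Count positive words minus negative words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def sentiment_label(text):
    """Map the score to a coarse positive, negative, or neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

comments = [
    "Love the new app, support was fast and helpful",
    "The website is slow and the agent was rude",
    "I updated my address",
]
labels = [sentiment_label(c) for c in comments]
print(labels)
```

Even a rough labeling like this can route negative comments to a review queue. It should be treated as a starting point, not as a measurement of customer opinion.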
In addition, visualization tools that present charts, graphs, tables, or plots in dashboards are also critical in data mining. These tools help you see trends, patterns, outliers, or correlations in a form that is easier for analysts, management, executives, auditors, or other stakeholders to digest.
Well-integrated or self-service tools are ideal for the tasks described above: ensuring data accuracy, preparing data, and performing data mining. Moreover, tools that can easily handle big data are ideal if an organization has this volume of data to sift through.
An ideal solution provides an end-to-end process that acquires data from any data source and allows the user to prepare it for mining. As noted above, your solution has to be able to apply advanced analytics to gain as much insight as possible. This type of solution would then enable the user to act on the data as needed. Finally, the ideal solution would also let users automate certain data processing steps so that the same work is repeated on a regular schedule.