Analytics are hot. Many predictions for 2010 indicated that analytics would become more prevalent. I was not the only one to suggest it (check my very own predictions); IDC did too (check my report of the IDC predictions), as did many others.
Over the past year, in the many webinars and tradeshow sessions I presented on Predictive Analytics, I saw attendance increase steadily, along with interest in understanding what it means and the value it brings to organizations.
That being said, everybody means something different when referring to Analytics, making it confusing for most people to understand or to communicate. The key reason is that the family of Analytics is quite extensive. Industry Analysts include in that category a wide variety of technologies ranging from Business Intelligence (BI) to Data Mining or Predictive Analytics.
I attended Gartner’s BPM Summit in Las Vegas a couple of weeks ago, a great show by the way. I was happy to see a large crowd at the Analytics sessions led by Bill Gassman. Coming from the Decision Management angle, I was a little surprised that he saw Business Intelligence as the category that includes it all: BI of course, but also Predictive Analytics and all of Decision Management, as he tweeted back to me. He may be right… It is not the way I saw it before, for sure. In the BPM sessions they also claim that BPM includes it all. I can live with any definition, but this was a major distraction for me during his talks. I assume it was the same for many others.
I’ll take on the challenge of defining some core concepts. This is not meant to be a comprehensive definition of each discipline though; that would be a much, much longer article. Here, I will focus on pointers for applying those techniques to real-life problems. I’ll go into more detail in subsequent posts. The caveat is that there is no unique definition, and I must say again that none of the many definitions is right or wrong. The objective is merely to enable collaboration through transparency. If we can communicate with a single lexicon, or at least agree on terms, then we can make progress as an industry.
Why do I care about taxonomy? I assure you that I am very pragmatic 99% of the time, but the truth is that confusion, although it may help a company or two fake a new trend, in reality only slows down adoption. When you are not sure whether you need a car or a bike because you do not understand the difference, you will most likely wait as long as the need is not burning… IDC made a similar observation in an article last week: 1/3 of U.S. companies do not understand the ROI for Analytics. The interesting part of the article is that the confusion does not come from whether or not the results are there; those companies do not even know how to measure results! I venture that getting a better baseline understanding of the capabilities might help set expectations and foster a performance-driven culture.
First, Business Intelligence
BI is the discipline of dashboarding past or current information. BI is traditionally used to support a manual decision-making process or to spread, monitor and react to changes in trends. BI has been nicknamed in the past the “rear view mirror”. Although it does not sound too glamorous compared to predictive analytics, it is quite essential for your safety to have such a device in your car. Have you ever tried to drive without one? Granted you can, but the consequences may not be pleasant. Lots of unexpected “stuff” coming from behind will surprise you at best, or cause a major accident, sometimes fatal, at worst.
You might think… What is past is something I have seen already, so how good is past information? Well, the brain is such that you may notice one or two oddities on your way as you do business, but the interesting thing is to look at trends in aggregate. If you have a team of people underwriting business or processing claims, they may not see the shifting market trends. A good dashboard gives the right decision-makers invaluable information. Too often decisions are made in the dark; with data in hand, you can safely react to those trends.
- When revenue is going down, you want to know how various regions are doing so that you can engage with more marketing in the declining regions or double down on the booming regions. It is your ultimate decision of course, but not a guess: an EDUCATED decision
- When market share is going down, you may want to know how you compare to your competition and how each product line is doing comparatively, in order to balance your investment
- When the volume of claims is spiking, you may want to understand whether there is fraud involved, which you would like to stop right away, or new patterns of behavior that need to be taken into consideration in your product definition: more accidents per driver, or maybe only for some types of drivers
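To make the first bullet concrete, here is a minimal sketch of the kind of aggregation a dashboard performs behind the scenes: summing revenue per region and comparing two periods. The transactions, region names and function names are all made up for illustration; a real BI tool would read from a data warehouse rather than a hard-coded list.

```python
from collections import defaultdict

# Made-up transactions for illustration: (quarter, region, revenue).
SALES = [
    ("Q1", "East", 120.0), ("Q1", "East", 80.0), ("Q1", "West", 200.0),
    ("Q2", "East", 150.0), ("Q2", "West", 150.0), ("Q2", "West", 90.0),
]


def revenue_by_region(sales):
    """Sum revenue per (region, quarter) pair."""
    totals = defaultdict(float)
    for quarter, region, amount in sales:
        totals[(region, quarter)] += amount
    return dict(totals)


def period_change(totals, previous, current):
    """Relative revenue change per region between two quarters."""
    regions = sorted({region for region, _ in totals})
    changes = {}
    for region in regions:
        before = totals.get((region, previous), 0.0)
        after = totals.get((region, current), 0.0)
        if before > 0:  # skip regions with no baseline to compare against
            changes[region] = (after - before) / before
    return changes


totals = revenue_by_region(SALES)
changes = period_change(totals, "Q1", "Q2")
for region, change in changes.items():
    print(f"{region}: {change:+.0%}")
```

The numbers do not make the decision for you; they only show which regions are declining and which are growing, so that the marketing decision is an educated one.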
The idea is to monitor some characteristics over time that will confirm what you know about the business or that will uncover some new trends to stop or to exploit. A very valuable tool for sure. I remember building dashboards 15 years ago for very different purposes, such as the Department of Economy in France or Highway Management, tracking traffic to adjust pricing, safety, etc. The art of dashboarding is really in defining what to monitor. This is the unfortunate part… You need to know what you are looking for and monitor it.
This is where I expect a lot of people to react passionately, arguing that BI is more than dashboarding. But if you read Wikipedia or other sources, you’ll soon discover that every single Analytics flavor is described as including all of the others.
Business Activity Monitoring
In my view BAM is nothing more than BI applied to Business Process Management (BPM). Although BI often comes with automatic delivery of the reports on your desk on a regular basis, BAM really stresses the dynamic aspect to the extreme. BAM is really about monitoring those Key Performance Indicators (KPIs) and alerting users of any exceptions.
- When rate of Accept versus Reject gets out of expected bounds by over X% then let the Head of Underwriting know — could be fraud, could be an error in decisioning logic, could be a shift in market, could be a strategic change…
- When too many cases are stuck in a given step in the process, let the manager know so that he can dissolve the bottleneck
BAM is the ally of any BPM practitioner, the only way to instill a performance-driven culture, to institutionalize performance measurements.
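The two alert examples above can be sketched as simple threshold checks. This is only an illustration under assumed inputs: the function names, the 70% expected accept rate, the 10-point tolerance and the step names are all hypothetical, and a real BAM product would evaluate such conditions continuously against live process data.

```python
def accept_rate_alert(decisions, expected_rate, tolerance):
    """Return an alert message if the accept rate drifts out of bounds, else None."""
    if not decisions:
        return None
    accepted = sum(1 for d in decisions if d == "accept")
    rate = accepted / len(decisions)
    if abs(rate - expected_rate) > tolerance:
        return (
            f"Accept rate {rate:.0%} is outside "
            f"{expected_rate:.0%} +/- {tolerance:.0%}: notify Head of Underwriting"
        )
    return None


def stuck_steps(cases_per_step, max_cases):
    """Return the process steps holding more open cases than max_cases."""
    return [step for step, count in cases_per_step.items() if count > max_cases]


# Hypothetical recent activity: 40 accepts and 60 rejects.
recent = ["accept"] * 40 + ["reject"] * 60
alert = accept_rate_alert(recent, expected_rate=0.70, tolerance=0.10)

# Hypothetical count of open cases per process step.
bottlenecks = stuck_steps(
    {"intake": 12, "manual review": 85, "payout": 7}, max_cases=50
)

print(alert)
print(bottlenecks)
```

Note that the alert only says something is off. Whether it is fraud, an error in decisioning logic or a market shift is still for a person to investigate.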
Web Analytics
There is yet another flavor of BI, which is the application of this technique to web clicks. Instead of focusing on business transactions, you can track the activity of visitors to the web site to determine what interests them and what leads to more time spent on the site. There is a fine line though between time spent searching without finding the information you are looking for and time spent exploring topics beyond what you were looking for in the first place. Web Analytics can help you understand the pattern of usage and enhance your web site to drive more traffic and, as a result, more business.
An interesting usage of analytics there is what is called A-B testing in the industry. You may hear other terms like Champion-Challenger, multivariate testing or experimental design. This technique is not only applicable to web sites but is widely used in that context. Google Analytics tools offer you some optimization capabilities that do just that: track the performance of alternative pages or portions of pages.
In summary, the idea is to randomly offer 2 or more alternatives to your visitors and track which population is doing better. Given that your sampling is random, with enough visitors you should have a statistically sound comparison of the performance of each option. Then you can keep the best one and run another test on another portion of your site that you want to improve.
- For 80% of visitors, display call for action: register for webinar
- For 10% of visitors, display call for action: download white paper
- For 10% of visitors, display call for action: sign up for community
Here is another practical example:
- 1/3 of ads will have a thick red outline
- 1/3 of ads will be blinking
- 1/3 of ads will display a classy picture
This is a very powerful mechanism to determine via experiment which style or action will yield the best performance. Retail has been using this technique widely for years. You can obviously use it on cereal boxes exactly the same way you do it on the web! Financial Services have used champion-challenger a lot, but more often in the decisioning itself rather than the website presentation. That being said, they keep using this method in the paper mailers they send you as well.
Predictive Analytics
This type of analytics differs quite a bit from the other kinds we have reviewed so far. Predictive analytics does not aim at publishing past performance in summary or aggregate; it leverages the available data to predict a specific target behavior. For example:
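Here is a minimal sketch of the 80/10/10 split from the first example, followed by one common way to check whether a difference in conversion rates is statistically sound: a pooled two-proportion z-test. The test itself is my choice for illustration, not something a particular tool prescribes, and the conversion counts are hypothetical.

```python
import math
import random

# Traffic split from the call-for-action example: 80% / 10% / 10%.
VARIANTS = ["register for webinar", "download white paper", "sign up for community"]
WEIGHTS = [0.8, 0.1, 0.1]


def assign_variant(rng):
    """Randomly pick the call for action shown to one visitor."""
    return rng.choices(VARIANTS, weights=WEIGHTS, k=1)[0]


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates (pooled)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / std_err


# Simulate the assignment of 10,000 visitors (seeded for repeatability).
rng = random.Random(42)
shown = {variant: 0 for variant in VARIANTS}
for _ in range(10_000):
    shown[assign_variant(rng)] += 1

# Hypothetical results: 400 conversions out of 8,000 visitors for the
# champion, 70 out of 1,000 for one challenger.
z = two_proportion_z(conv_a=400, n_a=8000, conv_b=70, n_b=1000)
significant = abs(z) > 1.96  # roughly a 5% two-sided significance level

print(shown)
print(f"z = {z:.2f}, significant: {significant}")
```

With only 10% of traffic on each challenger, it takes longer to collect enough visitors for a sound comparison; that is the price of protecting the champion.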
- What is the likelihood of fraud for a given individual?
- What is his/her propensity to accept an offer?
- What is his/her likelihood to have accidents?
- How likely would he/she turn delinquent within 6 or 12 months?
- What is the probability that this customer will become GOLD?
The list goes on and on of what you can predict. It sounds wonderful to predict the future. Haven’t all kids dreamed of predicting anything? There is no magic here, mostly math. But it comes with some caveats of course. You can only do a good job predicting behaviors that you have seen in the past: you must have data, lots of data. It can be historical or fabricated, but with fabricated data you will only find what you put into it. You need to know the outcome you are trying to predict for all past transactions you are analyzing. You must be confident that past performance is a reliable indicator of future performance.
Would a foreclosure score manufactured before the 2008 recession still be accurate now? Well, given that some people game foreclosure now when the value of the house is way below the mortgage they contracted, and those scores did not take this new behavior into account, I would opt for a new score if I were looking for accuracy.
The beauty of predictive scores is that you can pull a number of characteristics and let the tooling assess how well they predict the target behavior, without knowing in advance what the secret sauce might be. The tooling will guide you through the assembly of the mathematical expression that calculates the score, even though before using the tools you had no idea that a credit score is a good indicator of the propensity to have an accident. Who would have known?
Of course, once the scoring formula is created, the predictor set won’t change. If that is what you are looking for then you should turn to self-learning and/or adaptive systems.
The score computed for each customer can then be used in the context of business rules in an automated system:
- If the risk is low, then accept the mortgage
- If the customer is likely to churn, then offer free services for 24-month commitment
- If activity is likely to be fraudulent, then route the transaction to a case worker
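The rules above can be sketched as a small decision function that takes scores as input. Everything specific here is an assumption for illustration: the `Scores` fields, the cut-off values, the rule priority (fraud first) and the fallback of referring the case to an underwriter. A business rules engine would let business users maintain such rules without writing code.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Hypothetical model outputs for one customer, each between 0 and 1."""
    risk: float   # likelihood of default
    churn: float  # likelihood of leaving
    fraud: float  # likelihood that the activity is fraudulent


# Hypothetical cut-offs; in practice these come from the business strategy.
LOW_RISK = 0.05
HIGH_CHURN = 0.60
HIGH_FRAUD = 0.80


def decide(scores):
    """Apply the rules in priority order and return the resulting actions."""
    if scores.fraud >= HIGH_FRAUD:
        # Suspected fraud goes to a person; no automated offer is made.
        return ["route to case worker"]
    actions = []
    if scores.risk <= LOW_RISK:
        actions.append("accept mortgage")
    if scores.churn >= HIGH_CHURN:
        actions.append("offer free services for 24-month commitment")
    if not actions:
        # Assumed fallback: borderline cases are left to an expert.
        actions.append("refer to underwriter")
    return actions


print(decide(Scores(risk=0.02, churn=0.70, fraud=0.10)))
print(decide(Scores(risk=0.02, churn=0.70, fraud=0.90)))
```

The share of cases that end in an automated action rather than a referral is the automation rate of the service.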
The score by itself does not do any good, but when it is put in the context of a decisioning service it helps increase the precision of your strategies. You can pinpoint which customers should get red-carpet treatment (those likely to respond well, as in future $$$) and which ones to let go. As with BI, you want to make better educated decisions, but with predictive analytics you can shorten the path from insight to operationalization. No human needs to be involved in reviewing a dashboard or a score; a decisioning service might do it for you, as much as you allow it. Often underwriters target an 80% automation rate, knowing that some applications will always be borderline or exceptions, requiring experts to look into them.
Data Mining
Data Mining is really the middle ground between BI and Predictive Analytics. It goes beyond the automated generation of reports but it does not go as far as creating a score.
In fact, Predictive Analytics leverages Data Mining, in the sense that it helps uncover relationships that you did not necessarily know of. Using some clustering techniques for example, you can determine affinity between characteristics. The most famous example, which we hear about all the time, is beer and diaper sales late at night. It may initially sound strange when you see the correlation but when you think about it… Late at night, when someone buys diapers, it is more likely that the buyer is Dad rather than Mom, as she is likely taking care of the little one. Dads alone in the supermarket statistically often take advantage of the time to purchase a pack of beer. Hence a strong correlation between diapers and beer for purchases happening late in the evening.
Visualization tools and sophisticated algorithms in the hands of a modeling team can do wonders to uncover those sometimes strange behaviors. The information is golden for companies, as they may increase their sales by marketing those items via coupons or loyalty programs. If you buy diapers late at night but have never purchased beer, we can give you a coupon for beer…
You can argue that Data Mining is a segment of Business Intelligence and that is fine. I want to point out that it exists though, and that it can be extremely valuable.
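To give a feel for how such an affinity can be measured, here is a minimal sketch that uses association measures (support, confidence and lift) rather than clustering, because they fit in a few lines. The baskets are made up for illustration and the function names are mine.

```python
# Made-up late-night shopping baskets, for illustration only.
BASKETS = [
    {"diapers", "beer"},
    {"diapers", "beer", "chips"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
    {"bread", "chips"},
    {"milk"},
]


def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for basket in baskets if wanted <= basket) / len(baskets)


def confidence(baskets, a, b):
    """Among baskets containing a, the fraction that also contain b."""
    return support(baskets, [a, b]) / support(baskets, [a])


def lift(baskets, a, b):
    """Above 1: a and b co-occur more often than independence would predict."""
    return support(baskets, [a, b]) / (support(baskets, [a]) * support(baskets, [b]))


diaper_beer_confidence = confidence(BASKETS, "diapers", "beer")
diaper_beer_lift = lift(BASKETS, "diapers", "beer")

print(f"confidence(diapers -> beer) = {diaper_beer_confidence:.2f}")
print(f"lift(diapers, beer) = {diaper_beer_lift:.2f}")
```

A lift well above 1 flags a pair worth a closer look. It does not explain why the pair sells together; the Dad story still has to come from someone who knows the business.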
Other practical examples of how to leverage data mining pop up everywhere on retail sites like amazon.com:
- Frequently Bought Together…
- What Do Customers Ultimately Buy After Viewing This Item?
- Customers Who Bought … Also Bought…
- Customers Who Bought Items in Your Recent History Also Bought…
Not only are recommendations useful for the merchant but they can also become a valuable service for the end-user.
I could go on and on, but I hope that this first set of definitions helps dispel some of the confusion around the big word. I will cover more details on a slight variant called real-time BI in a separate post, as this one is already getting too long!