With businesses and consumers increasingly able to create and interact with data, interest in analytics is exploding, including its more technical aspects, such as predictive analytics.
Organizations are providing APIs that give others access to their data, so that interesting work can be done with it, sometimes by mashing data sets together in innovative ways.
While much of the work has focused on visualizing data in ways that a domain specialist can make sense of, some has gone into the creation of analytical models: models that extract from the data the information pertaining to a particular problem and allow behavior to be projected into the future or onto new cases.
Creating models, though, has by and large remained a difficult task. Numerous issues related to data quality, variable selection, model type selection, model quality, model stability, etc., have to be dealt with, and the problem specialists face a bewildering array of techniques and compromises to address them. Typically, managing these models ends up involving problem specialists, analytics specialists, data specialists and software specialists, each with their own concepts, vocabulary, processes and tools. It is no surprise that it sometimes takes months simply to update existing models, and a long time to develop them.
Technology that caters to the needs of all these groups in a consistent and coordinated way should help ease the culture and process issues.
The traditional tools used for model building, such as those provided by SAS, IBM/SPSS or the open source community (the R project), cater to the analytics specialist and somewhat to the data specialist, but they do not really help the problem specialist. Even the highly graphical, partially point-and-click environments they provide, such as SAS Enterprise Miner, do not really help the problem specialist create and manage models: there is too much complexity, too many choices, and too much to understand about the fine details of the chosen model type and its interplay with the variables and data.
Others, such as KXEN, have sought to remove much of the complexity by automating as much of the modeling process as possible, relying essentially on one kind of approach (of course, it’s more complicated than that, but that’s the gist of it).
Microsoft Excel provides a simple metaphor for manipulating data that is commonly understood and accessible to problem specialists. But it lacks the hard-core support for model management that the tools above provide.
SAS just announced SAS Rapid Predictive Modeler, which represents a new direction that analytics companies are taking, similar to the one the expert system AI companies took in the late ’90s when they introduced Business Rules Management.
With this tool, the problem specialist can:
- Ask the tool to identify variables and create many different models, with no need to understand the details of the technology being used
- View how well the models predict
- Do all of this from an environment they understand well (Office)
- Still generate models that the analytics specialists can further study and modify if needed, within an environment that they in turn understand well (SAS Enterprise Miner)
The last point is important: the tool creates model management process flows in SAS Enterprise Miner that the analytics specialist can leverage, improve and extend, all within their own world.
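To make the "create many models, then view how well they predict" idea concrete, here is a minimal Python sketch. It is not how SAS Rapid Predictive Modeler works internally. The one-variable threshold models, the function names (`fit_stump`, `score`, `rank_models`) and the toy data are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Stump:
    """Hypothetical one-variable model: predict 1 when row[variable] >= threshold."""
    variable: str
    threshold: float
    holdout_accuracy: float = 0.0  # filled in by rank_models

    def predict(self, row):
        return 1 if row[self.variable] >= self.threshold else 0


def score(model, rows, target):
    """Fraction of rows whose target value the model predicts correctly."""
    hits = sum(1 for row in rows if model.predict(row) == row[target])
    return hits / len(rows)


def fit_stump(variable, rows, target):
    """Pick the observed value that works best as a threshold on the training rows."""
    candidates = sorted({row[variable] for row in rows})
    best = max(candidates, key=lambda t: score(Stump(variable, t), rows, target))
    return Stump(variable, best)


def rank_models(train, holdout, target):
    """Build one candidate model per input variable, then rank by holdout accuracy."""
    variables = [name for name in train[0] if name != target]
    models = [fit_stump(v, train, target) for v in variables]
    for model in models:
        model.holdout_accuracy = score(model, holdout, target)
    return sorted(models, key=lambda m: m.holdout_accuracy, reverse=True)


# Made-up toy data: did a customer respond to an offer?
train = [
    {"visits": 1, "age": 30, "responded": 0},
    {"visits": 2, "age": 52, "responded": 0},
    {"visits": 3, "age": 41, "responded": 0},
    {"visits": 6, "age": 35, "responded": 1},
    {"visits": 7, "age": 48, "responded": 1},
    {"visits": 9, "age": 29, "responded": 1},
]
holdout = [
    {"visits": 2, "age": 45, "responded": 0},
    {"visits": 4, "age": 33, "responded": 0},
    {"visits": 8, "age": 38, "responded": 1},
    {"visits": 5, "age": 50, "responded": 1},
]

ranked = rank_models(train, holdout, "responded")
for model in ranked:
    print(model.variable, model.threshold, model.holdout_accuracy)
```

The problem specialist only supplies the data and the name of the outcome column. The ranked list of fitted models is the kind of artifact an analytics specialist could then pick up and refine.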
With tools such as this, the initiative in creating models comes back to the problem specialist without creating friction with the analytics specialist. That becomes possible because the tool does not make the user worry about the underlying technology (see what Carole-Ann wrote on this earlier), and because the user interaction takes place in an environment that the problem specialists get (the #1 issue in Decision Management).
The promise of SAS Rapid Predictive Modeler is great – we will see how well it stands the test of the market. It is a smart move for SAS, and a good move towards making predictive analytics a more accessible part of Decision Management.