With the world on a partial lockdown due to COVID-19, we had to be creative. DecisionCAMP 2020 takes place virtually this year, through Zoom presentations and Slack interactions. The show invited me to present ahead of the event.
Watch my DecisionCAMP 2020 presentation now
I decided to tackle one of the most common rules designs. Though I hope that you will implement it in SMARTS, it is technology-agnostic. As such, you could adopt it regardless of the decision management system that you use.
A decision management system obviously makes decisions. These decisions can boil down to a yes or no answer. In many circumstances, the decision includes several sub-components, in addition to that primary decision. For this design pattern, however, I only focus on the primary decision. Note that you could use the same design applied to any sub-decision as well. This is a limitation of the presentation, not one of the design.
In an underwriting system, for example, the final approval derives from many different data points. The system looks at self-disclosures regarding the driver(s) and the vehicle(s), but also at third-party data sources like DMV reports. If the rules make that decision as they go through the available data, there is a risk of an inadvertent decision override. Hence the need for a design pattern that collects these point decisions, or intermediate decisions, and makes the final decision at the end. In this presentation, I illustrate how to do it in a simple and effective manner.
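To make the pattern concrete, here is a minimal Python sketch, not the SMARTS implementation. The outcome names, the severity ordering, and the `DecisionCollector` class are all assumptions made for illustration. Rules only record point decisions, and the final decision is computed once, at the end, so a later rule cannot silently override an earlier one.

```python
from dataclasses import dataclass, field

# Hypothetical severity order: the most severe point decision wins.
SEVERITY = {"approve": 0, "refer": 1, "decline": 2}


@dataclass
class DecisionCollector:
    """Accumulates point decisions instead of overwriting a single field."""
    points: list = field(default_factory=list)

    def record(self, outcome: str, reason: str) -> None:
        # Rules call this as they go through the data; nothing is decided yet.
        self.points.append((outcome, reason))

    def final(self) -> str:
        # The final decision is made once, after all rules have run.
        if not self.points:
            return "approve"  # assumption: no objection means approval
        return max(self.points, key=lambda p: SEVERITY[p[0]])[0]


collector = DecisionCollector()
collector.record("decline", "DMV report: license suspended")
collector.record("approve", "vehicle age within guidelines")
final_decision = collector.final()
print(final_decision)
```

Because the later "approve" is only recorded, never assigned, it does not override the earlier "decline". The collected reasons also remain available for explaining the outcome.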
Let’s take on another challenge from the Decision Management community. The Dynamic Loan Evaluation challenge looks very applicable to what our customers do. In this scenario, the business logic is very simple, but data is uncovered over time. As a result, the underwriting decision changes each time new information comes to light. Overall, this project illustrates the combination of a point decision, the loan origination, and a changing series of facts.
Click here to watch the demo
Key takeaways from this dynamic loan evaluation challenge
The business rules behind this loan evaluation do not present much difficulty. While a real system typically includes many eligibility criteria, this evaluation relies solely on a risk exposure measurement. If the assets exceed the obligations, we approve the loan. However, if the obligations exceed the assets, we decline the loan.
As assets and debts get uncovered, the balance tips from one side to the other. While this feels like a dynamic interaction, I prefer to design loan origination systems as stateless services. This means that you expose all the information available, and the business rules render the verdict. In short, the decision service does not keep track of the history.
On the other hand, the origination system must handle the dynamic aspect of the loan. In this challenge, I use a dynamic questionnaire to capture the data. The dynamic questionnaire collects all borrower and guarantor information over time. As the loan agent, I append the new facts to the in-flight application, and submit the whole thing for evaluation.
I like to cleanly separate the business logic from the questionnaire logic. This project illustrates this design perfectly.
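The split between a stateless decision service and a stateful origination system can be sketched as follows. This is an illustration only: the `evaluate_loan` function, the shape of the application, and the amounts are hypothetical. The source does not say what happens when assets equal obligations, so treating a tie as a decline is an assumption.

```python
def evaluate_loan(application: dict) -> str:
    """Stateless decision: the verdict depends only on the facts passed in."""
    facts = application["facts"]
    assets = sum(f["amount"] for f in facts if f["kind"] == "asset")
    obligations = sum(f["amount"] for f in facts if f["kind"] == "obligation")
    # Assumption: a tie is treated as a decline.
    return "approve" if assets > obligations else "decline"


# The origination side owns the history: it appends each new fact to the
# in-flight application and resubmits the whole thing for evaluation.
application = {"facts": []}
verdicts = []
for fact in [
    {"kind": "asset", "amount": 50_000},
    {"kind": "obligation", "amount": 80_000},
    {"kind": "asset", "amount": 40_000},
]:
    application["facts"].append(fact)
    verdicts.append(evaluate_loan(application))

print(verdicts)
```

The verdict flips as facts come to light, yet the decision function itself never remembers a previous call.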
If you want to try building the demo by yourself, feel free to ask for a free evaluation.
Our friend Jacob posted a Decision Management challenge this month. SaaS pricing can prove to be a challenge to calculate when combining volume discount and special incentive. In this challenge, I demonstrate how to take advantage of test cases to safely write these pricing rules. Click here to watch the demo
Key takeaways from this challenge
As I hinted, testing against expected outcomes saves a ton of time during rules writing. While I ran into several typos in my data entry, I felt comfortable that my rules were correct (after correcting the data entry).
Seeing is believing. Being able to see what the rules assign to each tier allows for a quick understanding of what was left to do.
Finally, without seeing the calculations in place, completing that challenge would likely have taken me a lot more time (and a headache). It helps a great deal to rely on intermediate calculations when rules end up somewhat convoluted.
While this is a simple pricing use case, business rules can quickly create a combinatorial explosion of scenarios. Decompose your problem into a few simple steps, and check your progress until you cover all of your test cases.
1. what tier do you start in
2. move extra units to the next tier
3. apply special incentive if applicable
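The three steps above can be sketched in Python. The tier sizes, unit prices, and incentive rate below are made up for illustration and are not the figures from the challenge. The intermediate allocation is kept visible, and the expected outcomes are written down as test cases before trusting the result.

```python
# Hypothetical tiers: (units that fit in the tier, price per unit).
# None means the last tier is unbounded.
TIERS = [(100, 10.0), (400, 8.0), (None, 5.0)]


def allocate(units: int) -> list:
    """Steps 1 and 2: fill the first tier, then move extra units to the next."""
    allocation = []
    remaining = units
    for capacity, unit_price in TIERS:
        in_tier = remaining if capacity is None else min(remaining, capacity)
        allocation.append((in_tier, unit_price))
        remaining -= in_tier
    return allocation


def price(units: int, incentive_rate: float = 0.0) -> float:
    """Step 3: apply the special incentive, if any, to the tiered subtotal."""
    subtotal = sum(n * p for n, p in allocate(units))
    return round(subtotal * (1 - incentive_rate), 2)


# Test cases with expected outcomes: (units, incentive rate, expected price).
TEST_CASES = [(50, 0.0, 500.0), (250, 0.0, 2200.0), (600, 0.0, 4700.0), (250, 0.1, 1980.0)]
for units, rate, expected in TEST_CASES:
    assert price(units, rate) == expected, (units, rate, price(units, rate))

print(allocate(250))
```

Printing the allocation shows how many units landed in each tier, which is the kind of intermediate calculation that makes a wrong rule easy to spot.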
If you want to try building the demo by yourself, feel free to ask for a free evaluation.
One of the subjects that frequently comes up when considering decision engines is performance, and, more broadly, the performance characterization of decisions: how do decision engines cope with high throughput, low response time, high concurrency scenarios?
In fact, the whole world of forward-chaining business rules engines (based on the RETE algorithm) was able to develop largely because of the capability of that algorithm to cope with some of the scalability challenges that were seen. If you want to know more about the principles of RETE, see here.
However, there is significantly more to decision engine performance than the RETE algorithm.
A modern decision engine, such as SMARTS, will be exercised in a variety of modes:
- Interactive invocations where the caller sends data or documents to be decided on, and waits for the decision results
- Batch invocations where the caller streams large data or sets of documents through the decision engine to get the results
- Simulation invocations where the caller streams large data and sets of documents through the decision to get both the decision results and decision analytics computations made on them
Let’s first look at performance as purely execution performance at the decision engine level.
The SMARTS decision engine allows business users to implement decisions using a combination of decision logic representations:
- Decision flows
- Rules groups in rule sets
- Lookup models
- Predictive models
These different representations provide different perspectives on the logic, letting you pick the most suitable representation for implementing, reviewing, and optimizing the decision. For example, SMARTS allows you to cleanly separate the flow of control within your decision from your smaller decision steps: check that the document data is valid first, then apply knock-out rules, and so on.
However, SMARTS also does something special for these representations: it executes them with dedicated engines tuned for high performance for their specific task. For example, if you were to implement a predictive model on top of a rules engine, your result would typically be sub-par. However, in SMARTS, each predictive model is executed by a dedicated and optimized execution engine.
Focusing on what is traditionally called business rules, SMARTS provides:
- A compiled sequential rules engine
- A compiled Rete-NT inference rules engine
- A fully indexed lookup model engine
These are different engines that apply to different use cases, and each is optimized for its specific application.
Compiled Sequential Rules Engine
This engine simply takes the rules in the rule set, orders them by explicit or implicit priority, and evaluates the rule guards and premises in the resulting order. Once a rule has its guard and premise evaluated to true, it fires. If the rule set is exclusive, the rule set evaluation is over, and if not, the next rule in the ordered list is evaluated.
There is a little bit more than that to it, but that’s the gist.
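As a rough illustration of that gist, here is a small interpreted Python sketch. It is not how SMARTS does it: SMARTS compiles the rules, and the `Rule` structure and the example rules here are hypothetical. It orders rules by priority, evaluates guard and premise, fires the rule, and stops after the first firing when the rule set is exclusive.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    priority: int
    guard: Callable[[dict], bool]    # is the rule applicable to this document?
    premise: Callable[[dict], bool]  # does the rule condition hold?
    action: Callable[[dict], None]   # what the rule does when it fires


def run_sequential(rules, doc, exclusive=False):
    """Evaluate rules in priority order; return the names of the rules that fired."""
    fired = []
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if rule.guard(doc) and rule.premise(doc):
            rule.action(doc)
            fired.append(rule.name)
            if exclusive:
                break  # exclusive rule set: evaluation is over
    return fired


rules = [
    Rule("small_amount", 1, lambda d: "amount" in d, lambda d: d["amount"] < 1000,
         lambda d: d.update(status="approve")),
    Rule("knock_out", 10, lambda d: "age" in d, lambda d: d["age"] < 18,
         lambda d: d.update(status="decline")),
]

doc = {"age": 16, "amount": 500}
fired = run_sequential(rules, doc, exclusive=True)
print(fired, doc["status"])
```

With `exclusive=False`, both rules would fire in priority order and the lower-priority rule would have the last word on `status`.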
The important point is that there is no interpreter involved – this is executed in code compiled to the bytecode of the chosen architecture (Java or .NET or .NET Core). So, this executes at bytecode speed and gets optimized by the same JITs as any code in the same architecture.
This yields great performance when the number of transactions is very large, and the average number of rules evaluated (i.e. having their guards and premises evaluated) is not very large. For example, we’ve implemented batch fraud systems processing over 1 billion records for fraud in less than 45 minutes on 4-core laptops.
When the number of rules becomes very large, in the 100K+ range in a single rule set, the cost of evaluating premises that do not match starts getting too high. Our experience is that, with that number of rules, it is very likely that your problem is in fact a lookup problem and would be better served by a lookup model. (As an aside, lookup models also provide superior management for large numbers of rules.) If that is not the case, then a Rete-NT execution would be better.
Compiled Rete-NT Inference Rules Engine
This engine takes the rules within the rule set and converts them to a network following the approach described in this blog post. What this approach does is invert the paradigm: in RETE, the data is propagated through the network, and the rules ready to fire are found more efficiently and put in an agenda. The highest-priority rule in the agenda is retrieved and executed. The corresponding changes get propagated into the network, the agenda is updated, and so on, until there is nothing left in the agenda.
One important distinction with respect to the sequential engine is that in the case of the Rete-NT engine, a rule that already fired may well be put back in the agenda. This capability is sometimes required by the problem being solved.
Again, there is much more to it, but this is the principle.
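The agenda cycle described above can be illustrated with a deliberately naive Python loop. To be clear, this is not Rete or Rete-NT: there is no network, and every rule is re-matched on each cycle, which is exactly the cost the real algorithm avoids. The rules and facts are hypothetical. The sketch only shows the agenda behavior, including a rule that fires, changes the facts, and lands back in the agenda.

```python
def run_agenda(rules, facts, max_cycles=100):
    """rules: list of (priority, name, condition, action). Returns the firing order."""
    fired = []
    for _ in range(max_cycles):  # safety bound for this sketch
        # Naive match step: a real Rete network updates the agenda incrementally.
        agenda = [rule for rule in rules if rule[2](facts)]
        if not agenda:
            break  # nothing left in the agenda
        _, name, _, action = max(agenda, key=lambda rule: rule[0])
        action(facts)  # the changes are visible to the next match step
        fired.append(name)
    return fired


def charge_fee(facts):
    facts["balance"] -= 100
    facts["fees"] += 1


def flag_low(facts):
    facts["flagged"] = True


rules = [
    (10, "charge_fee", lambda f: f["balance"] > 100, charge_fee),
    (1, "flag_low", lambda f: f["balance"] <= 100 and not f["flagged"], flag_low),
]

facts = {"balance": 250, "fees": 0, "flagged": False}
firing_order = run_agenda(rules, facts)
print(firing_order)
```

Here `charge_fee` fires twice because its own action leaves its condition true the first time, which a single sequential pass would not do.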
SMARTS implements the Rete-NT algorithm, which is the latest optimized version of the algorithm provided by its inventor, Charles Forgy, who serves on the Sparkling Logic Advisory Board. Rete-NT has been benchmarked to be between 10 and 100 times faster than previous versions of the algorithm in inference engine tests. In addition, SMARTS compiles to the bytecode of the chosen architecture everything that is not purely the algorithm, allowing all expressions to be evaluated at bytecode speed.
In the case where your number of rules per rule set is very large, in the 100K+ range, and you are not dealing with a lookup model, the Rete-NT engine yields increasingly better performance compared to the sequential engine. SMARTS has been used with 200K+ rules in rule sets; these rules end up exercising hundreds of fields in your object model, and the Rete-NT advantages make these rule sets perform better than the pure sequential engine.
Fully Indexed Lookup Model Engine
There are numerous cases where a rule set with a large number of rules is really selecting from a large number of possibilities. For example, going through a list of medication codes to find those matching parametrized conditions.
In many systems, that is done outside a decision engine, but in some cases, it makes sense to make it part of the decision. For example, when it is managed by the same people and at the same time, it is intimately tied to the rest of the decision logic, and it needs to go through its lifecycle exactly like the rules.
SMARTS provides a Lookup Model specifically for those cases: you specify the various possibilities as data, and then a query that is used to filter from the data the subset that matches the parameters being used. At a simplified level, this engine works by essentially doing what database engines do: indexing the data as much as possible and converting the query into a search through indexes. Because the search is fully indexed, scalability with the number of possibilities is great, and superior to what you can get with the Sequential or Rete-NT engine.
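The indexing idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the SMARTS engine: the `LookupModel` class, its equality-only queries, and the medication rows are all hypothetical. Each field gets an inverted index, and a query becomes an intersection of index hits rather than a scan over every row.

```python
from collections import defaultdict


class LookupModel:
    def __init__(self, rows, indexed_fields):
        self.rows = rows
        # One inverted index per field: value -> set of row positions.
        self.indexes = {f: defaultdict(set) for f in indexed_fields}
        for pos, row in enumerate(rows):
            for f in indexed_fields:
                self.indexes[f][row[f]].add(pos)

    def query(self, **criteria):
        """Return the rows matching all equality criteria, using index lookups only."""
        matches = None
        for f, value in criteria.items():
            hits = self.indexes[f].get(value, set())
            matches = hits if matches is None else matches & hits
        if matches is None:  # no criteria: everything matches
            return list(self.rows)
        return [self.rows[p] for p in sorted(matches)]


rows = [
    {"code": "M001", "form": "tablet", "schedule": "II"},
    {"code": "M002", "form": "tablet", "schedule": "IV"},
    {"code": "M003", "form": "injection", "schedule": "II"},
]
model = LookupModel(rows, ["form", "schedule"])
codes = [r["code"] for r in model.query(form="tablet", schedule="II")]
print(codes)
```

The cost of a query depends on the size of the index hits, not on the total number of possibilities, which is why this shape scales better than evaluating one rule per row.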
The SMARTS decision engine can also run concurrent branches of the decision logic, implemented in any combination of the engines above. That allows a single invocation to the decision engine to execute on more than 1 core of your system. There are heavy computation cases in which such an approach yields performance benefits.
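Structurally, running independent branches of a decision concurrently and merging their results could look like the Python sketch below. The branch functions and field names are hypothetical. In CPython, threads illustrate the structure rather than a true multi-core speedup for CPU-bound work, so read this as a shape, not a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def fraud_branch(doc):
    # Hypothetical branch: a stand-in for a fraud model or rule set.
    return {"fraud_score": 0.9 if doc["amount"] > 10_000 else 0.1}


def credit_branch(doc):
    # Hypothetical branch: a stand-in for affordability rules.
    return {"credit_ok": doc["income"] >= 3 * doc["payment"]}


def decide(doc):
    branches = [fraud_branch, credit_branch]
    # The branches only read the document, so they can run side by side.
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        partials = list(pool.map(lambda branch: branch(doc), branches))
    merged = {}
    for partial in partials:
        merged.update(partial)
    # The final step runs once all branches have completed.
    merged["approved"] = merged["credit_ok"] and merged["fraud_score"] < 0.5
    return merged


result = decide({"amount": 5_000, "income": 6_000, "payment": 1_500})
print(result)
```

Keeping the branches free of shared writes is what makes the merge step trivial.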
Of course, performance is also impacted by deployment architecture choices. We’ll explore that topic in the next post in this series.
Integration with data is key to a successful decision application: Decision Management Systems (DMS) benefit from leveraging data to develop, test and optimize high value decisions.
This blog post focuses on the usage of data by the DMS for the development, testing and optimization of automated decisions.
A key benefit of using a Decision Management System is to allow the life-cycle of automated decisions to be fully managed by the enterprise.
When the decision logic remains in the application code, it becomes difficult to separate access to decision logic code from the rest. For example, reading through pages of commit comments to find the ones relevant to the decision is close to impossible. And so is ensuring that only resources with the right roles can modify the logic.
Clearly, this leads to the same situation you would be in if your business data were totally immersed in the application code. You would not do that for your business data, and you should not do it for your business decision logic, for exactly the same reasons.
Decision Management Systems separate the decision logic from the rest of the code. Thus, you get the immense benefit of being able to update the decision logic according to the business needs. But the real benefit comes when you combine that with authentication and access control:
- you can control who has access to what decision logic asset, and for what purpose
- and you can trace who did what to which asset, when and why
Of course, a lot of what is written here applies to other systems than Decision Management Systems. But this is particularly important in this case.
Roles and access control
The very first thing to consider is how to control who has access to what in the DMS. This is access control — but note that we also use authorization as an equivalent term.
In general, one thinks of access control in terms of roles and assets. Roles characterize how a person interacts with the assets in the system.
And the challenge is that there are many roles involved in interacting with your automated decision logic. The same physical person may fill many roles, but those are different roles: they use the decision management system in different ways. In other words, these different roles have access to different operations on different sets of decision logic assets.
Base roles and access control needs
Typically, and this is of course not the only way of splitting them, you will have roles such as the following:
- Administrator
The administrator role administers the system but is rarely involved in anything else. In general, IT or operations resources are those with this role.
- Decision definer
The decision definer role is a main user role: this role is responsible for managing the requirements for the automated decision and its expected business performance. Typically, business owners and business analysts are assigned this role.
- Decision implementer
The decision implementer role is the other main user role: this role designs, implements, tests and optimizes decisions. Generally, business analysts, data analysts or scientists, decision owners, and sometimes business-savvy IT resources are given this role.
- Decision tester
The decision tester role is involved in business testing of the decisions: validating they really do fit what the business needs. Usually, business analysts, data analysts and business owners fill this role.
- Life-cycle manager
The life-cycle manager role is responsible for ensuring that enterprise-compliant processes are followed as the decision logic assets go from requirements to implementation to deployment and retirement.
More advanced needs
There may be many other roles, and the key is to realize that how the enterprise does business impacts what these roles may be. For example, our company has a number of enterprise customers who have two types of decision implementer roles:
- General decision implementer: designs, implements the structure of the decision and many parts of it, tests and optimizes it
- Restricted decision implementer: designs and implements only parts of the decision — groups of rules, or models
The details on what the second role can design and implement may vary from project to project, etc.
Many other such roles may be defined: for example, those who can modify anything but the contract between the automated decision and the application that invokes it.
It gets more complicated: you may also need to account for the fact that only specific roles can manage certain specific assets. For example, you may have a decision that incorporates a rate computation table that only a few resources can see, although it is part of what the system manages and executes.
Requirements for the Decision Management System
Given all this, the expectation is that the DMS support directly, or through an integration with the enterprise systems, the following:
- Role-based access control to the decision logic asset
- And ability to define custom roles to fit the needs of the enterprise and how it conducts its business
- And ability to have roles that control access to specific operations on specific decision logic assets
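A minimal sketch of what these requirements imply is shown below. The role names, operations, and asset names are hypothetical, and a real DMS would tie this to the enterprise authentication system. Custom roles are simply sets of grants, and each grant pairs an operation with a specific asset.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    # Grants are (operation, asset) pairs; the asset "*" matches any asset.
    grants: set = field(default_factory=set)

    def allows(self, operation, asset):
        return (operation, asset) in self.grants or (operation, "*") in self.grants


def is_allowed(user_roles, operation, asset):
    """A user may fill several roles; any one of them can grant the operation."""
    return any(role.allows(operation, asset) for role in user_roles)


# Custom roles defined to fit how a given enterprise works (all hypothetical).
restricted_implementer = Role(
    "restricted_implementer",
    {("read", "eligibility_rules"), ("edit", "eligibility_rules")},
)
rate_analyst = Role("rate_analyst", {("read", "rate_table"), ("edit", "rate_table")})
tester = Role("decision_tester", {("run_tests", "*")})

can_edit_rules = is_allowed([restricted_implementer], "edit", "eligibility_rules")
can_see_rates = is_allowed([restricted_implementer, tester], "read", "rate_table")
print(can_edit_rules, can_see_rates)
```

The restricted implementer can work on a group of rules but cannot even read the rate table, which only the rate analyst role can see.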
This can be achieved in a few ways. In general:
- If all decision assets are in a system which is also managed by the enterprise authentication and access control system: you can directly leverage it
- And if that is not the case: you delegate authentication and basic access control to the enterprise authentication and access control system, and manage the finer-grained access control in the DMS, tied to the external authentication
Of course, roles are attached to a user, and in order to guarantee that the user is the right one, you will be using an authentication system. There is a vast number of such systems in the enterprise, and they play a central role in securing the assets the enterprise deals with.
The principle is that for each user that needs to have access to your enterprise systems, you will have an entry in your authentication system. Thus, the authentication system will ensure the user is who the user claims, and apply all the policies the enterprise wants to apply: two-factor authentication, challenges, password changes, etc. Furthermore, it will also control when the user has access to the systems.
This means that all systems need to make sure a central system carries out all authentications. And this includes the Decision Management System, of course. For example:
- The DMS is only accessible through another application that does the proper authentication
- Or it delegates the authentication to the enterprise authentication system
The second approach is more common in a services world with low coupling.
Requirements for the Decision Management System
The expectation is that the DMS will:
- Delegate its authentication to the enterprise authentication and access control systems
- Or use the authentication information provided by an encapsulating service
Vendors in this space have the challenge that in the enterprise world there are many authentication systems, each with potentially more than one protocol. Just in terms of protocols, enterprises use:
- OpenID Connect
- and more
Additionally, enterprises are interested in keeping a close trace of who does what and when in the Decision Management System. Of course, using authentication and the fact that users will always operate within the context of an authenticated session largely enables them to do so.
But this is not just a question of change log: you also want to know who has been active, who has exported and imported assets, who has generated reports, who has triggered long simulations, etc.
Furthermore, there are three types of usages for these traces:
- Situational awareness: you want to know what has been done recently and why
- Exception handling: you want to be alerted if a certain role or user carries out a certain operation. For example, when somebody updates a decision in production.
- Forensics: you are looking for a particular set of operations and want to know when, who and why. For example, for compliance verification reasons.
A persisted and queryable activity stream provides support for the first type of usage. An integration with the enterprise log management and communication management systems supports the other types of usage.
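The three usages can be illustrated with a small Python sketch. Everything here is hypothetical: an in-memory list stands in for persistence, and the `notify` callback stands in for the enterprise communication system. The stream records every operation, can be queried, and raises an alert when an alert rule matches.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Activity:
    when: datetime
    user: str
    role: str
    operation: str
    asset: str
    comment: str = ""


class ActivityStream:
    def __init__(self, alert_rules=(), notify=print):
        self.entries = []  # stand-in for a persisted store
        self.alert_rules = list(alert_rules)
        self.notify = notify  # stand-in for an alerting integration

    def record(self, activity):
        self.entries.append(activity)
        for rule in self.alert_rules:
            if rule(activity):
                self.notify(activity)  # exception handling

    def query(self, **criteria):
        """Situational awareness and forensics: filter on exact field values."""
        return [a for a in self.entries
                if all(getattr(a, k) == v for k, v in criteria.items())]


alerts = []
stream = ActivityStream(
    alert_rules=[lambda a: a.operation == "update" and a.asset.startswith("prod/")],
    notify=alerts.append,
)
now = datetime.now(timezone.utc)
stream.record(Activity(now, "alice", "decision_implementer", "export", "dev/pricing"))
stream.record(Activity(now, "bob", "administrator", "update", "prod/pricing", "hotfix"))
print(len(alerts), [a.user for a in stream.query(operation="export")])
```

Recording the comment alongside the operation is what lets a later query answer the "why" as well as the "who" and "when".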
Requirements for the Decision Management System
The expectation is that the DMS will:
- Provide an activity stream users can browse through and query
- And support an integration with the enterprise systems that log activity
- And provide an integration with the enterprise systems that communicate alerts
There are many more details related to these authentication, access control and trace integrations. Also, one interesting trend is the move towards taking all of these into account from the beginning as the IT infrastructure moves to the models common in the cloud, even when on-premises.
This blog is part of the Technical Series, stay tuned for more!
Decision Management and Business Rules Management platforms cater to the needs of business oriented roles (business analysts, business owners, etc.) involved in operational decisions. But they also need to take into account the constraints of the enterprise and its technology environment.
Among those constraints are the ones that involve integrations. This is the first in a series of posts exploring the requirements, approaches and trade-offs for decision management platform integrations with the enterprise ecosystem.
Operational decisions do not exist in a vacuum. They
- are embedded in other systems, applications or business processes
- provide operational decisions that other systems carry out
- are core contributors to the business performance of automated systems
- are critical contributors to the business operations and must be under tight control
- must remain compliant, traced and observed
- yet must remain flexible for business-oriented roles to make frequent changes to them
Each and every one of these aspects involves more than just the decision management platform. Furthermore, more than one enterprise system provides across-application support for these. Enterprises want to use such systems because they reduce the cost and risk involved in managing applications.
For example, authentication across multiple applications is generally centralized to allow for a single point of control on who has access to them. Otherwise, each application implements its own, and management costs and risk skyrocket.
In particular, decision management platforms end up being a core part of the enterprise applications, frequently as core as databases. It may be easy and acceptable to use disconnected tools to generate reports, or write documents; but it rarely is acceptable to not manage part of core systems. In effect, there is little point in offering capabilities which cannot cleanly fit into the management processes for the enterprise; the gain made by giving business roles control of the logic is negated by the cost and risk in operating the platform.
In our customer base, most do pay attention to integrations. Which integrations are involved, and with which intensity, depends on the customer. However, it is important to realize that the success of a decision management platform for an enterprise also hinges on the quality of its integrations to its systems.
Which integrations matter?
We can group the usual integrations for decision management platforms in the following groups:
- Authentication and Access Control
- Implementation Support
- Management Audit
- Life-cycle management
- Execution Audit
- Business Performance Tracking
Authentication and access control integrations are about managing which user has access to the platform, and, beyond that, to which functionality within the platform.
Implementation support integrations are those that facilitate the identification, implementation, testing and optimization of decisions within the platform: import/export, access to data, etc.
Management audit integrations enable enterprise systems to track who has carried out which operations and when within the platform.
Life-cycle management integrations are those that support the automated or manual transitioning of decisions through their cycles: from inception to implementation and up to production and retirement.
Similarly, execution integrations enable the deployment of executable decisions within the context of the enterprise operational systems: business process platforms, micro-services platforms, event systems, etc. Frequently, these integrations also involve logging or audit systems.
Finally, performance tracking integrations are about using the enterprise reporting environment to get a business-level view of how well the decisions perform.
Typically, different types of integrations interest different roles within the enterprise. The security and risk management groups will worry about authentication, access control and audit. The IT organization will pay attention to life-cycle management and execution. Business groups will mostly focus on implementation support and performance tracking.
The upcoming series of blog posts will focus on these various integrations: their requirements, their scope, their challenges and how to approach them.
In the meantime, you can read the relevant posts in the “Best Practices” series.
A little while ago, I ran into a question on Quora that hit me in the stomach… figuratively, of course. Someone asked, “Why do rules engines fail to gain mass adoption?” I had mixed feelings about it. On one hand, I am very proud of our decision management industry, and how robust and sophisticated our rules engines have become. On the other hand, I must admit that I see tons of projects not using this technology that would help them so much. I took a little time to reflect on the actual roadblocks to rules engines.
A couple of points I want to stress first
In addition to the points I make below, with a little more time to think about it, I think it boils down to evangelization. We, in the industry, have not been doing a good job educating the masses about the value of the technology, and its ease of use. We rarely get visibility up to the CxO level. Business rules are never one of the top 10 challenges of executives, though they might be in disguise. We need to do a better job.
I’m signing up, with my colleagues, for an active webinar series, so that we can address one of the roadblocks to rules engines, and decision management!
Rules are so important, they are already part of platforms
The other key aspect to keep in mind is that business rules are so important in systems that they often become a de facto component of the ecosystem. Business rules might be used in the form of BPM rules or other customization, but not called out as a usage for rules engines. Many platforms will claim they include business rules. The capability might be there, though it may not be as rich as a decision management system. Many vertical platforms like Equifax’s InterConnect platform include a full-blown decision management system though. When decision makers have to allocate the budget for technology, this becomes another of the roadblocks to rules engines, as they assume that rules are covered by the platform. Sometimes they are right, often not.
Rules in code is not a good idea
Let me stress once more that burying your rules in code or SQL procedures is not a good idea. Of all the roadblocks to rules engines, it is probably the excuse we have heard the most. I explain below that it is tempting for software developers to go back to their comfort zone. This is not sustainable. This is not flexible. We did that many decades ago as part of COBOL systems, mostly because decision management systems did not exist back then. We suffered with the maintenance of these beasts. In many cases, the maintenance was so painful that we had to patch the logic with pre-processing or post-processing to avoid touching the code. We have learned from those days that logic, when it is complex and/or when it changes decently often, needs to be externalized.
Business owners do not want to go to IT and submit a change request. They want to be able to see the impact of a change before they actually commit to the change. They want the agility to tweak their thresholds or rate tables within minutes or hours, not days and weeks. While rules need testing like any software, it is much more straightforward as it does not impact the code. It is just about QA testing and business testing.
Here is my answer:
I have been wondering the same thing. Several decades ago, I discovered expert systems at school, in my AI class. I fell in love with them, and even more so with rules engines as they were emerging as commercial products.
While coding is more powerful and intuitive than it used to be, the need is still there to make applications more agile. Having software developers change code is certainly more painful than changing business logic in a separate component, i.e., the decision service.
Some argue that the technology is difficult:
- ability to find issues
Is the technology too difficult to use?
Because of syntax
I can attest that writing LISP back in the day was anything but ‘intuitive’. Since then, thank God, rules syntax has improved, as programming syntax did too. Most business analysts I have worked with have found the syntax decently understandable (except for the rare rules engines that still use remnants of OPS5). With a little practice, the syntax is easily mastered.
Furthermore, advances in rules management have empowered rules writers with additional graphical representations like decision tables, trees, and graphs. At Sparkling Logic, we went a step further to display the rules as an overlay on top of transaction data. This is as intuitive as it gets in my opinion.
Because of debugging
The second point seems more realistic. When rules execute, they do not follow a traditional programmatic sequence. The rules that apply simply fire. Without tooling, you might have to have faith that the result will be correct. Or rely on a lot of test case data! Once again, technology has progressed and tooling is now available to indicate which rules fired for a given transaction, what path was taken (in the form of an execution trace), etc. For the savvy business analyst, understanding why rules did or did not execute has become a simple puzzle game… You just have to follow the crumbs.
So why are rules not as prevalent as they should be?
Is the technology too easy to not use?
Because of ownership
I am afraid to say that IT might be responsible. While it is now a no-brainer to delegate to a database for storage, and for some other commonly accepted components in the architecture for specialized functions, it remains a dilemma for developers to let business analysts take ownership. You need to keep in mind that business rules are typically in charge of the core of the system: the decisions. If bad decisions are made, or if no decision can be made, some heads will roll.
Because of budget
Even if the management of decisions is somewhat painful, and inflexible, it is a common choice to keep these cherished rules inside the code, or externalized in a database or configuration file. The fact that developers have multiple options to encode these rules is certainly not helping. First, they can see it as their moment of fun, for those that love to create their own framework (and there are plenty of them). Second, it does not create urgency for management to allocate a budget. It ends up being a build vs. buy decision.
Without industry analysts focused exclusively on decision management, less coverage by publications, and less advertisement by the tech giants, the evangelization of the technology is certainly slower than it deserves to be.
Yet, it should be used more…
Because of Decision Analytics
I would stress that the technology deserves to be known, and used. Besides agility and flexibility (key benefits of the technology), there is more that companies could benefit from. In particular, decision analytics are on top of my list. Writing and executing rules is clearly an important aspect. But I believe that measuring the business performance is also critical. With business rules, it is very easy to create a dashboard for your business metrics. You can estimate how good your rules are before you deploy them, and you can monitor how well they perform on a day-to-day basis.
Because of ease of integration
For architects, there are additional benefits too in terms of integration. You certainly do not want to rewrite rules for each system that needs to access them. Rules engines deployed as a component can be integrated with the online and batch system, and any changing architecture, without any rewrite, duplication, or any nightmare.
With that in mind, I hope that you will not let these roadblocks to rules engines stop you. There are plenty of reasons to consider decision management systems, or rules engines as they are often called. You will benefit greatly:
- Flexibility to change your decision logic as often and as quickly as desired
- Integration and deployment of predictive analytics
- Testing from a QA and business perspective
- Measurement of business performance in the sandbox and in production
- and yet, it will integrate beautifully in your infrastructure