One subject that frequently comes up when considering decision engines is performance and, more broadly, the performance characterization of decisions: how do decision engines cope with high-throughput, low-response-time, high-concurrency scenarios?
In fact, the whole world of forward-chaining business rules engines (based on the RETE algorithm) was able to develop largely because of the capability of that algorithm to cope with some of the scalability challenges that were seen. If you want to know more about the principles of RETE, see here.
However, there is significantly more to decision engine performance than the RETE algorithm.
A modern decision engine, such as SMARTS, will be exercised in a variety of modes:
- Interactive invocations where the caller sends data or documents to be decided on, and waits for the decision results
- Batch invocations where the caller streams large data or sets of documents through the decision engine to get the results
- Simulation invocations where the caller streams large data or sets of documents through the decision engine to get both the decision results and the decision analytics computed on them
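As a rough illustration of how these three modes differ from the caller's side, here is a minimal Python sketch. It is not the SMARTS API: `decide`, the three `invoke_*` functions, and the sample documents are all hypothetical names made up for this example.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Document = Dict[str, Any]


def decide(doc: Document) -> str:
    """Hypothetical stand-in for a deployed decision (not SMARTS logic)."""
    return "approve" if doc.get("amount", 0) <= 1000 else "refer"


def invoke_interactive(doc: Document) -> str:
    # Interactive: one document in, the caller waits for one decision.
    return decide(doc)


def invoke_batch(docs: Iterable[Document]) -> Iterator[str]:
    # Batch: documents are streamed through and results are yielded lazily.
    for doc in docs:
        yield decide(doc)


def invoke_simulation(docs: Iterable[Document]) -> Tuple[List[str], Counter]:
    # Simulation: the decision results plus analytics computed over them
    # (here, simply a count of each outcome).
    results = list(invoke_batch(docs))
    return results, Counter(results)


docs = [{"amount": 200}, {"amount": 5000}, {"amount": 750}]
single = invoke_interactive(docs[0])
batch = list(invoke_batch(docs))
sim_results, outcome_counts = invoke_simulation(docs)
```

The point of the sketch is only that the same decision is reused in all three modes; what changes is how data is fed in and what is collected on the way out.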
Let’s first look at performance as purely execution performance at the decision engine level.
The SMARTS decision engine allows business users to implement decisions using a combination of decision logic representations:
- Decision flows
- Rules groups in rule sets
- Lookup models
- Predictive models
These different representations provide different perspectives on the logic, so you can pick the most suitable representation for implementing, reviewing, and optimizing the decision. For example, SMARTS allows you to cleanly separate the flow of control within your decision from your smaller decision steps: first check that the document data is valid, then apply knock-out rules, and so on.
However, SMARTS also does something special for these representations: it executes each of them with a dedicated engine tuned for high performance on its specific task. For example, if you were to implement a predictive model on top of a rules engine, your results would typically be sub-par. In SMARTS, however, each predictive model is executed by a dedicated and optimized execution engine.
Focusing on what is traditionally called business rules, SMARTS provides:
- A compiled sequential rules engine
- A compiled Rete-NT inference rules engine
- A fully indexed lookup model engine
These are different engines that apply to different use cases, and each is optimized for its specific application.
Compiled Sequential Rules Engine
This engine simply takes the rules in the rule set, orders them by explicit or implicit priority, and evaluates the rule guards and premises in the resulting order. Once a rule has its guard and premise evaluated to true, it fires. If the rule set is exclusive, the rule set evaluation is over, and if not, the next rule in the ordered list is evaluated.
There is a little bit more than that to it, but that’s the gist.
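The evaluation loop just described can be sketched in a few lines of Python. This is only an illustration of the control flow, under assumptions of our own: the `Rule` class, `run_sequential`, the convention that a higher `priority` value is evaluated first, and the sample rules are all hypothetical. The sketch is interpreted Python and does not reflect how SMARTS compiles rules.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


@dataclass
class Rule:
    name: str
    priority: int                    # assumption: higher value is evaluated first
    guard: Callable[[Doc], bool]
    premise: Callable[[Doc], bool]
    action: Callable[[Doc], Any]


def run_sequential(rules: List[Rule], doc: Doc, exclusive: bool) -> List[str]:
    """Evaluate rules in priority order and return the names of fired rules."""
    fired: List[str] = []
    # sorted() is stable, so rules with equal priority keep declaration order.
    for rule in sorted(rules, key=lambda r: -r.priority):
        if rule.guard(doc) and rule.premise(doc):
            rule.action(doc)
            fired.append(rule.name)
            if exclusive:
                break                # exclusive rule set: stop at the first firing
    return fired


rules = [
    Rule("default", 0, lambda d: True, lambda d: True,
         lambda d: d.setdefault("decision", "approve")),
    Rule("underage", 10, lambda d: "age" in d, lambda d: d["age"] < 18,
         lambda d: d.update(decision="decline")),
    Rule("high_amount", 5, lambda d: "amount" in d, lambda d: d["amount"] > 10_000,
         lambda d: d.setdefault("flags", []).append("review")),
]

doc = {"age": 16, "amount": 20_000}
first_only = run_sequential(rules, dict(doc), exclusive=True)
all_fired = run_sequential(rules, dict(doc), exclusive=False)
```

With the exclusive setting only `underage` fires; without it, all three rules fire in priority order.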
The important point is that there is no interpreter involved – this is executed in code compiled to the bytecode of the chosen architecture (Java or .NET or .NET Core). So, this executes at bytecode speed and gets optimized by the same JITs as any code in the same architecture.
This yields great performance when the number of transactions is very large and the average number of rules evaluated (i.e., having their guards and premises evaluated) is not very large. For example, we’ve implemented batch fraud systems processing over 1 billion records in less than 45 minutes on 4-core laptops.
When the number of rules becomes very large, in the 100K+ range in a single rule set, the cost of evaluating premises that do not match starts getting too high. Our experience is that, with that number of rules, it is very likely that your problem is in fact a lookup problem and would be better served by a lookup model. (As an aside, lookup models also provide superior management for large numbers of rules.) If that is not the case, then a Rete-NT execution would be better.
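To put those figures in perspective, here is the back-of-the-envelope arithmetic. It assumes the numbers refer to a single 4-core machine with all cores busy, which the example above does not spell out, so treat the per-core figure as indicative only.

```python
records = 1_000_000_000   # "over 1 billion records"
minutes = 45              # "less than 45 minutes"
cores = 4                 # assumption: one 4-core machine, all cores in use

seconds = minutes * 60
records_per_second = records / seconds
records_per_core_per_second = records_per_second / cores
```

That works out to a sustained rate of at least roughly 370,000 records per second, or about 92,600 records per second per core under the stated assumption.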
Compiled Rete-NT Inference Rules Engine
This engine takes the rules within the rule set and converts them to a network following the approach described in this blog post. This approach inverts the paradigm: in RETE, the data is propagated through the network, and the rules ready to fire are found more efficiently and put in an agenda. The highest-priority rule in the agenda is retrieved and executed. The corresponding changes are propagated into the network, the agenda is updated, and so on, until there is nothing left in the agenda.
One important distinction with respect to the sequential engine is that in the case of the Rete-NT engine, a rule that already fired may well be put back in the agenda. This capability is sometimes required by the problem being solved.
Again, there is much more to it, but this is the principle.
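The agenda cycle can be illustrated with a deliberately naive Python sketch. It is not an implementation of RETE or Rete-NT: it re-evaluates every premise on every cycle, which is precisely the cost a Rete network avoids by propagating only the changes. The `Rule` class, `run_inference`, the `max_cycles` safety limit, and the sample rules are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Facts = Dict[str, Any]


@dataclass
class Rule:
    name: str
    priority: int
    premise: Callable[[Facts], bool]
    action: Callable[[Facts], None]


def run_inference(rules: List[Rule], facts: Facts, max_cycles: int = 1000) -> List[str]:
    """Naive match / select / fire loop; returns rule names in firing order."""
    fired: List[str] = []
    for _ in range(max_cycles):
        # Match: rebuild the agenda from scratch (a Rete network would
        # update it incrementally instead).
        agenda = [r for r in rules if r.premise(facts)]
        if not agenda:
            return fired             # nothing left in the agenda: done
        # Select: take the highest-priority rule from the agenda.
        rule = max(agenda, key=lambda r: r.priority)
        # Fire: the action changes the facts, which changes the next agenda.
        rule.action(facts)
        fired.append(rule.name)
    raise RuntimeError("agenda did not empty within max_cycles")


def count_item(f: Facts) -> None:
    f["items"] -= 1
    f["points"] += 10


def set_gold(f: Facts) -> None:
    f["tier"] = "gold"


rules = [
    Rule("count_item", 5, lambda f: f["items"] > 0, count_item),
    Rule("gold_tier", 10, lambda f: f["points"] >= 20 and f["tier"] is None, set_gold),
]

facts = {"items": 3, "points": 0, "tier": None}
fired = run_inference(rules, facts)
```

Here `count_item` fires, returns to the agenda, and fires again, while `gold_tier` only becomes eligible once earlier firings have changed the facts. That re-entry is what the sequential engine does not do.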
SMARTS implements the Rete-NT algorithm, which is the latest optimized version of the algorithm provided by its inventor, Charles Forgy, who serves on the Sparkling Logic Advisory Board. Rete-NT has been benchmarked to be between 10 and 100 times faster than previous versions of the algorithm in inference engine tests. In addition, SMARTS compiles everything that is not purely the algorithm to the bytecode of the chosen architecture, allowing all expressions to be evaluated at bytecode speed.
In the case where your number of rules per rule set is very large, in the 100K+ range, and you are not dealing with a lookup model, the Rete-NT engine yields increasingly better performance compared to the sequential engine. SMARTS has been used with 200K+ rules in rule sets; these rules end up exercising hundreds of fields in your object model, and the Rete-NT advantages make these rule sets perform better than with the pure sequential engine.
Fully Indexed Lookup Model Engine
There are numerous cases where a rule set with a large number of rules is really just selecting from a large number of possibilities. For example, going through a list of medication codes to find those matching parametrized conditions.
In many systems, that is done outside a decision engine, but in some cases, it makes sense to make it part of the decision. For example, when it is managed by the same people and at the same time, is intimately tied to the rest of the decision logic, and needs to go through its lifecycle exactly like the rules.
SMARTS provides a Lookup Model specifically for those cases: you specify the various possibilities as data, and then a query that is used to filter from the data the subset that matches the parameters being used. At a simplified level, this engine works by essentially doing what database engines do: indexing the data as much as possible and converting the query into a search through indexes. Because the search is fully indexed, scalability with the number of possibilities is great, and superior to what you can get with the Sequential or Rete-NT engine.
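The indexing idea can be shown with a small Python sketch. It is a simplified illustration, not the SMARTS lookup engine: it only handles equality conditions using hash indexes, and the `IndexedLookup` class, its `query` method, and the medication data are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


class IndexedLookup:
    """Equality-only lookup: one hash index per field, built once up front."""

    def __init__(self, rows: List[Row], indexed_fields: List[str]) -> None:
        self.rows = rows
        self.indexes: Dict[str, Dict[Any, Set[int]]] = {
            field: defaultdict(set) for field in indexed_fields
        }
        for position, row in enumerate(rows):
            for field in indexed_fields:
                self.indexes[field][row[field]].add(position)

    def query(self, **params: Any) -> List[Row]:
        # Each condition becomes an index probe; no row is scanned.
        # Querying a field that was not indexed raises KeyError.
        candidates = [self.indexes[f].get(v, set()) for f, v in params.items()]
        if not candidates:
            return list(self.rows)
        candidates.sort(key=len)     # intersect starting from the smallest set
        matches = set(candidates[0])
        for other in candidates[1:]:
            matches &= other
        return [self.rows[i] for i in sorted(matches)]


rows = [
    {"code": "MED-001", "form": "tablet", "schedule": "OTC"},
    {"code": "MED-002", "form": "tablet", "schedule": "Rx"},
    {"code": "MED-003", "form": "syrup", "schedule": "OTC"},
    {"code": "MED-004", "form": "tablet", "schedule": "OTC"},
]
lookup = IndexedLookup(rows, ["form", "schedule"])
otc_tablets = [r["code"] for r in lookup.query(form="tablet", schedule="OTC")]
```

The query cost depends on the size of the matching index entries rather than on the total number of possibilities, which is why this approach scales better than evaluating one rule per possibility.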
The SMARTS decision engine can also run concurrent branches of the decision logic, implemented in any combination of the engines above. That allows a single invocation of the decision engine to execute on more than one core of your system. There are heavy computation cases in which such an approach yields performance benefits.
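The shape of such a fan-out can be sketched in Python as follows. The branch functions and `run_branches` are hypothetical and this is not how SMARTS schedules its branches. One caveat: in standard CPython, threads do not run CPU-bound Python code on several cores at once, so the sketch shows the structure (independent branches, then a merge) rather than the speed-up; on Java or .NET, threads do map to multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

Doc = Dict[str, Any]
Branch = Callable[[Doc], Dict[str, Any]]


def fraud_branch(doc: Doc) -> Dict[str, Any]:
    # Hypothetical branch: a toy fraud score.
    return {"fraud_score": 0.9 if doc["amount"] > 10_000 else 0.1}


def eligibility_branch(doc: Doc) -> Dict[str, Any]:
    # Hypothetical branch: a toy eligibility check.
    return {"eligible": doc["age"] >= 18}


def run_branches(doc: Doc, branches: Dict[str, Branch]) -> Dict[str, Any]:
    """Run independent branches concurrently and merge their outputs.

    Branches must treat `doc` as read-only, since they share it.
    """
    merged: Dict[str, Any] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(branches))) as pool:
        futures = {name: pool.submit(fn, doc) for name, fn in branches.items()}
        for future in futures.values():
            merged.update(future.result())   # result() re-raises branch errors
    return merged


result = run_branches(
    {"amount": 500, "age": 30},
    {"fraud": fraud_branch, "eligibility": eligibility_branch},
)
```

Keeping the branches free of shared mutable state is what makes the merge step trivial; branches that write to the same fields would need an explicit conflict policy.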
Of course, performance is also impacted by deployment architecture choices. We’ll explore that topic in the next post in this series.