Intellifest 2012: Charles Forgy: Where does the time go?

on October 25, 2012
SMARTS™ Data-Powered Decision Management Platform

Charles Forgy delivered a very interesting presentation where he analyzed in detail where the actual performance challenges are in RETE implementations. Charles is of course very well known as the inventor of Rete, the algorithm implemented by the majority of the forward chaining rules engines in the planet. For full disclosure, Charles is also part of Sparkling Logic.

Rete is a fairly intuitive algorithm, and Carole-Ann provided an excellent explanation on how it actually works in this blog here and here. Charles walked us with the relevant mechanisms through a carefully selected example.

The major operations the algorithm performs are as follows in a cyclical way:

  • alpha test
  • beta activation
  • beta test
  • conflict set activation
  • rules firing

The key feature of the way RETE works is that at every cycle most of the objects do not change, and only a few do. You can get significant gains by optimizing the way you handle the object changes between cycle N and cycle N+1.

Charles spent some time discussing WaltzDB. Using OpsJ disabling the optimizations for the engine, Charles introspected where the engine spent its time. Not surprisingly, beta testing and conflict set activations represented together over 90% of the number of operations.

However, 48% of the operations in conflict set activation is unusual. Analyzing the rule base, Charles identified the two rules responsible for vast majority of those conflict set activation operations: essentially those who deal with large junctions and in fact waste most of the work they do. Charles suggests a change to WaltzDB to fix that – a simple additional condition which will reduce the conflict set activation numbers drastically.

To benchmark the engine, Charles analyzed a number of approaches. In particular, he studied statistical sampling based approaches, for which the implementations are sometimes fraught with issues. For example, it is known that JVMTI based profilers produce misleading results.

Safer profilers include:

  • Oracle JRockit Mission Control
  • AMD CodeAnalyst (hardware level, AMD)
  • Intel VTune (hardware level, Intel – but with no Java support in the latest version)

Charles used Oracle JRockit and AMD CodeAnalyst.

For WaltzDB2, the results:

  • alpha 1%
  • beta activaton 39%
  • beta test 57%
  • conflict set activation< 1%
  • rule firing 1%

WaltzDB2 makes a lot of beta testing, Charles’ opinion is that a more realistic benchmark would have more beta activation and less beta testing.

The hardware level analysis yielded additional results. For example, the beta loop code accounts for over 20% of the time in the matcher, 0.4 instructions per clock cycle. That is a low number in modern processors – usually you would see 2 instructions per clock cycle in performant code.

The code there looks like you would expect

while (ptr != null) {

if (ptr.hash == hashval) {


ptr = ptr.next;


Most of the latency is actually spent accessing the hash value for the pointer (ptr.hash). The culprit is the fact that at that point we do not hit the cache. The price of missing the cache is huge. Typically, for a level 2 cache miss costs ~300 cycles. When the cache is not missed, multiple instructions per cycle are possible. For example, accessing ptr.next in the code above tends to hit the cache and be essentially free.

Working in a byte-code language makes controlling this situation a little bit more complex, let alone doing it at the level of rules.

Learn more about Decision Management and Sparkling Logic’s SMARTS™ Data-Powered Decision Manager

Search Posts by Category


Sparkling Logic Inc. is a Silicon Valley-based company dedicated to helping organizations automate and optimize key decisions in daily business operations and customer interactions in a low-code, no-code environment. Our core product, SMARTS™ Data-Powered Decision Manager, is an all-in-one decision management platform designed for business analysts to quickly automate and continuously optimize complex operational decisions. Learn more by requesting a live demo or free trial today.