Rules Fest Live: David Holz / When Truth goes Bad
David shared his extensive experience with rules-based systems: building an engine from scratch, building a whole framework around it, and then using it.
His example – 5 boxes, two red, two green and one blue – highlights the problem in interpreting a very simple statement.
Take: (box (color ?c))
The “truth” perspective: What (distinct) colors are used by the boxes?
The “implementation” perspective: What color is each box?
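The two readings can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the boxes are modeled as plain dictionaries, and the names `truth_view` and `implementation_view` are hypothetical, not David's code.

```python
# Five boxes from the example: two red, two green, one blue.
# The dictionary layout is an assumption made for this sketch.
boxes = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "red"},
    {"id": 3, "color": "green"},
    {"id": 4, "color": "green"},
    {"id": 5, "color": "blue"},
]


def truth_view(facts):
    """'Truth' reading: which distinct colors are used by the boxes?"""
    return {box["color"] for box in facts}


def implementation_view(facts):
    """'Implementation' reading: one match per box, duplicates included."""
    return [box["color"] for box in facts]


distinct_colors = truth_view(boxes)          # three colors
per_box_colors = implementation_view(boxes)  # five matches, one per box
```

The same pattern yields three answers under one reading and five under the other, which is exactly the gap a rule author has to be aware of.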
The discrepancy, while cartoonish in this case, is real. This is why I believe we cannot really use low level languages without a lot of training, and why an untrained programmer – let alone a “business” type – cannot really leverage them. Even worse, they may think they can, with potentially huge consequences.
David expresses this as: (paraphrasing) “we need to be explicit in our rules engines, we need to remove the ambiguities as early as possible in the expression chain”.
To illustrate the power of syntax and extended engine support for it (and I apologize if I misunderstood):
David’s example in terms of low level syntax
when (customer (state ?s)) then (logical-assert (customer-in-state ?s))
when (customer-in-state ?s) then (print ?s)
Supporting David’s point, the same in Blaze Advisor’s SRL (the pattern declarations are re-usable through rules and can have their own additional filtering clauses):
customer is any customer.
s is any state.
if (at least 1 customer satisfies its state is equal to s) then print s.
David went into “fact collisions”.
Take the following:
customer N is preferred if their volume is high
customer N is preferred if they’ve ordered recently
customer N is preferred
This last fact is unconditional. When it is retracted, what do we really mean: to retract just that assertion, or to blow away the truth?
David makes the point that this is a problem at the low level, and that business rules systems do not surface it. I agree – this is precisely the reason why the engines supporting BRMS have invested in syntax and engine extensions and constraints.
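To make the two possible meanings of "retract" concrete, here is a minimal Python sketch. It assumes a toy fact store that records which justifications support each fact; the class `FactStore`, its method names, and the justification labels are all hypothetical and do not correspond to any particular engine's API.

```python
from collections import defaultdict


class FactStore:
    """Toy store: a fact holds while at least one justification supports it."""

    def __init__(self):
        # Maps each fact to the set of justifications currently supporting it.
        self._support = defaultdict(set)

    def assert_fact(self, fact, justification):
        self._support[fact].add(justification)

    def retract(self, fact, justification):
        """Withdraw one justification; the fact survives if others remain."""
        self._support[fact].discard(justification)
        if not self._support[fact]:
            del self._support[fact]

    def blow_away(self, fact):
        """Remove the fact no matter what still supports it."""
        self._support.pop(fact, None)

    def holds(self, fact):
        return fact in self._support


store = FactStore()
preferred = ("preferred", "N")
store.assert_fact(preferred, "volume-is-high")
store.assert_fact(preferred, "ordered-recently")
store.assert_fact(preferred, "unconditional")

# Meaning 1: only the unconditional assertion is withdrawn.
store.retract(preferred, "unconditional")
holds_after_retract = store.holds(preferred)

# Meaning 2: the truth itself is removed, conditional support included.
store.blow_away(preferred)
holds_after_blow_away = store.holds(preferred)
```

Under the first meaning customer N stays preferred because the two conditional rules still support the fact; under the second the fact is gone even though those rules would still fire.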
David makes the following distinction between types of rules:
– Logical (stateless) rules
– Application (stateful) rules
With Application rules, keeping track of the state of the world (or the truth) is a core issue.
I am not a fan of this distinction, and I do not think it brings a lot to David’s very good points. First, the issues with dealing with truth also exist in typical stateless invocations, simply because most decisions are multi-step – not just one rule set/group activation, but many of them at different points in time under different conditions. Furthermore, the distinction between stateless invocations, short duration stateful invocations and long duration stateful invocations is too coarse: it’s a fairly gray area when you look at what the industry does with these engines.
This is why the engines in BRMS implementations are not any simpler than those used at the lower level: they too have to cope with these issues and need to minimize them. They are actually more complex in their implementation – most of them started with well-known implementations and extended them.
But except for this difference of opinion on the classification, I agree with David’s points.
Another thing I would like to throw into the discussion is that we do need to get out of the closed world assumption whenever we end up dealing with anything connected to the rest of the world. We simply cannot assume that all we need to know is known, or even knowable. That has a huge implication on this whole “truth” issue.
I am not going into a theoretical academic discussion here – I am pointing at the very concrete requirement that we need to engineer into rules engine technology some level of support for open world realities – and I really mean “some”, as in at least a practical, workable compromise. Some BRMS rules engines do support things like “unknown” as a potential value for anything. Others combine that with support for things like “unavailable”, which makes it possible to distinguish between what can become known through opportunistic backward chaining to get to the value, and what has no chance of being known in the corresponding context. All the while, rules can actually reason against “unknown”, “unavailable”, “known”, “not known”, “available” values, etc.
This is not even specific to rules engines. Witness what happens with DBMS and the semantic (ab)use of NULL.
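As a rough illustration of what reasoning against such values could look like, here is a small Python sketch. Everything in it is an assumption made for the example: the `Status` enum, the `Value` wrapper, the `resolve` helper standing in for backward chaining, and the credit score rule with its threshold of 700. It does not describe how any specific BRMS implements this.

```python
from enum import Enum


class Status(Enum):
    KNOWN = "known"
    UNKNOWN = "unknown"          # not known yet, but might be obtained
    UNAVAILABLE = "unavailable"  # cannot be obtained in this context


class Value:
    """A value paired with an explicit knowledge status."""

    def __init__(self, status, payload=None):
        self.status = status
        self.payload = payload


def resolve(value, lookup=None):
    """Try to turn UNKNOWN into KNOWN via a lookup (stand-in for backward chaining)."""
    if value.status is Status.UNKNOWN and lookup is not None:
        found = lookup()
        if found is None:
            # The lookup was attempted and produced nothing.
            return Value(Status.UNAVAILABLE)
        return Value(Status.KNOWN, found)
    return value


def credit_decision(score):
    """Hypothetical rule that handles each status explicitly."""
    if score.status is Status.KNOWN:
        return "approve" if score.payload >= 700 else "decline"
    if score.status is Status.UNKNOWN:
        return "defer"
    return "manual-review"


resolved = resolve(Value(Status.UNKNOWN), lookup=lambda: 720)
decision = credit_decision(resolved)
```

The point of the sketch is that "unknown" and "unavailable" lead to different outcomes, instead of both being silently treated as false.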
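The NULL problem is easy to reproduce with Python's built-in `sqlite3` module. The `customer` table and its rows below are made up for the illustration.

```python
import sqlite3

# In-memory database with a hypothetical customer table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ann", "CA"), ("Bob", "NY"), ("Cy", None)],  # Cy's state is NULL
)

# NULL compared with NULL is NULL, not true; sqlite3 returns it as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

in_ca = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE state = 'CA'"
).fetchone()[0]
not_in_ca = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE state <> 'CA'"
).fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
conn.close()
```

The customer with a NULL state is counted neither as in CA nor as not in CA, so the two counts do not add up to the total. Whether that NULL means "unknown", "not applicable" or "not yet entered" is left to whoever reads the data.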