A Rule Engine Doesn’t Work for All Rules
Sometimes it may be better to handle the rules yourself
I want to share the difficulties I ran into when using a rule engine. The specific tool I used was Drools.
I still think a rule engine has its place in the decision-making process, but in practice it is only useful if the rules have no dependencies on each other.
As with any tool, whether it is the right one depends on your application.
What are the pain points of using Drools?
0. Business people specify the what; you still have to deal with the how (in code)
For example, if the rule is:
If given a building, which contains capital expense and operating expense, then calculate the building net cash flow
The “if” part is straightforward, but you still have to implement the “then” part in Java.
The DRL file looks something like:
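import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

rule "Building Net Cash Flow"
when
    $buildingResult : BuildingResult($capEx : capEx != null, $opEx : opEx != null)
then
    // Start from a list of zeros, then set each index to capEx + opEx
    List<Double> result = new ArrayList<>(Collections.nCopies($capEx.size(), 0d));
    IntStream.range(0, $capEx.size())
        .forEach(i -> result.set(i, $capEx.get(i) + $opEx.get(i)));
    $buildingResult.setNetCashFlow(result);
end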
The logic is simple: capEx and opEx are both arrays of monetary values of equal size, and the net cash flow is just their element-wise sum.
The point is, this is still a valid business rule. But do you expect a business person to write the implementation inside the “then” part?
This is essentially a Java file masquerading as a rule file.
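For comparison, here is a minimal sketch of the same “then” logic as an ordinary Java method. The class name NetCashFlow and the size check are my own assumptions for illustration; they are not taken from the project.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, named for illustration only. */
final class NetCashFlow {
    private NetCashFlow() {}

    /** Returns the element-wise sum of two equally sized lists. */
    static List<Double> of(List<Double> capEx, List<Double> opEx) {
        if (capEx.size() != opEx.size()) {
            // Assumption: mismatched sizes are a caller error
            throw new IllegalArgumentException("capEx and opEx must have the same size");
        }
        List<Double> result = new ArrayList<>(capEx.size());
        for (int i = 0; i < capEx.size(); i++) {
            result.add(capEx.get(i) + opEx.get(i));
        }
        return result;
    }
}
```

Unlike the rule, this method returns its result, so the caller decides where to store it.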
1. Overuse of salience
The project I worked on has a set of calculations that each depend on the result of a previous calculation. In some files, we got to the point where every single rule had its own salience.
For example, the rule Net cumulative cash flow depends on the result of the rule Building Net Cash Flow, and Building Net Cash Flow in turn depends on another rule, Building CapEx Npv.
The DRL file looks something like:
rule "Building CapEx Npv" salience -1
when
    // Given timeline, and all the capEx from different categories
then
    // sum into a single array of capEx for that building
end

rule "Building Net Cash Flow" salience -2
when
    // Given Building CapEx and OpEx
then
    // sum into a single array of net cash flow
end

rule "Net cumulative cash flow" salience -3
when
    // given building net cash flow
then
    // for each index, sum all the numbers before the index
end
... for rest of the rules
One of the benefits of using a rule engine is that it uses a pattern-matching algorithm such as Rete to decide efficiently which rules to fire.
The concern I have is: if we are specifying the ordering like this, it is the same as not using a rule engine at all. You can execute the same code in Java, one line after another.
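To make that concrete, here is a rough sketch of the three chained rules as plain Java calls. All class and method names are hypothetical, and I am assuming the cumulative value at each index includes that index; the order of execution is simply the order of the calls.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain-Java version of the three chained rules. */
final class BuildingCalculations {
    private BuildingCalculations() {}

    /** "Building CapEx Npv": sums the per-category capEx series into one series. */
    static List<Double> totalCapEx(List<List<Double>> capExByCategory, int periods) {
        List<Double> total = new ArrayList<>();
        for (int i = 0; i < periods; i++) {
            double sum = 0d;
            for (List<Double> category : capExByCategory) {
                sum += category.get(i);
            }
            total.add(sum);
        }
        return total;
    }

    /** "Building Net Cash Flow": element-wise sum of capEx and opEx. */
    static List<Double> netCashFlow(List<Double> capEx, List<Double> opEx) {
        List<Double> net = new ArrayList<>();
        for (int i = 0; i < capEx.size(); i++) {
            net.add(capEx.get(i) + opEx.get(i));
        }
        return net;
    }

    /** "Net cumulative cash flow": running total up to and including each index (assumed). */
    static List<Double> cumulative(List<Double> net) {
        List<Double> running = new ArrayList<>();
        double sum = 0d;
        for (double value : net) {
            sum += value;
            running.add(sum);
        }
        return running;
    }

    /** Each step consumes the previous step's return value; no salience is needed. */
    static List<Double> netCumulativeCashFlow(List<List<Double>> capExByCategory,
                                              List<Double> opEx) {
        List<Double> capEx = totalCapEx(capExByCategory, opEx.size());
        List<Double> net = netCashFlow(capEx, opEx);
        return cumulative(net);
    }
}
```

The dependency chain that needed salience -1, -2 and -3 in the rule file becomes three lines in netCumulativeCashFlow.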
2. You will need a separate object to store the results
My rules are by nature almost like functions: given the inputs, calculate some outputs.
But rules are not functions; you cannot return a result from a rule.
So what we did was create an object and pass it into each rule to store the result.
rule "Building CapEx Npv" salience -1
when
    $buildingResult : BuildingResult()
then
    // sum into a single array of capEx for that building
    $buildingResult.setCapEx(result);
end

rule "Building Net Cash Flow" salience -2
when
    // Given Building CapEx and OpEx
    $buildingResult : BuildingResult($capEx : capEx)
then
    // sum into a single array of net cash flow
    $buildingResult.setNetCashFlow(result);
end
It is a minor pain to have to pass the same object into every rule.
There are also global variables, but Drools does not track changes to them, so in practice they have to be treated as constants once they are set.
3. You are responsible for updating the facts in the working memory
Drools evaluates a condition against the state the object had when it was put into the working memory. Therefore, if you have a rule that depends on the outcome of another rule, you must notify Drools that the fact has changed.
For example, “Net cumulative cash flow” depends on “Building Net Cash Flow” having been calculated.
rule "Building Net Cash Flow"
when
    // Given Building CapEx and OpEx
    $buildingResult : BuildingResult($capEx : capEx != null, $opEx : opEx != null)
then
    // sum into a single array of net cash flow
    // result = some calculation
    modify($buildingResult) {
        setNetCashFlow(result)
    }
end

rule "Net cumulative cash flow"
when
    // given building net cash flow from above
    BuildingResult($netCashFlow : netCashFlow != null)
then
    // for each index, sum all the numbers before the index
end
Without the modify() inside “Building Net Cash Flow”, Drools will always evaluate the condition in “Net cumulative cash flow” as false, because the netCashFlow field was indeed null when the BuildingResult object was put into the working memory.
This is an extra burden that you wouldn’t have in plain Java.
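The behaviour can be pictured with a toy model in plain Java. This is only a sketch of the idea, not how Drools is implemented and not its API: ToyWorkingMemory, ToyBuildingResult and their methods are all made up for illustration. The condition is checked when the fact is inserted and is only checked again when the caller explicitly reports a change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical fact with a field that is null until it is calculated. */
final class ToyBuildingResult {
    List<Double> netCashFlow;
}

/** Toy model only: one rule condition, evaluated on insert and on update. */
final class ToyWorkingMemory<T> {
    private final Predicate<T> condition;
    private final List<T> matches = new ArrayList<>();

    ToyWorkingMemory(Predicate<T> condition) {
        this.condition = condition;
    }

    /** Evaluates the condition once, using the fact's state at insertion time. */
    void insert(T fact) {
        if (condition.test(fact)) {
            matches.add(fact);
        }
    }

    /** Plays the role of update()/modify(): re-evaluates the condition for this fact. */
    void update(T fact) {
        matches.remove(fact);
        insert(fact);
    }

    /** True if the rule would currently be considered satisfied for this fact. */
    boolean matches(T fact) {
        return matches.contains(fact);
    }
}
```

If you insert a ToyBuildingResult whose netCashFlow is null and then set the field directly, matches(...) stays false until update(...) is called. In plain Java there is no such bookkeeping: the next statement simply reads the field.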
4. As soon as you introduce update() or modify(), you are at risk of running into infinite loops
The purpose of update() and modify() is to notify the rule engine: “Hey, this fact has been updated! Check all the other rules that use it in their conditions, and if the change now satisfies a condition, fire those rules.”
You can cause the rule engine to reevaluate the same rule over and over again.
rule "Raise building insurance when floorArea > 10000"
when
    $b : Building(floorArea > 10000, $insurance : insurance < 10000)
then
    // the modify() re-activates this same rule, so it will keep firing itself
    modify($b) {
        setInsurance($insurance + 500)
    }
end

rule "Rule that also depends on insurance"
when
    $b : Building(insurance < 1000)
then
    // do whatever
end
There are a couple of things you can do to prevent infinite loops:
- no-loop: avoid the re-activation of a rule caused by the RHS of that SAME rule.
- lock-on-active: avoid the re-activation of a rule NO MATTER what the cause is.
- add some sort of “control facts” to your rule conditions
- try to combine or rewrite your rules to avoid the need for update()
This becomes a huge problem when you have a large set of rules that create loops among themselves.
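As a rough illustration of the first option, the toy simulation below mimics the insurance rule re-activating itself, with and without a no-loop style guard. It is plain Java that imitates the effect, not Drools’ actual agenda; ToyRefireDemo, run and the maxFirings safety cap are all hypothetical.

```java
/** Toy simulation only; it imitates re-activation, not Drools' real agenda. */
final class ToyRefireDemo {
    private ToyRefireDemo() {}

    /**
     * Fires the "raise insurance" rule until it is no longer activated.
     *
     * @param noLoop     when true, the rule's own change does not re-activate it
     * @param maxFirings safety cap so a condition that never turns false cannot hang
     * @return how many times the rule fired
     */
    static int run(double floorArea, double insurance, boolean noLoop, int maxFirings) {
        int firings = 0;
        boolean activated = floorArea > 10000 && insurance < 10000;
        while (activated && firings < maxFirings) {
            insurance += 500; // the "then" part, wrapped in modify()
            firings++;
            // modify() makes the engine re-check the condition, unless no-loop
            // suppresses re-activation caused by the rule itself
            activated = !noLoop && floorArea > 10000 && insurance < 10000;
        }
        return firings;
    }
}
```

In this toy, the rule stops re-firing once insurance reaches 10000, because the change eventually makes its own condition false. A rule whose update never falsifies its condition would only stop at the maxFirings cap, which is the genuinely infinite case.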
5. With update(), there is no way for business people to understand what is going on anymore
The introduction of update() and all these Drools attributes helps developers solve the problems raised above. But it contradicts one of the biggest reasons we use a rule engine:
“Rules are very easy to read and code by any non-technical person like business analyst, client team”
I strongly disagree that non-technical people should be expected to deal with infinite loops, object state, and so on. You, the developer, will be doing the explaining and debugging if they make changes that “don’t seem to work”.
6. Difficult to debug why a rule did not fire or fired again
Unless you print out the state of your objects at the end of each rule, there is really no good way to tell which condition failed, or which rule in a chain didn’t run and so kept the rest of the chain from firing.
In Java, you at least have the debugger and breakpoints.
7. Hard to detect problems because the rule firing order can be different each time
Unless you put salience on every rule, Drools is free to fire the rules in whatever order it likes, as long as their conditions are satisfied.
Your update() logic might work only when Drools fires ruleA before ruleB, and not when it fires ruleB before ruleA.
It is possible that Drools fires ruleA before ruleB during your smoke test, but decides to fire ruleB before ruleA when you deploy to production.
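A small plain-Java toy shows how the outcome can depend on order. It assumes each rule is considered exactly once and nothing is re-evaluated afterwards, roughly the situation when a change is not propagated. The two rules, the threshold of 1000 and all names are made up for illustration.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical fact used only for this illustration. */
final class ToyBuilding {
    final double floorArea;
    double insurance;
    boolean highInsurance;

    ToyBuilding(double floorArea, double insurance) {
        this.floorArea = floorArea;
        this.insurance = insurance;
    }
}

final class OrderDependenceDemo {
    private OrderDependenceDemo() {}

    /** ruleA: raise the insurance of large buildings by 500. */
    static final Consumer<ToyBuilding> RULE_A = b -> {
        if (b.floorArea > 10000) {
            b.insurance += 500;
        }
    };

    /** ruleB: flag buildings whose insurance exceeds 1000. */
    static final Consumer<ToyBuilding> RULE_B = b -> {
        if (b.insurance > 1000) {
            b.highInsurance = true;
        }
    };

    /** Applies each rule once, in the given order, to a fresh building. */
    static ToyBuilding fireInOrder(List<Consumer<ToyBuilding>> order) {
        ToyBuilding building = new ToyBuilding(20000, 800);
        for (Consumer<ToyBuilding> rule : order) {
            rule.accept(building);
        }
        return building;
    }
}
```

With ruleA first, the insurance is raised to 1300 before ruleB looks at it, so the building is flagged. With ruleB first, the building ends up with the same insurance but is never flagged.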
8. It is hard to unit test, since everything gets fired anyway
Unless you have “control facts” in your rule conditions to prevent all rules in the same file from running, you will need to prepare all the mock objects needed to support every rule. (For example, to avoid the null-pointer exceptions you get when you don’t provide every object the rules use.)
To make unit testing easy, you probably want to test a small set of rules at a time, not all the rules in the same Drools file together.
It is extremely difficult to prepare all the mock objects needed to test all the rules in the same Drools file together.
9. Drools documentation is not helpful.
It took me quite some effort to find out what the difference is between update() and modify(). Well, this is subjective, so let’s leave it at that.
10. It takes some effort to log which rules were run
This one is important for debugging. There is actually a way to set up a listener to monitor which rules are running.
This is great, but I would prefer something like config.setDebugMode(true);
Who should use a rule engine?
If you expect frequent business logic changes
Business people can look at the text description and quickly point out nuances in rules, exceptions, and other special cases.
If your rules are made up of a large set of if-else statements
A rule engine is good at processing large rule bases. Algorithms like Rete are efficient at quickly matching against large rule sets.
What are some things you can do if you decide to use Drools?
- Control what gets passed into the working memory
- Break down your rules and group them into agenda groups
Reference: