[rules-users] Equality semantics of logical assertions and stated assertions

Fri Jun 27 12:16:12 EDT 2014

I understand that logical insertions in Drools does not insert two facts that
are equal according to their #equals implementation (and associated
#hashCode logic).
I'm basing this off of my own investigation plus the (well written)
documentation @
http://docs.jboss.org/drools/release/5.2.0.Final/drools-expert-docs/html/ch03.html.

However, normal/stated fact assertions only use identity-based equality
checks - by default.
It looks like this can be changed to value-based equality via
RuleBaseConfiguration.

I have found some related, old Jiras:
https://issues.jboss.org/browse/JBRULES-266
https://issues.jboss.org/browse/JBRULES-233

In both cases, Mark Proctor had a final remark, but I wasn't able to
understand what was really decided and why.

One specific comment @ https://issues.jboss.org/browse/JBRULES-266 is,
"logical assertions are always equality based. Stated facts can now use
identity and equality - using your configuration settings.
 I have added another configuration setting to dictate how logical over-ride
works. Does it discard the original fact, or turn it into two stated fact
handles."

What is the reasoning behind only using the value-based equality for logical
assertions?

Why is it not configurable as stated fact assertions are?
Also, why is the default behavior of stated fact assertion to be
identity-based equality?

- I find the inconsistency between the default for normal assertions and the
(only option) for logical assertion to be a bit confusing/inconsistent.

I'm just looking for an understanding the motivation behind these design
decisions.
I have had a lot of performance issues using, primarily, logical assertions
due to the "no duplicate" checking being done.

Even the stated assertions do still come with a cost of creating the initial
FactHandle.
With deeply nested/complex domain objects, the #equals and #hashCode
implementations may be quite expensive when executed many times throughout
rule execution.
This is especially prominent in accumulators that may later accumulate these
objects into collections that then are also subject to this equals/hashCode
"penalty".
I have a related post a while back @
http://drools.46999.n3.nabble.com/Object-size-impact-on-session-insertion-performance-td4028244.html.

I do not believe it is typical to use complex domain model objects as the
keys to something like a HashMap.
One reason (there are more; especially for mutable objects) would be due to
the performance around having them be hashed and equality checked regularly
when the
map is accessed, modified, etc.

The creation of many FactHandle's has shown up in some of my profiling
performance problems due to this the constructor immediately hashing my
domain model objects.
I think the reason the fact handles use this information is for value-based
equality comparisons - such as the only method used by the Truth Maintenance
System logical assertions.

I would really appreciate more insight into the motivation for not allowing
#equals facts to be logically asserted separately into working memory -
especially since there is a price to pay for the behavior in my case.

--
View this message in context: http://drools.46999.n3.nabble.com/Equality-semantics-of-logical-assertions-and-stated-assertions-tp4030179.html
Sent from the Drools: User forum mailing list archive at Nabble.com.