[rules-dev] BUG: [5.3.0.Final] CollectSetAccumulateFunction should probably use IdentityHashMap internally

Edson Tirelli ed.tirelli at gmail.com
Tue Feb 28 12:43:30 EST 2012


   I think the key here is that the functions are plugable and the contract
between the engine and the function [1] is a single object that is passed
into the function methods:

public void accumulate(Serializable context,
                           Object value);
*
*

public void reverse(Serializable context,
                        Object value) throws Exception;


   So, while the engine knows how to execute and obtain the value of the
expression, e.g.:

a) collectSet( $firstName + $lastName )
b) collectSet( $person )

   It only sends to the function the actual result. In case of (a), an
immutable String object, in case of (b), a mutable Person object. The
engine does use a cached value on the retract, so the same instances of
both String and Person are used on the reverse method call, but the
function itself doesn't know if that is a mutable or immutable object. It
also doesn't know if it is an object resulting from a complex expression
(a) or a fact straight from the working memory (b).

   As you noticed already, the function uses the resulting object as a key
in the map because the goal is indeed to match equals() objects and filter
them to return a set, no duplicates. Of course, because in case of (b) the
hashcode changes, java no longer finds the key to remove it from the map.

   The problem of using a straight identity map is that it will no longer
collect sets of equals() object, but sets of identical objects, what is
obviously not the intent of the function. For instance, in case (a), if two
objects have the same first and last name, there will be 2 instances of the
resulting object created, even if they are equals(). That will break the
function.

   Using a combination of lists and maps to maintain the values might work,
but as you mentioned it will give us O(N) performance. Anyway, I need time
to think about it, and unfortunately extremely busy at the moment with
another task.

   Thanks for looking into it and if you find a solution, please let us
know. :)

    Edson


[1]
https://github.com/droolsjbpm/droolsjbpm-knowledge/blob/master/knowledge-api/src/main/java/org/drools/runtime/rule/AccumulateFunction.java#L54

On Tue, Feb 28, 2012 at 11:55 AM, SirMungus <Patrick_Rusk at ssga.com> wrote:

>
> Edson Tirelli-4 wrote
> >
> > a) Please note that clashes objects was taken into account in the design
> > of the solution, but there is a bug in there.
> >
> I eventually surmised that from seeing the use of MutableInt within it.
>
> Edson Tirelli-4 wrote
> >
> > b) If you look at [1] you will see that the value that is being collected
> > is already cached by the engine by fact handle ID and used on reverse. So
> > that takes care of expressions, for instance, where a new object is
> > created every time. For instance:
> >
> > collectSet( $firstName + $lastName )
> >
> > In that case, every time the expression is evaluated, it will create a
> new
> > immutable String object that we tract by fact ID, cache it, and properly
> > manage with equality semantics.
> >     *    *    *
> > d) The problem is that we are collecting *sets* of objects, and because
> of
> > that, we have to use equality semantics to check for duplicates. I was
> > using a java Map to track that, and that works find for the use case (b)
> > above, but obviously breaks on use case (c). Moving to identity does not
> > seem correct to me as it would work for use case (c), but no longer work
> > for (b).
> >
> > I don't have a solution yet, as I am involved in another large complex
> > task, but I will re-evaluate this as soon as possible. Meanwhile, if
> > someone has a suggestion that supports both cases above, please let us
> > know.
> >
> Your scenario (b) is a very interesting point.  I'm getting a bit beyond my
> depth here, but when the existing node in the RETE tree changes such that
> it
> realizes it will no longer contribute to the rule, does it re-evaluate the
> "$firstname + $lastname" and then try to remove it, or does it use a cached
> FactHandle from its prior evaluation.  If it re-evaluates, I agree that it
> is a problem.  If it uses the FactHandle from before, then the
> IdentityHashMap implementation should be fine, because the equals()
> semantics will be applied in the getResults().  Your text above reads to me
> as though the latter is the case.  So, aren't we fine here?
>
> --
> View this message in context:
> http://drools.46999.n3.nabble.com/BUG-5-3-0-Final-CollectSetAccumulateFunction-should-probably-use-IdentityHashMap-internally-tp3774079p3784627.html
> Sent from the Drools: Developer (committer) mailing list mailing list
> archive at Nabble.com.
> _______________________________________________
> rules-dev mailing list
> rules-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-dev
>



-- 
  Edson Tirelli
  JBoss Drools Core Development
  JBoss by Red Hat @ www.jboss.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/rules-dev/attachments/20120228/df2ba3de/attachment.html 


More information about the rules-dev mailing list