I think the key here is that the functions are plugable and the contract between the engine and the function [1] is a single object that is passed into the function methods:

public void accumulate(Serializable context,
                           Object value);

public void reverse(Serializable context,
                        Object value) throws Exception;

   So, while the engine knows how to execute and obtain the value of the expression, e.g.:

a) collectSet( $firstName + $lastName )
b) collectSet( $person )

   It only sends to the function the actual result. In case of (a), an immutable String object, in case of (b), a mutable Person object. The engine does use a cached value on the retract, so the same instances of both String and Person are used on the reverse method call, but the function itself doesn't know if that is a mutable or immutable object. It also doesn't know if it is an object resulting from a complex expression (a) or a fact straight from the working memory (b).

   As you noticed already, the function uses the resulting object as a key in the map because the goal is indeed to match equals() objects and filter them to return a set, no duplicates. Of course, because in case of (b) the hashcode changes, java no longer finds the key to remove it from the map.

   The problem of using a straight identity map is that it will no longer collect sets of equals() object, but sets of identical objects, what is obviously not the intent of the function. For instance, in case (a), if two objects have the same first and last name, there will be 2 instances of the resulting object created, even if they are equals(). That will break the function.

   Using a combination of lists and maps to maintain the values might work, but as you mentioned it will give us O(N) performance. Anyway, I need time to think about it, and unfortunately extremely busy at the moment with another task.

   Thanks for looking into it and if you find a solution, please let us know. :)

    Edson


[1] https://github.com/droolsjbpm/droolsjbpm-knowledge/blob/master/knowledge-api/src/main/java/org/drools/runtime/rule/AccumulateFunction.java#L54

On Tue, Feb 28, 2012 at 11:55 AM, SirMungus <Patrick_Rusk@ssga.com> wrote:

Edson Tirelli-4 wrote
>
> a) Please note that clashes objects was taken into account in the design
> of the solution, but there is a bug in there.
>
I eventually surmised that from seeing the use of MutableInt within it.

Edson Tirelli-4 wrote
>
> b) If you look at [1] you will see that the value that is being collected
> is already cached by the engine by fact handle ID and used on reverse. So
> that takes care of expressions, for instance, where a new object is
> created every time. For instance:
>
> collectSet( $firstName + $lastName )
>
> In that case, every time the expression is evaluated, it will create a new
> immutable String object that we tract by fact ID, cache it, and properly
> manage with equality semantics.
>     *    *    *
> d) The problem is that we are collecting *sets* of objects, and because of
> that, we have to use equality semantics to check for duplicates. I was
> using a java Map to track that, and that works find for the use case (b)
> above, but obviously breaks on use case (c). Moving to identity does not
> seem correct to me as it would work for use case (c), but no longer work
> for (b).
>
> I don't have a solution yet, as I am involved in another large complex
> task, but I will re-evaluate this as soon as possible. Meanwhile, if
> someone has a suggestion that supports both cases above, please let us
> know.
>
Your scenario (b) is a very interesting point.  I'm getting a bit beyond my
depth here, but when the existing node in the RETE tree changes such that it
realizes it will no longer contribute to the rule, does it re-evaluate the
"$firstname + $lastname" and then try to remove it, or does it use a cached
FactHandle from its prior evaluation.  If it re-evaluates, I agree that it
is a problem.  If it uses the FactHandle from before, then the
IdentityHashMap implementation should be fine, because the equals()
semantics will be applied in the getResults().  Your text above reads to me
as though the latter is the case.  So, aren't we fine here?

--
View this message in context: http://drools.46999.n3.nabble.com/BUG-5-3-0-Final-CollectSetAccumulateFunction-should-probably-use-IdentityHashMap-internally-tp3774079p3784627.html
Sent from the Drools: Developer (committer) mailing list mailing list archive at Nabble.com.
_______________________________________________
rules-dev mailing list
rules-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-dev



--
  Edson Tirelli
  JBoss Drools Core Development
  JBoss by Red Hat @ www.jboss.com