[rules-dev] BUG: [5.3.0.Final] CollectSetAccumulateFunction should probably use IdentityHashMap internally

Wolfgang Laun wolfgang.laun at gmail.com
Tue Feb 28 05:28:36 EST 2012


How do you find the List for removal by identity when the key has
already been changed?
-W

On 28/02/2012, Mark Proctor <mproctor at codehaus.org> wrote:
> Actually this can be done without two collections.
>
> Just use a HashMap and mainta a List of the "clashed" identities. To
> operate on the actualy identity, such as remove. Lookup the List, via
> normal map.get( key ) then iterate the list to remove the identity. If
> the list is empty, remove the key. reverse removes the entity, makes it
> change, and then adds it back in. Adding is the process of looking for
> an existing key and adding to it's clash list, or creating a new key
> with an entry of one.
>
> This would provide identity safe update/removal/reverse will providing a
> set view of the world. I suspect overhead would be < 15% if done right.
>
> It should be done as a custom accumulate function for now, and we should
> profile the performance characteristics of the two - as there are some
> optimizations that can be made, such as only lazy storying the value as
> a list only when there is more than one entry. If you are willing to
> code this up, we can give you pointers on irc:
> http://www.jboss.org/drools/irc
> http://www.athico.com/Getting_Involved/gettinginvolved.html
>
> Mark
> On 28/02/2012 07:25, Mark Proctor wrote:
>>  The getResults()
>> would probably have to change from...
>>
>>     public Object getResult(Serializable context) throws Exception {
>>         CollectListData data = (CollectListData) context;
>>         return Collections.unmodifiableSet( data.map.keySet() );
>>     }
>>
>> ...to...
>>
>>     public Object getResult(Serializable context) throws Exception {
>>         CollectListData data = (CollectListData) context;
>>         return Collections.unmodifiableSet( new
>> HashSet(data.map.keySet())
>> );
>>     }
>>
>> The proposed getResult() results in a full HashSet copy for each
>> change of Rete, that performance hit would be unbearable for most people.
>>
>> I do agree that there is some merit to what you are asking, but we
>> can't give that perf hit to everyone. It would be better to maintain
>> two collections, one identity and one set based. And some how only
>> change in set what was relevant to the change in identity, you might
>> need to build a custom hash table for this - see AbstractHashTable in
>> drools which can easily be extended for custom keys and custom
>> add/remove methods.
>>
>> Mark
>>
>>
>> On 27/02/2012 16:53, SirMungus wrote:
>>> Mark Proctor wrote
>>>> accumulate supports custom functions, just add your own version of
>>>> collectSet that uses IdentityMap.
>>>>
>>> I could do that, but please let me take one more stab, just in case I
>>> haven't been clear enough.
>>>
>>> 1) I am not suggesting that collectSet() return an identity-based
>>> set. I am
>>> content to have it return a normal set, though I had early confusion
>>> on that
>>> point.
>>>
>>> 2) The discussion above about stated vs. justified facts turns out to
>>> be a
>>> distraction to the main point of the bug that I believe to be in
>>> CollectSetAccumulateFunction.  My failure to understand the
>>> distinction is
>>> not germane to the bug.
>>>
>>> 3) I am not disagreeing with the value behind implementing equals()
>>> to only
>>> take into account invariant fields, though I am stating that goes
>>> beyond the
>>> explicity documentation of equals()/hashCode() and is explicitly *not*
>>> followed by many classes in the JRE (most notably, collections).
>>>
>>> There is nothing about the purpose of collectSet() that logically
>>> suggests
>>> that it should only work with objects that have invariant
>>> equals()/hashCode() functions. If five objects match a rule, they
>>> should be
>>> able to be collected into a set (which may end up with fewer than five
>>> objects). If things change such that only three facts match the rule,
>>> they
>>> should still be able to be collected into a set.  There's no compelling
>>> reason why that should only apply to objects with invariant equals().
>>> And it
>>> seems to me that you rarely want to leave an NPE in code, rather than
>>> catching it and throwing a more intelligible exception (e.g., "Object
>>> to be
>>> removed from collectSet() was not found").
>>>
>>> If the internal maintenance of state in CollectSetAccumulateFunction
>>> used
>>> identity semantics and applied the equality semantics only when you
>>> asked it
>>> for its results, it would maintain the same external behavior
>>> (returning a
>>> "normal" equality-based set) without failing to work for objects with
>>> variant equals.
>>>
>>> If you Google around for the proper implementation of equals(), you will
>>> find many serious sites that discuss it at length, yet provide examples
>>> like:
>>>
>>> public boolean equals(Object aThat){
>>>    if ( this == aThat ) return true;
>>>    if ( !(aThat instanceof Car) ) return false;
>>>    Car that = (Car)aThat;
>>>    return
>>>      EqualsUtil.areEqual(this.fName, that.fName)&&
>>>      EqualsUtil.areEqual(this.fNumDoors, that.fNumDoors)&&
>>>      EqualsUtil.areEqual(this.fGasMileage, that.fGasMileage)&&
>>>      EqualsUtil.areEqual(this.fColor, that.fColor)&&
>>>      Arrays.equals(this.fMaintenanceChecks, that.fMaintenanceChecks);
>>> //array!
>>> }
>>>
>>> Ironically, that is explicitly against Wolfgang's suggestion above!
>>>
>>>
>>> --
>>> View this message in context:
>>> http://drools.46999.n3.nabble.com/BUG-5-3-0-Final-CollectSetAccumulateFunction-should-probably-use-IdentityHashMap-internally-tp3774079p3781313.html
>>> Sent from the Drools: Developer (committer) mailing list mailing list
>>> archive at Nabble.com.
>>> _______________________________________________
>>> rules-dev mailing list
>>> rules-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/rules-dev
>>
>
> _______________________________________________
> rules-dev mailing list
> rules-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-dev
>


More information about the rules-dev mailing list