Hi All,
I've posted on this topic twice and logged a JIRA ticket
(https://jira.jboss.org/browse/JBRULES-2651) as well. I've received
no responses and the bug hasn't been updated since I logged it.
I just replied to the JIRA. I'm committing the fix now.
Mark
This is a serious issue, as it causes my production system to freeze up
and it has to be restarted. It's consistently reproducible (it usually
takes a few days).
Can someone please take a quick look at the code? Does the call to
SingleThreadedObjectStore.addHandle in NamedEntryPoint.insert need to
be preceded by acquiring the lock?
Thanks again for your help.
Norman
------------------------------------------------------------------------
*From:* Norman C <rent_my_time(a)yahoo.com>
*To:* rules-users(a)lists.jboss.org
*Sent:* Wed, August 4, 2010 11:23:55 PM
*Subject:* Re: Possible concurrency issue in Drools
I've run into this issue a few more times. Should I log a JIRA ticket
for this? Any advice would be appreciated.
Thanks,
Norman
------------------------------------------------------------------------
*From:* Norman C <rent_my_time(a)yahoo.com>
*To:* rules-users(a)lists.jboss.org
*Sent:* Sat, July 31, 2010 9:56:26 PM
*Subject:* Re: Possible concurrency issue in Drools
All,
Just wanted to mention, I'm using version 5.0.1 of Drools.
Thanks,
Norman
------------------------------------------------------------------------
*From:* Norman C <rent_my_time(a)yahoo.com>
*To:* rules-users(a)lists.jboss.org
*Sent:* Sat, July 31, 2010 9:50:19 PM
*Subject:* Possible concurrency issue in Drools
Hi All,
I recently ran into a problem which I believe might point to a
concurrency issue. My server stopped processing new requests, so I
did a thread dump. In examining the dump, I found that all of the
processing threads, save two, were blocked trying to acquire
the lock in NamedEntryPoint.insert. The other two threads
appeared to be looping infinitely in the NamedEntryPoint.insert
method. Here are snippets of the stack traces:
"ActiveMQ Session Task" prio=10 tid=0x00002aab0003b000 nid=0x7b98 runnable [0x000000004c086000..0x000000004c087c90]
   java.lang.Thread.State: RUNNABLE
    at org.drools.util.ObjectHashMap.remove(ObjectHashMap.java:121)
    at org.drools.common.SingleThreadedObjectStore.removeHandle(SingleThreadedObjectStore.java:150)
    at org.drools.common.NamedEntryPoint.retract(NamedEntryPoint.java:296)
    at org.drools.common.NamedEntryPoint.retract(NamedEntryPoint.java:245)
    at org.drools.reteoo.ReteooWorkingMemory$WorkingMemoryReteExpireAction.execute(ReteooWorkingMemory.java:350)
    at org.drools.common.AbstractWorkingMemory.executeQueuedActions(AbstractWorkingMemory.java:1488)
    at org.drools.common.NamedEntryPoint.insert(NamedEntryPoint.java:158)
    at org.drools.common.NamedEntryPoint.insert(NamedEntryPoint.java:122)
    at org.drools.common.NamedEntryPoint.insert(NamedEntryPoint.java:80)
    at org.drools.common.NamedEntryPoint.insert(NamedEntryPoint.java:28)
    at

"ActiveMQ Session Task" prio=10 tid=0x000000005a35cc00 nid=0xdf6 runnable [0x000000004a268000..0x000000004a269a90]
   java.lang.Thread.State: RUNNABLE
    at org.drools.util.AbstractHashTable.resize(AbstractHashTable.java:115)
    at org.drools.util.ObjectHashMap.put(ObjectHashMap.java:78)
    at org.drools.common.SingleThreadedObjectStore.addHandle(SingleThreadedObjectStore.java:136)
    at org.drools.common.NamedEntryPoint.insert(NamedEntryPoint.java:113)
    at org.drools.common.NamedEntryPoint.insert(NamedEntryPoint.java:80)
    at org.drools.common.NamedEntryPoint.insert(NamedEntryPoint.java:28)
    at

So it seems that while the first thread is holding the lock and
attempting to remove an object handle from the object store in
NamedEntryPoint, the other thread is trying to resize that same object
store in response to an addHandle call that puts it over the
threshold. I haven't worked out exactly how these concurrent accesses
to the same object store by two different threads cause an infinite
loop in both threads, but it seems like the call to
SingleThreadedObjectStore.addHandle should be preceded by acquiring
the lock.
Is this correct? I can imagine that resizing a large hash map could
potentially take a long time and thus synchronizing this call could
impact performance, but somehow, the action of resizing the table must
be protected in some way from adversely impacting other operations on
the table.
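
As a minimal sketch of the kind of protection being asked about (this is
not Drools code: GuardedObjectStore, its methods, and the use of
java.util.HashMap in place of Drools's ObjectHashMap are all assumptions
made for illustration), every mutation of the underlying map, including a
put that may trigger a resize, takes the same lock:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an object store backed by a hash map that is
// not thread-safe on its own. Not the actual Drools implementation.
class GuardedObjectStore {
    private final Map<Integer, Object> handles = new HashMap<>();
    private final Lock lock = new ReentrantLock();

    // A put may resize the table, so it must hold the lock like any other mutation.
    void addHandle(int id, Object fact) {
        lock.lock();
        try {
            handles.put(id, fact);
        } finally {
            lock.unlock();
        }
    }

    void removeHandle(int id) {
        lock.lock();
        try {
            handles.remove(id);
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try {
            return handles.size();
        } finally {
            lock.unlock();
        }
    }

    // Starts `threads` workers that each insert `perThread` distinct keys,
    // waits for them all, and returns the final number of stored handles.
    static int insertConcurrently(int threads, int perThread) {
        GuardedObjectStore store = new GuardedObjectStore();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    store.addHandle(base + i, new Object());
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return store.size();
    }

    public static void main(String[] args) {
        System.out.println(insertConcurrently(4, 10_000)); // prints 40000
    }
}
```

With the lock held around every put and remove, no thread can observe the
table mid-resize. Whether Drools should take its lock at this point, and
what that would cost, is the question above.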
Any help would be appreciated.
thanks,
Norman