[jbossts-issues] [JBoss JIRA] Commented: (JBTM-599) Synchronization problem in CacheStore

Andrew Dinn (JIRA) jira-events at lists.jboss.org
Tue Aug 4 09:05:29 EDT 2009


    [ https://jira.jboss.org/jira/browse/JBTM-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12478832#action_12478832 ] 

Andrew Dinn commented on JBTM-599:
----------------------------------

I looked further into this and managed to isolate the problem. Each of the test threads does an add then a delete of an object store record. I used a Byteman script to allow 50 threads to proceed and then stall the next 50 threads for a few seconds at the call to addWork. This caused the stall to happen every time I ran the test.

The remove operations alter the wait condition state because they increase the number of pending remove operations. This can cause the condition test made in addWork to see the cache as full. However, the removing threads neither test the condition nor notify the AsyncStore thread when the state changes to full.

So, the AsyncStore thread can deplete the queue of all pending adds and removes and then go into a wait. If this is followed by enough removes with no adds then the cache fills and further adds are stalled. They stay stalled until the AsyncStore thread wakes up from its timed wait.

The fix is to test the wait condition during remove and, if the cache is full, notify the AsyncStore thread.
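
To make the shape of the fix concrete, here is a minimal sketch. It is not the actual CacheStore code: the class AsyncCacheSketch, the single monitor, the capacity counter and the method names removeWork and flushWhenFull are all assumptions made for illustration (only addWork is a name taken from the report). It shows a remove that tests the same wait condition as addWork and notifies the flushing thread when the cache becomes full.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified, hypothetical model of the cache; not the real CacheStore code. */
public class AsyncCacheSketch {
    private final Deque<String> work = new ArrayDeque<>(); // pending adds and removes
    private final int capacity;                            // assumed "full" threshold

    public AsyncCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    // The wait condition: adds and removes both count towards a full cache.
    private boolean isFull() {
        return work.size() >= capacity;
    }

    /** Writers block in an untimed wait while the cache is full. */
    public synchronized void addWork(String id) throws InterruptedException {
        while (isFull()) {
            wait();
        }
        work.add("add:" + id);
        if (isFull()) {
            notifyAll(); // this add filled the cache: wake the flusher
        }
    }

    /** Removes never block but, with the fix, they test the condition too. */
    public synchronized void removeWork(String id) {
        work.add("remove:" + id);
        if (isFull()) {
            notifyAll(); // the fix: otherwise later adds stall until the flusher times out
        }
    }

    /** One flusher pass: timed wait unless full, then drain and wake writers. */
    public synchronized int flushWhenFull(long timeoutMillis) throws InterruptedException {
        if (!isFull()) {
            wait(timeoutMillis);
        }
        int flushed = work.size();
        work.clear();
        notifyAll(); // release any writers blocked in addWork
        return flushed;
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncCacheSketch cache = new AsyncCacheSketch(2);
        Thread flusher = new Thread(() -> {
            try {
                while (true) {
                    cache.flushWhenFull(120_000); // long timed wait, as in the report
                }
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        });
        flusher.setDaemon(true);
        flusher.start();

        cache.removeWork("a");
        cache.removeWork("b"); // fills the cache with removes only
        Thread writer = new Thread(() -> {
            try {
                cache.addWork("c");
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        });
        writer.setDaemon(true);
        writer.start();
        writer.join(5_000);
        System.out.println(writer.isAlive() ? "add stalled" : "add completed");
    }
}
```

In this model, deleting the notifyAll() call in removeWork reproduces the stall: the add blocks until the flusher's timed wait expires.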


> Synchronization problem in CacheStore
> -------------------------------------
>
>                 Key: JBTM-599
>                 URL: https://jira.jboss.org/jira/browse/JBTM-599
>             Project: JBoss Transaction Manager
>          Issue Type: Bug
>      Security Level: Public(Everyone can see) 
>          Components: Transaction Core
>    Affects Versions: 4.7.0
>            Reporter: Andrew Dinn
>
> This bug manifests occasionally when running the CachedTest. The AsyncStore thread suspends inside its run method on a (120 sec) timed wait because the cache is not full. When I managed to catch this case in the debugger I found that there were many writer threads (~30) suspended inside addWork in an untimed wait on the overflow lock. So, these writing threads make no progress for 120 seconds. The AsyncStore thread does notify the object on which the writers are waiting but there is clearly a window where the writers can go to sleep while the cache is full and not get notified when the AsyncStore thread has emptied it.
