[infinispan-issues] [JBoss JIRA] Updated: (ISPN-1255) RequestIgnoredException on rehash using the Distributed Executor Service

Erik Salter (JIRA) jira-events at lists.jboss.org
Fri Jul 22 16:48:23 EDT 2011


     [ https://issues.jboss.org/browse/ISPN-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Salter updated ISPN-1255:
------------------------------

    Attachment: ispn1255.log


I still see the timeout in join() and the cache never recovers.  It happens intermittently, and it's difficult to reproduce reliably.

I updated the unit test to sort the keys as well (in the stressing thread) and changed the key class to use the new Grouping API (which only accepts String groups), and I get timeout exceptions.  They occur AFTER rehashing, so maybe some locks are not being cleaned up?
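
To illustrate the grouping idea mentioned above, here is a minimal plain-Java sketch. It does not use Infinispan's API or its consistent hash; the class GroupRoutingSketch, the key type GroupedKey, the method ownerOf, and the node names are all hypothetical. It only shows the assumption that when the owner is chosen from the group string alone, every key in a group lands on the same node, which is what lets a task be pinned to one data owner.

```java
import java.util.Arrays;
import java.util.List;

public class GroupRoutingSketch {

    // Hypothetical key type; the group is a plain String.
    static final class GroupedKey {
        final String group;
        final String id;

        GroupedKey(String group, String id) {
            this.group = group;
            this.id = id;
        }
    }

    // Toy owner selection (an assumption, not Infinispan's consistent hash):
    // only the group is hashed, so every key in a group maps to the same node.
    static String ownerOf(GroupedKey key, List<String> nodes) {
        int index = Math.floorMod(key.group.hashCode(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "node2", "node3");
        GroupedKey first = new GroupedKey("group-A", "key-1");
        GroupedKey second = new GroupedKey("group-A", "key-2");
        // Same group, so the same owner is chosen for both keys.
        System.out.println(ownerOf(first, nodes).equals(ownerOf(second, nodes))); // prints true
    }
}
```

Note that with this toy modulo scheme the owner of a group changes whenever the node list changes, which is roughly the situation a rehash creates when a node joins.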

I attached a log (ispn1255.log) captured after the rehash.


> RequestIgnoredException on rehash using the Distributed Executor Service
> ------------------------------------------------------------------------
>
>                 Key: ISPN-1255
>                 URL: https://issues.jboss.org/browse/ISPN-1255
>             Project: Infinispan
>          Issue Type: Bug
>    Affects Versions: 5.0.0.CR7
>            Reporter: Erik Salter
>            Assignee: Vladimir Blagojevic
>             Fix For: 5.0.0.FINAL
>
>         Attachments: cacheTest.zip, ispn1255.log, server_node1.log, server_node2.log
>
>
> My application exposes its distributed operations via a REST-based infrastructure.  To minimize the delta between JBoss starting and the cache starting, I used the new Distributed Executor to "sticky" a task to the data owner of a set of keys (with the same hash code). 
> NOTE:  Rehash still causes the problems seen in ISPN-1106.  (Attached new logs)
> I see a lot of the following error from the DistributedExecutorService when the new node's cache doesn't start in a timely manner: 
> Reason: java.lang.IllegalStateException: Invalid response {Satriani-52149(PHL)=RequestIgnoredResponse}
> In addition, I see:
> org.infinispan.util.concurrent.TimeoutException: Timed out waiting for valid responses!
> It takes the cache 2+ minutes to recover at a low throughput rate (30 tx/s).  At a high throughput rate, the cluster doesn't recover.


        

