[
https://issues.jboss.org/browse/ISPN-1255?page=com.atlassian.jira.plugin....
]
Erik Salter updated ISPN-1255:
------------------------------
Attachment: ispn1255.log
I still see the timeout in join() and the cache never recovers. It happens
intermittently, and it's difficult to reproduce reliably.
I updated the unit test to sort the keys as well (in the stressing thread) and changed the
key class to use the new Grouping API (it can only use strings). I get timeout
exceptions, but only AFTER rehashing. Maybe some locks are not being cleaned up?
I attached a log captured after the rehash.
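For readers unfamiliar with why a stress test might sort its keys: a common reason is to acquire per-key locks in one global order so that concurrent threads cannot deadlock on each other. The sketch below illustrates that idea in plain Java, assuming this is the intent of the change. It is not Infinispan's locking code; `SortedLockSketch`, `withLocks`, and `anyLocked` are hypothetical names. The `finally` block shows the kind of cleanup that, if skipped, would leave locks held and cause later timeouts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration only; not part of Infinispan. */
final class SortedLockSketch {
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /**
     * Locks every key in sorted order, runs the action, then releases all locks.
     * Returns false if a lock could not be acquired within timeoutMs.
     */
    boolean withLocks(List<String> keys, long timeoutMs, Runnable action) {
        List<String> ordered = new ArrayList<>(keys);
        Collections.sort(ordered); // one global order for every thread
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String key : ordered) {
                ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
                if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    return false; // timed out; finally releases what we hold
                }
                held.add(lock);
            }
            action.run();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        } finally {
            // Always release in reverse order, even on timeout or exception.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    /** True if any known lock is currently held by some thread. */
    boolean anyLocked() {
        return locks.values().stream().anyMatch(ReentrantLock::isLocked);
    }
}
```

Calling `withLocks(Arrays.asList("b", "a"), 100, task)` locks "a" before "b" regardless of the order the caller supplies, and `anyLocked()` can be used in a test to check that nothing leaked afterwards.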
RequestIgnoredException on rehash using the Distributed Executor Service
------------------------------------------------------------------------
Key: ISPN-1255
URL: https://issues.jboss.org/browse/ISPN-1255
Project: Infinispan
Issue Type: Bug
Affects Versions: 5.0.0.CR7
Reporter: Erik Salter
Assignee: Vladimir Blagojevic
Fix For: 5.0.0.FINAL
Attachments: cacheTest.zip, ispn1255.log, server_node1.log, server_node2.log
My application exposes its distributed operations via a REST-based infrastructure. To
minimize the delay between JBoss starting and the cache starting, I used the new
Distributed Executor to pin ("sticky") a task to the node that owns a set of keys
(all of which share the same hash code).
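A minimal sketch of why keys with the same hash code end up with the same data owner, assuming a simplified modulo-based owner lookup rather than Infinispan's actual consistent hash. `SessionKey`, `StickyOwnerSketch`, and `ownerOf` are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical key type: only the group string feeds hashCode(). */
final class SessionKey {
    final String group;
    final String id;

    SessionKey(String group, String id) {
        this.group = Objects.requireNonNull(group);
        this.id = Objects.requireNonNull(id);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SessionKey)) {
            return false;
        }
        SessionKey other = (SessionKey) o;
        return group.equals(other.group) && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return group.hashCode(); // same group => same hash code
    }
}

final class StickyOwnerSketch {
    /** Toy stand-in for a consistent hash: maps a key to a node index. */
    static int ownerOf(SessionKey key, int nodeCount) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    public static void main(String[] args) {
        List<SessionKey> keys = Arrays.asList(
                new SessionKey("customer-42", "cart"),
                new SessionKey("customer-42", "profile"),
                new SessionKey("customer-42", "orders"));
        for (SessionKey key : keys) {
            // Every key in the group resolves to the same node index.
            System.out.println(key.id + " -> node " + ownerOf(key, 3));
        }
    }
}
```

Because all keys in a group resolve to one owner, a task submitted for that set of keys can run on a single node next to the data. During a rehash the owner can change, which is the window in which the errors described here appear.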
NOTE: Rehash still causes problems seen in ISPN-1106. (Attached new logs)
I see a lot of the following error from the DistributedExecutorService when the new
node's cache doesn't start in a timely manner:
Reason: java.lang.IllegalStateException: Invalid response {Satriani-52149(PHL)=RequestIgnoredResponse}
In addition, I see:
org.infinispan.util.concurrent.TimeoutException: Timed out waiting for valid responses!
At a low throughput rate (30 tx/s), the cache takes 2 or more minutes to recover. At a
high throughput rate, the cluster doesn't recover at all.