[ https://issues.jboss.org/browse/ISPN-1255?page=com.atlassian.jira.plugin.... ]
Dan Berindei edited comment on ISPN-1255 at 7/22/11 7:49 AM:
-------------------------------------------------------------
I decreased the GMS timeouts and the RequestIgnoredResponses disappeared from my log:
{code:xml}
<pbcast.GMS print_local_addr="true"
join_timeout="3000"
leave_timeout="3000"
merge_timeout="60000"
view_ack_collection_timeout="2000"
view_bundling="true"
max_bundling_time="1000"/>
{code}
InboundInvocationHandler will send back a {{RequestIgnoredResponse}} if the cache startup
takes more than 30 seconds, so I'm pretty sure that GMS somehow delays the new
node's startup procedure by {{join_timeout}} milliseconds.
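To make the suspected mechanism concrete, here is a minimal model in plain Java. It is only a sketch and not Infinispan's {{InboundInvocationHandler}} code: {{StartupGate}}, {{markStarted}}, {{handle}} and the {{Reply}} enum are hypothetical names, and the wait limit is a constructor parameter so that the 30-second figure can be replaced with something shorter when experimenting.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/**
 * Simplified model of the behaviour described above. This is NOT Infinispan code;
 * every name here is hypothetical.
 */
public class StartupGate {
    /** Possible outcomes for an incoming request. */
    public enum Reply { HANDLED, REQUEST_IGNORED }

    private final CountDownLatch started = new CountDownLatch(1);
    private final long maxWaitMillis;

    public StartupGate(long maxWaitMillis) {
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Called once when the (modelled) cache has finished starting. */
    public void markStarted() {
        started.countDown();
    }

    /** Waits up to maxWaitMillis for startup; gives up with REQUEST_IGNORED otherwise. */
    public Reply handle() {
        try {
            return started.await(maxWaitMillis, TimeUnit.MILLISECONDS)
                    ? Reply.HANDLED
                    : Reply.REQUEST_IGNORED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return Reply.REQUEST_IGNORED;
        }
    }
}
```

In this model, a gate built with a 30000 ms limit answers {{REQUEST_IGNORED}} whenever {{markStarted()}} is not called within that window, which is the pattern I would expect if something holds up the new node's startup for longer than the handler is willing to wait.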
was (Author: dan.berindei):
I decreased the GMS timeouts and the RequestIgnoredResponses disappeared from my log:
{code:xml}
<pbcast.GMS print_local_addr="true"
join_timeout="3000"
leave_timeout="3000"
merge_timeout="60000"
view_ack_collection_timeout="2000"
view_bundling="true"
max_bundling_time="1000"/>
{code}
I'm pretty sure that this is because GMS delays the new node's start procedure by
{{join_timeout}} milliseconds, and
RequestIgnoredException on rehash using the Distributed Executor Service
------------------------------------------------------------------------
Key: ISPN-1255
URL: https://issues.jboss.org/browse/ISPN-1255
Project: Infinispan
Issue Type: Bug
Affects Versions: 5.0.0.CR7
Reporter: Erik Salter
Assignee: Vladimir Blagojevic
Fix For: 5.0.0.FINAL
Attachments: cacheTest.zip, server_node1.log, server_node2.log
My application exposes its distributed operations through a REST-based infrastructure. To
minimize the gap between JBoss starting and the cache starting, I used the new
Distributed Executor to pin ("sticky") a task to the data owner of a set of keys (all with
the same hash code).
NOTE: Rehash still causes problems seen in ISPN-1106. (Attached new logs)
I see a lot of the following error from the DistributedExecutorService when the new
node's cache doesn't start in a timely manner:
Reason: java.lang.IllegalStateException: Invalid response {Satriani-52149(PHL)=RequestIgnoredResponse}
In addition, I see:
org.infinispan.util.concurrent.TimeoutException: Timed out waiting for valid responses!
At a low throughput rate (30 tx/s), the cache takes more than 2 minutes to recover. At a
high throughput rate, the cluster doesn't recover at all.