[infinispan-issues] [JBoss JIRA] (ISPN-2262) Unresponsiveness in resilience test
Dan Berindei (JIRA)
jira-events at lists.jboss.org
Thu Sep 13 08:46:33 EDT 2012
[ https://issues.jboss.org/browse/ISPN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718236#comment-12718236 ]
Dan Berindei edited comment on ISPN-2262 at 9/13/12 8:45 AM:
-------------------------------------------------------------
A node can receive a message from a joiner before it has installed a JGroups
view that contains the joiner. If it tries to send a response immediately,
it will fail with a SuspectException. And since we don't retry sending state,
the joiner will never finish the state transfer.
The fix is to wait for the local node to install the new JGroups view before
handling any state request. This can be done indirectly, by waiting on the
new CacheTopology instead.
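
A minimal sketch of the "wait for the topology before handling the request" idea, in plain Java. It is not the Infinispan implementation: the class TopologyGate and its methods are hypothetical names used only for illustration, and the sketch assumes each state request carries the topology id it was sent in.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not an Infinispan class): blocks callers until a topology id is installed. */
final class TopologyGate {
    // Highest topology id installed on the local node so far.
    private int installedTopologyId = -1;

    /** Called by the local node after it installs a new topology. */
    public synchronized void topologyInstalled(int topologyId) {
        if (topologyId > installedTopologyId) {
            installedTopologyId = topologyId;
            notifyAll(); // wake up request handlers waiting for this topology
        }
    }

    /** Blocks until the installed topology id is at least requiredTopologyId, or the timeout expires. */
    public synchronized void awaitTopology(int requiredTopologyId, long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (installedTopologyId < requiredTopologyId) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                throw new TimeoutException("Topology " + requiredTopologyId + " was not installed in time");
            }
            // Releases the monitor while waiting; the loop re-checks after every wake-up.
            TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
        }
    }

    public synchronized int installedTopologyId() {
        return installedTopologyId;
    }
}
```

A state request handler would call awaitTopology(requestTopologyId, ...) before building its response, so the response is only sent once the local node knows about the joiner.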
was (Author: dan.berindei):
A node can receive a message from a joiner before it has installed a JGroups
view that contains the joiner. If it tries to send a response immediately,
it will fail with a SuspectException. And since we don't retry sending state,
the joiner will never finish the state transfer.
> Unresponsiveness in resilience test
> -----------------------------------
>
> Key: ISPN-2262
> URL: https://issues.jboss.org/browse/ISPN-2262
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.0.Alpha3
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Blocker
> Fix For: 5.2.0.Alpha4
>
>
> Basic resilience scenario in library mode:
> * spawn 4 nodes
> * kill one node (kill = insert DISCARD protocol and then call CacheManager.stop())
> * start one node
> We use RadarGun (with some modifications) for this test.
> When the node is started again, the system loses some messages: some timeouts are triggered (and the RadarGun stage acknowledgement is missing as well), and later put/get requests cause exceptions.
> The trace logs are too big to attach, so they are published at http://dl.dropbox.com/u/103079234/serverlogs_trace.zip
> Note that the test was shut down manually.