[infinispan-issues] [JBoss JIRA] (ISPN-2697) HotRodServer startup fails when its record cannot be inserted into topology cache
Radim Vansa (JIRA)
jira-events at lists.jboss.org
Tue Jan 15 08:24:21 EST 2013
[ https://issues.jboss.org/browse/ISPN-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745881#comment-12745881 ]
Radim Vansa commented on ISPN-2697:
-----------------------------------
@Dan Berindei: it is not the rate of STABILITY messages (there is no STABLE message) that stays constant, but that of STABLE_GOSSIP. In fact, the STABILITY rate increases with increasing cluster size, and this is the period that should be guarded against sync.repl_timeout. However, I think the period of STABILITY messages on large clusters is pretty long (do we want to set repl_timeout to 6 minutes for a 64-node cluster with a gossip period of 5 seconds?)
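A rough back-of-the-envelope sketch of where a figure like "6 minutes" could come from. It assumes the STABILITY period grows roughly linearly, as cluster size times gossip period; that scaling is an assumption read off the numbers in the comment, not something taken from the JGroups documentation. The class and method names are hypothetical.

```java
import java.time.Duration;

public final class StabilityEstimate {

    // Assumption: one full STABILITY round takes about
    // (number of nodes) x (gossip period). Hypothetical helper, not a JGroups API.
    static Duration estimatedStabilityPeriod(int clusterSize, Duration gossipPeriod) {
        return gossipPeriod.multipliedBy(clusterSize);
    }

    public static void main(String[] args) {
        Duration period = estimatedStabilityPeriod(64, Duration.ofSeconds(5));
        // 64 nodes x 5 s = 320 s, i.e. 5 min 20 s
        System.out.println(period.getSeconds() + " s");
    }
}
```

Under this assumption a replication timeout meant to outlast one STABILITY round on 64 nodes would have to exceed 320 seconds, which is the order of magnitude questioned above.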
> HotRodServer startup fails when its record cannot be inserted into topology cache
> ---------------------------------------------------------------------------------
>
> Key: ISPN-2697
> URL: https://issues.jboss.org/browse/ISPN-2697
> Project: Infinispan
> Issue Type: Bug
> Components: Remote protocols
> Affects Versions: 5.2.0.Beta6
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.0.CR2
>
>
> When the HotRodServer starts, it inserts its record into __hotRodTopologyCache ({{HotRodServer.addSelfToTopologyView(...)}}).
> However, this put may very easily fail. The command is broadcast using the NAKACK2 protocol, so if the message gets lost and no further broadcast message follows it, the message will not be retransmitted and the put operation times out (Replication timeout). This fails the whole HotRodServer startup, all because of one lost UDP message.