[infinispan-issues] [JBoss JIRA] (ISPN-2697) HotRodServer startup fails when its record cannot be inserted into topology cache

Dan Berindei (JIRA) jira-events at lists.jboss.org
Thu Jan 24 11:28:47 EST 2013


    [ https://issues.jboss.org/browse/ISPN-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750300#comment-12750300 ] 

Dan Berindei commented on ISPN-2697:
------------------------------------

Bela, it's hacky because we don't have access to the message when we invoke the put command. We have to sneak around in CommandAwareRpcDispatcher, sniff for the particular command we want to send with RSVP, and add the flag there.

The way we added the RSVP flag for state transfer commands was hacky as well, but at least there it was a simple hack: CommandAwareRpcDispatcher checks if the command is an instance of a certain class (3 classes now, but originally it was a single class) and then sets the flag. 

For the HotRod server, the command is a PutKeyValueCommand, but we don't want to add the RSVP flag to all the PutKeyValueCommands. So we would need additional checks (perhaps the cache name).
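
A minimal sketch of the kind of check described above, assuming simplified stand-in types. `Command`, `StateTransferCommand`, `PutCommand`, `Flag` and `flagsFor` are hypothetical names for illustration only; they are not the actual Infinispan or JGroups classes, and the real CommandAwareRpcDispatcher code may look quite different.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-ins; NOT the real Infinispan or JGroups types.
public class RsvpFlagSketch {
    enum Flag { RSVP }

    interface Command {}

    // Stand-in for a state transfer command: always sent with RSVP.
    static final class StateTransferCommand implements Command {}

    // Stand-in for a put command; carries the name of the cache it targets.
    static final class PutCommand implements Command {
        final String cacheName;
        PutCommand(String cacheName) { this.cacheName = cacheName; }
    }

    // Cache name taken from the issue description.
    static final String TOPOLOGY_CACHE = "__hotRodTopologyCache";

    // Decide the flags by sniffing the command type, plus the cache name for puts.
    static Set<Flag> flagsFor(Command cmd) {
        Set<Flag> flags = EnumSet.noneOf(Flag.class);
        if (cmd instanceof StateTransferCommand) {
            flags.add(Flag.RSVP); // the "simple hack": a plain instanceof check
        } else if (cmd instanceof PutCommand
                && TOPOLOGY_CACHE.equals(((PutCommand) cmd).cacheName)) {
            flags.add(Flag.RSVP); // the extra check: only puts into the topology cache
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println(flagsFor(new StateTransferCommand()));    // [RSVP]
        System.out.println(flagsFor(new PutCommand(TOPOLOGY_CACHE))); // [RSVP]
        System.out.println(flagsFor(new PutCommand("userCache")));    // []
    }
}
```

The second branch is what makes this uglier than the state transfer case: the dispatcher has to know about a specific cache name, not just a command class.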

Actually, this is another problem with RSVP: we can't use it on all the commands. So we still have to set a reasonable STABLE timeout for the regular traffic; otherwise the server will start, but clients will still get TimeoutExceptions during normal operations.

IMO we should strive to detect missing messages and retransmit them in < 1 second. It's true that messages aren't lost very often, unless something is misconfigured, but it's something that does happen and 1s is already 1000x the time it takes to send a message normally.
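
To illustrate why a lost last message goes unnoticed until something else arrives, here is a toy model of sequence-number gap detection. It is only a sketch under simplifying assumptions (a single sender, a hypothetical `onDigest` callback standing in for a periodic stability digest that advertises the sender's highest sequence number); it is not how NAKACK2 or STABLE are actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Toy receiver for a single sender; hypothetical, not JGroups code.
public class GapDetectionSketch {
    private long highestDelivered = 0; // highest seqno delivered in order

    // A data message arrived: report any gap before it, deliver it if it is next.
    List<Long> onMessage(long seqno) {
        List<Long> missing = gapUpTo(seqno - 1);
        if (missing.isEmpty() && seqno == highestDelivered + 1) {
            highestDelivered = seqno;
        }
        return missing;
    }

    // A digest arrived advertising the sender's highest sent seqno.
    List<Long> onDigest(long highestSent) {
        return gapUpTo(highestSent);
    }

    // Seqnos in (highestDelivered, upTo] that would need retransmission.
    private List<Long> gapUpTo(long upTo) {
        List<Long> missing = new ArrayList<>();
        for (long s = highestDelivered + 1; s <= upTo; s++) {
            missing.add(s);
        }
        return missing;
    }

    public static void main(String[] args) {
        GapDetectionSketch receiver = new GapDetectionSketch();
        System.out.println(receiver.onMessage(1)); // [] : nothing missing
        // Message 2 is lost and no message 3 follows, so the receiver
        // has nothing that would reveal the gap...
        System.out.println(receiver.onDigest(2));  // [2] : ...until a digest arrives
    }
}
```

In this model the detection delay for the last message is bounded by how often the digest is sent, which is why the interval for regular traffic matters even when RSVP covers a few special commands.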
                
> HotRodServer startup fails when its record cannot be inserted into topology cache
> ---------------------------------------------------------------------------------
>
>                 Key: ISPN-2697
>                 URL: https://issues.jboss.org/browse/ISPN-2697
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Remote protocols
>    Affects Versions: 5.2.0.Beta6
>            Reporter: Radim Vansa
>            Assignee: Galder Zamarreño
>            Priority: Critical
>             Fix For: 5.2.0.Final
>
>
> When the HotRodServer starts, it inserts its record into __hotRodTopologyCache ({{HotRodServer.addSelfToTopologyView(...)}}).
> However, this put can easily fail. The command is broadcast using the NAKACK2 protocol, so if the message gets lost and no other broadcast message follows it, the message will not be retransmitted and the put operation times out (Replication timeout). This fails the whole HotRodServer startup, all because of one lost UDP message.
