[infinispan-issues] [JBoss JIRA] (ISPN-1654) Topology view management in Hotrod Server is not reliable

Wed Jan 4 12:22:10 EST 2012

    [ https://issues.jboss.org/browse/ISPN-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653849#comment-12653849 ] 

Galder Zamarreño commented on ISPN-1654:
----------------------------------------

Nice catch, guys. The problem was the ordering of the view members. I was using Scala iterators, whose contains only returns true if the element is present and in the same position. Because n3 and n2 were swapped around in the old view, the check failed. Switching to iterables works fine. This only affects the 5.1 beta/CR releases, which is when I changed this code.
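
A minimal sketch of the effect described above, assuming hypothetical node names n1 to n3 (this is not the actual Hot Rod server code). Scala's Iterator.contains advances the iterator while it searches, so repeated lookups against one iterator depend on the order of the elements, whereas an Iterable is traversed from the start on every lookup:

```scala
object ViewMembership {
  // Hypothetical views: same members, but n3 and n2 are swapped in the old one.
  val oldView: List[String] = List("n1", "n3", "n2")
  val newView: List[String] = List("n1", "n2", "n3")

  // One shared iterator: each contains call consumes elements up to and
  // including the first match, so later lookups only see what is left.
  def viaIterator: List[Boolean] = {
    val it = oldView.iterator
    newView.map(node => it.contains(node))
  }

  // An Iterable is traversed from the beginning on every lookup.
  def viaIterable: List[Boolean] = {
    val members: Iterable[String] = oldView
    newView.map(node => members.exists(_ == node))
  }

  def main(args: Array[String]): Unit = {
    println(viaIterator) // List(true, true, false): n3 is wrongly reported missing
    println(viaIterable) // List(true, true, true)
  }
}
```

With the iterator, looking up n2 skips past n3, so the later lookup of n3 finds nothing left to inspect. With the iterable, every member of the new view is found regardless of order.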

> Topology view management in Hotrod Server is not reliable
> ---------------------------------------------------------
>
>                 Key: ISPN-1654
>                 URL: https://issues.jboss.org/browse/ISPN-1654
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Distributed Cache
>    Affects Versions: 5.1.0.CR1
>            Reporter: Jacek Gerbszt
>            Assignee: Galder Zamarreño
>            Priority: Blocker
>             Fix For: 5.1.0.CR3
>
>         Attachments: client.log.gz, front-21.log.gz, front-24.log.gz
>
>
> There is a problem with the management of the cluster topology view in the address cache. You can see it by using a remote client, which is the only way I know to see what is inside the address cache.
> When I restart the whole cluster and make a call from a remote client, I receive the full cluster topology (25 nodes):
> {code}
>  INFO 02 sty 11:24:38 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 - ISPN004006: New topology: [/10.0.36.150:11311, /10.0.36.134:11311, /10.0.36.102:11311, /10.0.36.110:11311, /10.0.36.142:11311, /10.0.36.140:11311, /10.0.36.132:11311, /10.0.36.120:11311, /10.0.36.116:11311, /10.0.36.104:11311, /10.0.36.118:11311, /10.0.36.136:11311, /10.0.36.128:11311, /10.0.36.108:11311, /10.0.36.144:11311, /10.0.36.126:11311, /10.0.36.138:11311, /10.0.36.114:11311, /10.0.36.148:11311, /10.0.36.130:11311, /10.0.36.106:11311, /10.0.36.122:11311, /10.0.36.124:11311, /10.0.36.146:11311, /10.0.36.112:11311]
>  INFO 02 sty 11:24:38 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New server added(/10.0.36.150:11311), adding to the pool.
>  ...
>  INFO 02 sty 11:24:38 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New server added(/10.0.36.112:11311), adding to the pool.
> {code}
> Next, I stop one node (10.0.36.106 in this case) and receive another topology, but not the one I expected:
> {code}
>  INFO 02 sty 11:26:39 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 - ISPN004006: New topology: [/10.0.36.102:11311, /10.0.36.104:11311]
>  INFO 02 sty 11:26:39 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New server added(/10.0.36.102:11311), adding to the pool.
>  INFO 02 sty 11:26:39 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New server added(/10.0.36.104:11311), adding to the pool.
>  INFO 02 sty 11:26:39 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server not in cluster anymore(/10.0.36.148:11311), removing from the pool.
>  ...
>  INFO 02 sty 11:26:39 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server not in cluster anymore(/10.0.36.140:11311), removing from the pool.
> {code}
> The client sees only two nodes: the coordinator (10.0.36.102) and a regular node (10.0.36.104).
> Now I start the stopped node again, and this is the reported topology:
> {code}
>  INFO 02 sty 11:29:29 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 - ISPN004006: New topology: [/10.0.36.102:11311, /10.0.36.104:11311, /10.0.36.106:11311]
>  INFO 02 sty 11:29:29 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New server added(/10.0.36.102:11311), adding to the pool.
>  INFO 02 sty 11:29:29 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New server added(/10.0.36.104:11311), adding to the pool.
>  INFO 02 sty 11:29:29 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New server added(/10.0.36.106:11311), adding to the pool.
> {code}
> The topology is still not valid. Whatever I do, I never receive the full cluster view until all nodes are restarted.
> Worse still, after stopping the coordinator, the client receives an empty topology:
> {code}
>  INFO 02 sty 12:01:15 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 - ISPN004006: New topology: []
>  INFO 02 sty 12:01:15 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server not in cluster anymore(/10.0.36.104:11311), removing from the pool.
>  INFO 02 sty 12:01:15 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server not in cluster anymore(/10.0.36.102:11311), removing from the pool.
>  INFO 02 sty 12:01:15 [main] org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server not in cluster anymore(/10.0.36.106:11311), removing from the pool.
> {code}
> Subsequent calls end with exceptions:
> {code}
>  java.lang.IllegalStateException: We should not reach here!
> 	at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:78)
> 	at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:216)
> 	at org.infinispan.CacheSupport.put(CacheSupport.java:52)
> 	...
> {code}
> Unfortunately, this unreliable behaviour of the remote client prevents me from using the Hot Rod server in production.
