Galder Zamarreño commented on ISPN-1654:
----------------------------------------
Nice catch, guys. It was a problem with the differing order. I was using Scala iterators,
which only return true for contains if the element is present and in the same position.
The fact that n3 and n2 were swapped around in the old view was causing issues. Switching
to iterables works fine. This is only a problem for the 5.1 beta/CR releases, which is
when I changed this code.
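For illustration, a minimal Scala sketch (not the actual server code) of the gotcha
described above: checking several members against a single Iterator consumes it, so the
overall membership check becomes order-sensitive, whereas an Iterable answers each
contains from the start.
{code}
// Hypothetical example, not the Infinispan code: node names and views are made up.
object IteratorVsIterable extends App {
  val oldView = List("n1", "n3", "n2")   // n3 and n2 swapped relative to the new view
  val newView = List("n1", "n2", "n3")

  // Order-sensitive: each contains() call consumes the single Iterator, so an
  // element is only found if it appears at or after the current position.
  val it = newView.iterator
  val iteratorCheck = oldView.forall(node => it.contains(node))

  // Order-insensitive: the Iterable (List) is scanned from the start every time.
  val iterableCheck = oldView.forall(node => newView.contains(node))

  println(s"iterator-based check: $iteratorCheck")   // false
  println(s"iterable-based check: $iterableCheck")   // true
}
{code}
With the swapped order in the old view, the iterator-based check fails even though all
nodes are present, while the iterable-based check still passes.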
Topology view management in Hotrod Server is not reliable
---------------------------------------------------------
Key: ISPN-1654
URL: https://issues.jboss.org/browse/ISPN-1654
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache
Affects Versions: 5.1.0.CR1
Reporter: Jacek Gerbszt
Assignee: Galder Zamarreño
Priority: Blocker
Fix For: 5.1.0.CR3
Attachments: client.log.gz, front-21.log.gz, front-24.log.gz
There is a problem with the management of the cluster topology view in the address cache.
You can see it by using a remote client - that is the only way I know of to see what is
inside the address cache.
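For reference, the log excerpts below come from a plain Hot Rod client. A minimal sketch
along these lines is enough to trigger them; the host, port, and key are placeholders,
and the exact RemoteCacheManager constructor may differ between client versions.
{code}
// Hypothetical probe client used only to observe the topology updates logged
// by the Hot Rod client library; the address below is a placeholder.
import org.infinispan.client.hotrod.RemoteCacheManager

object TopologyProbe extends App {
  // Point the client at any node of the cluster; topology updates arrive
  // with the responses to normal cache operations.
  val rcm = new RemoteCacheManager("10.0.36.102", 11311)
  val cache = rcm.getCache[String, String]()

  // Any operation forces a request/response round trip, which is when the
  // "ISPN004006: New topology" lines appear in the client log.
  cache.put("probe-key", "probe-value")
  println(cache.get("probe-key"))

  rcm.stop()
}
{code}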
When I restart the whole cluster and make a call from the remote client, I receive the
full cluster topology (25 nodes):
{code}
INFO 02 sty 11:24:38 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 -
ISPN004006: New topology: [/10.0.36.150:11311, /10.0.36.134:11311, /10.0.36.102:11311,
/10.0.36.110:11311, /10.0.36.142:11311, /10.0.36.140:11311, /10.0.36.132:11311,
/10.0.36.120:11311, /10.0.36.116:11311, /10.0.36.104:11311, /10.0.36.118:11311,
/10.0.36.136:11311, /10.0.36.128:11311, /10.0.36.108:11311, /10.0.36.144:11311,
/10.0.36.126:11311, /10.0.36.138:11311, /10.0.36.114:11311, /10.0.36.148:11311,
/10.0.36.130:11311, /10.0.36.106:11311, /10.0.36.122:11311, /10.0.36.124:11311,
/10.0.36.146:11311, /10.0.36.112:11311]
INFO 02 sty 11:24:38 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.150:11311), adding to the pool.
...
INFO 02 sty 11:24:38 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.112:11311), adding to the pool.
{code}
Next I stop one node (10.0.36.106 in this case) and receive another topology, but not
the one I expected:
{code}
INFO 02 sty 11:26:39 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 -
ISPN004006: New topology: [/10.0.36.102:11311, /10.0.36.104:11311]
INFO 02 sty 11:26:39 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.102:11311), adding to the pool.
INFO 02 sty 11:26:39 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.104:11311), adding to the pool.
INFO 02 sty 11:26:39 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.148:11311), removing from the pool.
...
INFO 02 sty 11:26:39 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.140:11311), removing from the pool.
{code}
The client sees only two nodes: the coordinator (10.0.36.102) and a regular node
(10.0.36.104).
Now I start the stopped node again. This is the reported topology:
{code}
INFO 02 sty 11:29:29 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 -
ISPN004006: New topology: [/10.0.36.102:11311, /10.0.36.104:11311, /10.0.36.106:11311]
INFO 02 sty 11:29:29 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.102:11311), adding to the pool.
INFO 02 sty 11:29:29 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.104:11311), adding to the pool.
INFO 02 sty 11:29:29 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.106:11311), adding to the pool.
{code}
The topology is still not valid. Whatever I do, I never receive the full cluster view
until all nodes are restarted.
But the worst happens after stopping the coordinator. The client receives an empty
topology:
{code}
INFO 02 sty 12:01:15 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 -
ISPN004006: New topology: []
INFO 02 sty 12:01:15 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.104:11311), removing from the pool.
INFO 02 sty 12:01:15 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.102:11311), removing from the pool.
INFO 02 sty 12:01:15 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.106:11311), removing from the pool.
{code}
Subsequent calls end with exceptions:
{code}
java.lang.IllegalStateException: We should not reach here!
at
org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:78)
at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:216)
at org.infinispan.CacheSupport.put(CacheSupport.java:52)
...
{code}
Unfortunately, this unreliable behaviour of the remote client stops me from using the
HotRod Server in production.