Galder Zamarreño commented on ISPN-1654:
----------------------------------------
The logs attached to the forum entry appear to show an issue with the filtering of
members:
{code}2012-01-03 10:13:26,026 TRACE (notification-thread-2)
[org.infinispan.server.hotrod.HotRodServer] View change received on coordinator:
EventImpl{type=VIEW_CHANGED, newMembers=[n1-16004, n2-22348], oldMembers=[n1-16004,
n3-34263, n2-22348], localAddress=n1-16004, viewId=3,
subgroupsMerged=null, mergeView=false}
2012-01-03 10:13:26,027 TRACE (notification-thread-2)
[org.infinispan.server.hotrod.HotRodServer] Somone left the cluster, oldMembers=non-empty
iterator newMembers=empty iterator
2012-01-03 10:13:26,028 TRACE (notification-thread-2)
[org.infinispan.server.hotrod.HotRodServer] Remove n3-34263 from address cache
2012-01-03 10:13:26,064 TRACE (notification-thread-2)
[org.infinispan.server.hotrod.HotRodServer] Remove n2-22348 from address cache{code}
n2 should not have been removed; investigating.
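To illustrate the filtering that the trace suggests is going wrong, here is a hypothetical Java sketch (the actual HotRodServer code is Scala; the names below are illustrative only): the members to evict from the address cache should be the set difference between the old and new views, so for the view change above only n3-34263 qualifies. If the filtered new-member collection is an iterator that has already been consumed (note the "newMembers=empty iterator" line in the trace), every old member looks like a leaver, which would explain n2 being dropped.
{code}
// Hypothetical sketch, not the actual server code: evict only the members
// present in the old view but absent from the new one.
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ViewDiffSketch {
   static <A> Set<A> leavers(List<A> oldMembers, List<A> newMembers) {
      // Materialise both views into sets before diffing; reusing a filtered
      // iterator twice (once for logging, once for removal) exhausts it and
      // makes the new view appear empty, so every old member would be
      // treated as a leaver - n2 included.
      Set<A> left = new HashSet<A>(oldMembers);
      left.removeAll(new HashSet<A>(newMembers));
      return left; // for the view above this is {n3-34263} only
   }
}
{code}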
Topology view management in Hotrod Server is not reliable
---------------------------------------------------------
Key: ISPN-1654
URL: https://issues.jboss.org/browse/ISPN-1654
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache
Affects Versions: 5.1.0.CR1
Reporter: Jacek Gerbszt
Assignee: Galder Zamarreño
Priority: Blocker
Fix For: 5.1.0.CR3
Attachments: client.log.gz, front-21.log.gz, front-24.log.gz
There is a problem with the management of the cluster topology view in the address cache. You can
see it by using a remote client - that is the only way I know of to see what is inside the
address cache.
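For context, the log excerpts below come from a plain Hot Rod client that is only given a server list; a minimal sketch of such a probe follows (assuming the Properties-based RemoteCacheManager constructor; the address and key names are placeholders):
{code}
// Minimal sketch of a Hot Rod client used to observe the topology the
// servers push back; the address below is a placeholder for one cluster node.
import java.util.Properties;
import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;

public class TopologyProbe {
   public static void main(String[] args) {
      Properties props = new Properties();
      // Point the client at a single node; the rest of the view arrives via
      // topology updates (logged as ISPN004006 in the excerpts below).
      props.put("infinispan.client.hotrod.server_list", "10.0.36.102:11311");
      RemoteCacheManager rcm = new RemoteCacheManager(props);
      RemoteCache<String, String> cache = rcm.getCache();
      // Each response may carry a new topology, which the client logs and
      // uses to add or drop servers from its pool (ISPN004014/ISPN004016).
      cache.put("probe", "value");
      System.out.println(cache.get("probe"));
      rcm.stop();
   }
}
{code}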
When I restart the whole cluster and make a call from the remote client, I receive the full
cluster topology (25 nodes):
{code}
INFO 02 sty 11:24:38 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 -
ISPN004006: New topology: [/10.0.36.150:11311, /10.0.36.134:11311, /10.0.36.102:11311,
/10.0.36.110:11311, /10.0.36.142:11311, /10.0.36.140:11311, /10.0.36.132:11311,
/10.0.36.120:11311, /10.0.36.116:11311, /10.0.36.104:11311, /10.0.36.118:11311,
/10.0.36.136:11311, /10.0.36.128:11311, /10.0.36.108:11311, /10.0.36.144:11311,
/10.0.36.126:11311, /10.0.36.138:11311, /10.0.36.114:11311, /10.0.36.148:11311,
/10.0.36.130:11311, /10.0.36.106:11311, /10.0.36.122:11311, /10.0.36.124:11311,
/10.0.36.146:11311, /10.0.36.112:11311]
INFO 02 sty 11:24:38 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.150:11311), adding to the pool.
...
INFO 02 sty 11:24:38 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.112:11311), adding to the pool.
{code}
Next I stop one node (10.0.36.106 in this case) and receive another topology, but not the one I
expected:
{code}
INFO 02 sty 11:26:39 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 -
ISPN004006: New topology: [/10.0.36.102:11311, /10.0.36.104:11311]
INFO 02 sty 11:26:39 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.102:11311), adding to the pool.
INFO 02 sty 11:26:39 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.104:11311), adding to the pool.
INFO 02 sty 11:26:39 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.148:11311), removing from the pool.
...
INFO 02 sty 11:26:39 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.140:11311), removing from the pool.
{code}
The client sees only two nodes: the coordinator (10.0.36.102) and a regular node (10.0.36.104).
Now I start the stopped node again. This is the reported topology:
{code}
INFO 02 sty 11:29:29 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 -
ISPN004006: New topology: [/10.0.36.102:11311, /10.0.36.104:11311, /10.0.36.106:11311]
INFO 02 sty 11:29:29 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.102:11311), adding to the pool.
INFO 02 sty 11:29:29 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.104:11311), adding to the pool.
INFO 02 sty 11:29:29 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004014: New
server added(/10.0.36.106:11311), adding to the pool.
{code}
The topology is still not valid. Whatever I do, I never receive the full cluster view until all
the nodes are restarted.
But the worst happens after stopping the coordinator. The client receives an empty
topology:
{code}
INFO 02 sty 12:01:15 [main] org.infinispan.client.hotrod.impl.protocol.Codec11 -
ISPN004006: New topology: []
INFO 02 sty 12:01:15 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.104:11311), removing from the pool.
INFO 02 sty 12:01:15 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.102:11311), removing from the pool.
INFO 02 sty 12:01:15 [main]
org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory - ISPN004016: Server
not in cluster anymore(/10.0.36.106:11311), removing from the pool.
{code}
Subsequent calls end with exceptions:
{code}
java.lang.IllegalStateException: We should not reach here!
at
org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:78)
at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:216)
at org.infinispan.CacheSupport.put(CacheSupport.java:52)
...
{code}
Unfortunately, this unreliable behaviour of the remote client stops me from using the HotRod
Server in production.