[
https://issues.jboss.org/browse/ISPN-6599?page=com.atlassian.jira.plugin....
]
Gustavo Fernandes commented on ISPN-6599:
-----------------------------------------
I did some debugging with the issue and it seems there's an issue with the way the
Triangle handles async acks.
The reproducer does a series of PutMap against a 3 node server, and while it's doing
it, one of the servers is killed. One of the entries (key B0x034b0000003b) is not written
as a result of it.
Here's the sequence of events for a particular request
1. The Hot Rod client does a put Map
{noformat}
15:38:28,486 TRACE (HotRod-client-async-pool-29:[]) [Codec21] Wrote header for
messageId=47 to TcpTransport{socket=Socket[addr=/127.0.0.1,port=13222,localport=52364],
serverAddress=/127.0.0.1:13222, id =2} . Operation code: 0x2d. Flags: 0x6. Topology id: 1
15:38:28,486 TRACE (HotRod-client-async-pool-29:[]) [TcpTransport] Flushed socket:
Socket[addr=/127.0.0.1,port=13222,localport=52364]
{noformat}
2. Server2 receives the request:
{noformat}
2017-01-12 15:38:28,502 TRACE [org.infinispan.server.hotrod.HotRodDecoder]
(HotRodServerWorker-7-1) Decoded header HotRodHeader{op=PUT_ALL, version=25, messageId=47,
cacheName='', flag=6, clientIntel=3, topologyId=1}
2017-01-12 15:38:28,503 TRACE
[org.infinispan.interceptors.impl.InvocationContextInterceptor]
(HotRodServerHandler-10-15) Invoked with command
PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]},
flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1,
version=NumericVersion{version=562954248388620}}, isForwarded=false} and InvocationContext
[org.infinispan.context.SingleKeyNonTxInvocationContext@2550fdca]
2017-01-12 15:38:28,503 TRACE [org.infinispan.statetransfer.StateTransferInterceptor]
(HotRodServerHandler-10-15) handleNonTxWriteCommand for command
PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]},
flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1,
version=NumericVersion{version=562954248388620}}, isForwarded=false}, topology id -1
{noformat}
3. server2 forwards the command to server1 (which is the owner of the key), and waits for
async acks:
{noformat}
2017-01-12 15:38:28,503 TRACE
[org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor]
(HotRodServerHandler-10-15) Are (server2) we the lock owners for key
'[B0x034b0000003b'? false
2017-01-12 15:38:28,505 TRACE [org.infinispan.util.concurrent.CommandAckCollector]
(HotRodServerHandler-10-15) Created new collector for CommandInvocation:server2:41.
PrimarySegments=[server1]. BackupSegments
2017-01-12 15:38:28,508 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport]
(HotRodServerHandler-10-15) sendTo: destination=server1,
command=SingleRpcCommand{cacheName='default',
command=PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]},
flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1,
version=NumericVersion{version=562954248388620}}, isForwarded=false}}, order=NONE
2017-01-12 15:38:28,508 TRACE [org.jgroups.protocols.UNICAST3] (HotRodServerHandler-10-15)
server2 --> DATA(server1: #57, conn_id=2)
2017-01-12 15:38:28,509 TRACE [org.jgroups.protocols.TCP] (HotRodServerHandler-10-15)
server2: sending msg to server1, src=server2, headers are RequestCorrelator: corr_id=200,
type=REQ, req_id=0, rsp_expected=false, FORK: cluster:clustered, UNICAST3: DATA, seqno=57,
conn_id=2, TP: [cluster_name=cluster]
2017-01-12 15:38:28,509 TRACE [org.infinispan.util.concurrent.CommandAckCollector]
(HotRodServerHandler-10-15) [Collector#CommandInvocation:server2:41] Waiting for acks
asynchronously.
{noformat}
4. Server1 receives the command, but since topology is outdated, sends back an
{{ExceptionAckCommand}} to server2
{noformat}
2017-01-12 15:38:28,509 TRACE [org.jgroups.protocols.UNICAST3] (jgroups-4,server1) server1
<-- DATA(server2: #57, conn_id=2)
2017-01-12 15:38:28,510 TRACE
[org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (jgroups-4,server1) Lock
key=[B0x034b0000003b for owner=CommandInvocation:server2:41. timeout=30000 (MILLISECONDS)
2017-01-12 15:38:28,510 TRACE [org.infinispan.interceptors.TriangleAckInterceptor]
(remote-thread--p2-t12) Sending exception ack for command CommandInvocation:server2:41.
Originator=server2.
2017-01-12 15:38:28,510 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl]
(remote-thread--p2-t12) server1 invoking
ExceptionAckCommand{commandInvocationId=CommandInvocation:server2:41,
throwable=org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed
while the command was executing: expected 2, got 3, topologyId=2} to server2 ordered by
NONE
2017-01-12 15:38:28,508 TRACE [org.jgroups.protocols.UNICAST3] (jgroups-17,server1)
server1: delivering server2#56
{noformat}
5. Server2 processes the {{ExceptionAckCommand}}
{noformat}
2017-01-12 15:38:28,570 TRACE
[org.infinispan.remoting.inboundhandler.NonTotalOrderPerCacheInboundInvocationHandler]
(jgroups-3,server2) Calling perform() on
ExceptionAckCommand{commandInvocationId=CommandInvocation:server2:41,
throwable=org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed
while the command was executing: expected 2, got 3, topologyId=2}
2017-01-12 15:38:28,570 TRACE [org.infinispan.util.concurrent.CommandAckCollector]
(jgroups-3,server2) [Collector#CommandInvocation:server2:41] completed exceptionally.
TopologyId=2 (expected=2): org.infinispan.statetransfer.OutdatedTopologyException: Cache
topology changed while the command was executing: expected 2, got 3
{noformat}
6. It seems the command is not retried and the server2 returns success to the Hot Rod
client
{noformat}
2017-01-12 15:38:28,582 TRACE [org.infinispan.server.hotrod.HotRodEncoder]
(HotRodServerWorker-7-1) Encode msg EmptyResponse{version=25, messageId=47,
cacheName='', clientIntel=3, operation=PUT_ALL, status=Success, topologyId=1}
{noformat}
7. The client receives a clean response, without any exception, but the key was not
actually written
{noformat}
15:38:28,604 TRACE (HotRod-client-async-pool-29:[]) [Codec21] Received response for
messageId=47
{noformat}
PutAll operation in the Hot Rod client only partially completed
during topology changes
---------------------------------------------------------------------------------------
Key: ISPN-6599
URL:
https://issues.jboss.org/browse/ISPN-6599
Project: Infinispan
Issue Type: Bug
Components: Server
Affects Versions: 8.2.1.Final, 9.0.0.Alpha1
Reporter: Gustavo Fernandes
Assignee: Dan Berindei
Fix For: 9.0.0.Alpha2, 8.2.2.Final, 9.0.0.Final
Attachments: reproducer.zip
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)