[JBoss JIRA] (ISPN-4131) Lock acquired forever with delayed PrepareCommand
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4131?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4131:
-----------------------------------------------
Jiří Bílek <jbilek(a)redhat.com> changed the Status of [bug 1378877|https://bugzilla.redhat.com/show_bug.cgi?id=1378877] from ON_QA to VERIFIED
> Lock acquired forever with delayed PrepareCommand
> -------------------------------------------------
>
> Key: ISPN-4131
> URL: https://issues.jboss.org/browse/ISPN-4131
> Project: Infinispan
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 6.0.2.Final, 7.0.0.Alpha1
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
> Labels: 630betablocker
> Fix For: 7.0.0.Beta1, 7.0.0.Final
>
>
> Distributed transactional cache:
> 1. A sends Prepare to B
> 2. B receives Prepare, but due to ongoing ST it is blocked
> 3. B replication timeout elapses
> 4. B sends Rollback, this does not find the TX as Prepare was not executed yet. The transaction is put into completedTransactions.
> 5. Completed transactions timeout elapses. This is by default 15 seconds, way shorter than ST timeout (due to which the Prepare was blocked)
> 6. Prepare is executed on B, acquiring lock on K
> Nobody will rollback the TX as originator thinks it was already rolled back.
> Result: key K will be locked forever, all attempts to update/remove it will fail.
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
9 years, 2 months
[JBoss JIRA] (ISPN-5729) Remote query - org/apache/lucene/uninverting/UninvertingReader missing on "order by" query
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-5729?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-5729:
--------------------------------
Status: Pull Request Sent (was: Coding In Progress)
Git Pull Request: https://github.com/infinispan/infinispan/pull/4758
> Remote query - org/apache/lucene/uninverting/UninvertingReader missing on "order by" query
> ------------------------------------------------------------------------------------------
>
> Key: ISPN-5729
> URL: https://issues.jboss.org/browse/ISPN-5729
> Project: Infinispan
> Issue Type: Bug
> Components: Remote Querying
> Affects Versions: 8.0.0.Final
> Reporter: Matej Čimbora
> Assignee: Adrian Nistor
> Fix For: 9.0.0.Beta2
>
>
> Consider simple NumberObject class with the following attributes:
> private int integerValue;
> private double doubleValue;
> The following query leads to NoClassDefFoundError on server. Looks like using orderBy causes the problem.
>
> {code}
> QueryBuilder<Query> builder = qf.from(NumberObject.class).orderBy("integerValue").having("integerValue").gte(50).and().having("integerValue").lte(150).toBuilder();
> {code}
> Use the following snippet to load data into cache.
> {code}
> final int numEntries = 200;
> for (int i = 0; i < numEntries; i++) {
> NumberObject no = new NumberObject(i, i);
> manager.cache.put(i, no);
> }
> {code}
> Server exception:
> 12:55:24,158 ERROR [org.infinispan.server.hotrod.CacheDecodeContext] (HotRodServerWorker-9-1) ISPN005009: Unexpected error before any request parameters read: io.netty.handler.codec.DecoderException: java.lang.NoClassDefFoundError: org/apache/lucene/uninverting/UninvertingReader$Type
> at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:425)
> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:168)
> at org.infinispan.server.hotrod.HotRodDecoder.org$infinispan$server$core$transport$StatsChannelHandler$$super$channelRead(HotRodDecoder.scala:31)
> at org.infinispan.server.core.transport.StatsChannelHandler$class.channelRead(StatsChannelHandler.scala:32)
> at org.infinispan.server.hotrod.HotRodDecoder.channelRead(HotRodDecoder.scala:31)
> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
> at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NoClassDefFoundError: org/apache/lucene/uninverting/UninvertingReader$Type
> at org.hibernate.search.reader.impl.ManagedMultiReader.getMappings(ManagedMultiReader.java:102)
> at org.hibernate.search.reader.impl.ManagedMultiReader.getEffectiveReaders(ManagedMultiReader.java:68)
> at org.hibernate.search.reader.impl.ManagedMultiReader.createInstance(ManagedMultiReader.java:53)
> at org.hibernate.search.reader.impl.MultiReaderFactory.openReader(MultiReaderFactory.java:52)
> at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:619)
> at org.hibernate.search.query.engine.impl.HSQueryImpl.buildSearcher(HSQueryImpl.java:523)
> at org.hibernate.search.query.engine.impl.HSQueryImpl.queryEntityInfos(HSQueryImpl.java:263)
> at org.infinispan.query.impl.CacheQueryImpl.list(CacheQueryImpl.java:161)
> at org.infinispan.query.dsl.embedded.impl.EmbeddedLuceneQuery.list(EmbeddedLuceneQuery.java:49)
> at org.infinispan.query.remote.impl.QueryFacadeImpl.makeResponse(QueryFacadeImpl.java:84)
> at org.infinispan.query.remote.impl.QueryFacadeImpl.query(QueryFacadeImpl.java:64)
> at org.infinispan.server.hotrod.Decoder2x$.customReadKey(Decoder2x.scala:370)
> at org.infinispan.server.hotrod.HotRodDecoder.customDecodeKey(HotRodDecoder.scala:194)
> at org.infinispan.server.hotrod.HotRodDecoder.org$infinispan$server$hotrod$HotRodDecoder$$decodeKey(HotRodDecoder.scala:104)
> at org.infinispan.server.hotrod.HotRodDecoder$$anonfun$decode$1.apply$mcV$sp(HotRodDecoder.scala:48)
> at org.infinispan.server.hotrod.HotRodDecoder.wrapSecurity(HotRodDecoder.scala:206)
> at org.infinispan.server.hotrod.HotRodDecoder.decode(HotRodDecoder.scala:45)
> at io.netty.handler.codec.ReplayingDecoder.callDecode(ReplayingDecoder.java:370)
> ... 15 more
> Caused by: java.lang.ClassNotFoundException: org.apache.lucene.uninverting.UninvertingReader$Type from [Module "org.hibernate.search.engine:main" from local module loader @11028347 (finder: local module finder @14899482 (roots: /home/mcimbora/Projects/infinispan/infinispan/server/integration/build/target/infinispan-server-8.0.0.Final/modules,/home/mcimbora/Projects/infinispan/infinispan/server/integration/build/target/infinispan-server-8.0.0.Final/modules/system/layers/base))]
> at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:205)
> at org.jboss.modules.ConcurrentClassLoader.performLoadClassUnchecked(ConcurrentClassLoader.java:455)
> at org.jboss.modules.ConcurrentClassLoader.performLoadClassChecked(ConcurrentClassLoader.java:404)
> at org.jboss.modules.ConcurrentClassLoader.performLoadClass(ConcurrentClassLoader.java:385)
> at org.jboss.modules.ConcurrentClassLoader.loadClass(ConcurrentClassLoader.java:130)
> ... 33 more
> Cache configuration
> {code:xml}
> <distributed-cache name="dist_infinispan" mode="SYNC">
> <locking acquire-timeout="3000" concurrency-level="1000"/>
> <transaction mode="NONE"/>
> <indexing index="LOCAL">
> <property name="default.directory_provider">infinispan</property>
> <property name="default.indexmanager">org.infinispan.query.indexmanager.InfinispanIndexManager</property>
> <property name="default.exclusive_index_use">true</property>
> <property name="default.metadata_cachename">lucene_metadata_repl</property>
> <property name="default.data_cachename">lucene_data_repl</property>
> <property name="default.locking_cachename">lucene_locking_repl</property>
> </indexing>
> </distributed-cache>
> <replicated-cache name="lucene_metadata_repl" mode="SYNC"/>
> <replicated-cache name="lucene_locking_repl" mode="SYNC"/>
> <replicated-cache name="lucene_data_repl" mode="SYNC"/>
> {code}
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
9 years, 2 months
[JBoss JIRA] (ISPN-6599) PutAll operation in the Hot Rod client only partially completed during topology changes
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-6599?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes commented on ISPN-6599:
-----------------------------------------
When the reproducer fails, it's possible to attach jconsole in the servers, and the distribution of keys is uneven: server1 has 998 entries and server2 has 1000 entries (server0 is killed by then).
Creating a Hot Rod client in a shell and calling remoteCache.keySet().size() alternates results between 998 and 1000 entries
> PutAll operation in the Hot Rod client only partially completed during topology changes
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-6599
> URL: https://issues.jboss.org/browse/ISPN-6599
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 8.2.1.Final, 9.0.0.Alpha1
> Reporter: Gustavo Fernandes
> Assignee: Dan Berindei
> Fix For: 9.0.0.Alpha2, 8.2.2.Final, 9.0.0.Final
>
> Attachments: reproducer.zip
>
>
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
9 years, 2 months
[JBoss JIRA] (ISPN-6599) PutAll operation in the Hot Rod client only partially completed during topology changes
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-6599?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes edited comment on ISPN-6599 at 1/12/17 3:07 PM:
------------------------------------------------------------------
When the reproducer fails, it's possible to attach jconsole in the servers, and verify via JMX that the distribution of entries is uneven: server1 has 998 entries and server2 has 1000 entries (server0 is killed by then).
Creating a Hot Rod client in a shell and calling remoteCache.keySet().size() alternates results between 998 and 1000 entries
was (Author: gustavonalle):
When the reproducer fails, it's possible to attach jconsole in the servers, and the distribution of keys is uneven: server1 has 998 entries and server2 has 1000 entries (server0 is killed by then).
Creating a Hot Rod client in a shell and calling remoteCache.keySet().size() alternates results between 998 and 1000 entries
> PutAll operation in the Hot Rod client only partially completed during topology changes
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-6599
> URL: https://issues.jboss.org/browse/ISPN-6599
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 8.2.1.Final, 9.0.0.Alpha1
> Reporter: Gustavo Fernandes
> Assignee: Dan Berindei
> Fix For: 9.0.0.Alpha2, 8.2.2.Final, 9.0.0.Final
>
> Attachments: reproducer.zip
>
>
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
9 years, 2 months
[JBoss JIRA] (ISPN-5655) MissingFormatArgumentException thrown by PreferConsistencyStrategy
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5655?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-5655:
------------------------------------------
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1412752
Bugzilla Update: Perform
> MissingFormatArgumentException thrown by PreferConsistencyStrategy
> ------------------------------------------------------------------
>
> Key: ISPN-5655
> URL: https://issues.jboss.org/browse/ISPN-5655
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 8.0.0.Beta2
> Reporter: Ryan Emerson
> Assignee: Ryan Emerson
> Fix For: 8.0.0.Beta3, 8.0.0.Final
>
>
> Exception thrown due to line 197 in PreferConsistencyStrategy.java
> 2015-08-03 10:30:38,873 ERROR Unable to format msg: After merge, cache %s has recovered and is entering available mode java.util.MissingFormatArgumentException: Format specifier '%s'
> at java.util.Formatter.format(Formatter.java:2519)
> at java.util.Formatter.format(Formatter.java:2455)
> at java.lang.String.format(String.java:2928)
> at org.apache.logging.log4j.message.StringFormattedMessage.formatMessage(StringFormattedMessage.java:88)
> at org.apache.logging.log4j.message.StringFormattedMessage.getFormattedMessage(StringFormattedMessage.java:60)
> at org.apache.logging.log4j.core.pattern.MessagePatternConverter.format(MessagePatternConverter.java:68)
> at org.apache.logging.log4j.core.pattern.PatternFormatter.format(PatternFormatter.java:36)
> at org.apache.logging.log4j.core.layout.PatternLayout.toSerializable(PatternLayout.java:196)
> at org.apache.logging.log4j.core.layout.PatternLayout.toSerializable(PatternLayout.java:55)
> at org.apache.logging.log4j.core.layout.AbstractStringLayout.toByteArray(AbstractStringLayout.java:71)
> at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:108)
> at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:99)
> at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:430)
> at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:409)
> at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:412)
> at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:367)
> at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:112)
> at org.jboss.logging.Log4j2Logger.doLogf(Log4j2Logger.java:66)
> at org.jboss.logging.Logger.logf(Logger.java:2445)
> at org.jboss.logging.DelegatingBasicLogger.debugf(DelegatingBasicLogger.java:344)
> at org.infinispan.partitionhandling.impl.PreferConsistencyStrategy.onPartitionMerge(PreferConsistencyStrategy.java:198)
> at org.infinispan.topology.ClusterCacheStatus.doMergePartitions(ClusterCacheStatus.java:509)
> at org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:383)
> at org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:380)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at org.infinispan.executors.SemaphoreCompletionService$QueueingTask.runInternal(SemaphoreCompletionService.java:173)
> at org.infinispan.executors.SemaphoreCompletionService$QueueingTask.run(SemaphoreCompletionService.java:151)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
9 years, 2 months
[JBoss JIRA] (ISPN-6599) PutAll operation in the Hot Rod client only partially completed during topology changes
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-6599?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes updated ISPN-6599:
------------------------------------
Comment: was deleted
(was: I did some debugging with the issue and it seems there's an issue with the way the Triangle handles async acks.
The reproducer does a series of PutMap against a 3 node server, and while it's doing it, one of the servers is killed. One of the entries (key B0x034b0000003b) is not written as a result of it.
Here's the sequence of events for a particular request
1. The Hot Rod client does a put Map
{noformat}
15:38:28,486 TRACE (HotRod-client-async-pool-29:[]) [Codec21] Wrote header for messageId=47 to TcpTransport{socket=Socket[addr=/127.0.0.1,port=13222,localport=52364], serverAddress=/127.0.0.1:13222, id =2} . Operation code: 0x2d. Flags: 0x6. Topology id: 1
15:38:28,486 TRACE (HotRod-client-async-pool-29:[]) [TcpTransport] Flushed socket: Socket[addr=/127.0.0.1,port=13222,localport=52364]
{noformat}
2. Server2 receives the request:
{noformat}
2017-01-12 15:38:28,502 TRACE [org.infinispan.server.hotrod.HotRodDecoder] (HotRodServerWorker-7-1) Decoded header HotRodHeader{op=PUT_ALL, version=25, messageId=47, cacheName='', flag=6, clientIntel=3, topologyId=1}
2017-01-12 15:38:28,503 TRACE [org.infinispan.interceptors.impl.InvocationContextInterceptor] (HotRodServerHandler-10-15) Invoked with command PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]}, flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=562954248388620}}, isForwarded=false} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@2550fdca]
2017-01-12 15:38:28,503 TRACE [org.infinispan.statetransfer.StateTransferInterceptor] (HotRodServerHandler-10-15) handleNonTxWriteCommand for command PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]}, flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=562954248388620}}, isForwarded=false}, topology id -1
{noformat}
3. server2 forwards the command to server1 (which is the owner of the key), and waits for async acks:
{noformat}
2017-01-12 15:38:28,503 TRACE [org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor] (HotRodServerHandler-10-15) Are (server2) we the lock owners for key '[B0x034b0000003b'? false
2017-01-12 15:38:28,505 TRACE [org.infinispan.util.concurrent.CommandAckCollector] (HotRodServerHandler-10-15) Created new collector for CommandInvocation:server2:41. PrimarySegments=[server1]. BackupSegments
2017-01-12 15:38:28,508 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (HotRodServerHandler-10-15) sendTo: destination=server1, command=SingleRpcCommand{cacheName='default', command=PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]}, flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=562954248388620}}, isForwarded=false}}, order=NONE
2017-01-12 15:38:28,508 TRACE [org.jgroups.protocols.UNICAST3] (HotRodServerHandler-10-15) server2 --> DATA(server1: #57, conn_id=2)
2017-01-12 15:38:28,509 TRACE [org.jgroups.protocols.TCP] (HotRodServerHandler-10-15) server2: sending msg to server1, src=server2, headers are RequestCorrelator: corr_id=200, type=REQ, req_id=0, rsp_expected=false, FORK: cluster:clustered, UNICAST3: DATA, seqno=57, conn_id=2, TP: [cluster_name=cluster]
2017-01-12 15:38:28,509 TRACE [org.infinispan.util.concurrent.CommandAckCollector] (HotRodServerHandler-10-15) [Collector#CommandInvocation:server2:41] Waiting for acks asynchronously.
{noformat}
4. Server1 receives the command, but since topology is outdated, sends back an {{ExceptionAckCommand}} to server2
{noformat}
2017-01-12 15:38:28,509 TRACE [org.jgroups.protocols.UNICAST3] (jgroups-4,server1) server1 <-- DATA(server2: #57, conn_id=2)
2017-01-12 15:38:28,510 TRACE [org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (jgroups-4,server1) Lock key=[B0x034b0000003b for owner=CommandInvocation:server2:41. timeout=30000 (MILLISECONDS)
2017-01-12 15:38:28,510 TRACE [org.infinispan.interceptors.TriangleAckInterceptor] (remote-thread--p2-t12) Sending exception ack for command CommandInvocation:server2:41. Originator=server2.
2017-01-12 15:38:28,510 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (remote-thread--p2-t12) server1 invoking ExceptionAckCommand{commandInvocationId=CommandInvocation:server2:41, throwable=org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed while the command was executing: expected 2, got 3, topologyId=2} to server2 ordered by NONE
2017-01-12 15:38:28,508 TRACE [org.jgroups.protocols.UNICAST3] (jgroups-17,server1) server1: delivering server2#56
{noformat}
5. Server2 processes the {{ExceptionAckCommand}}
{noformat}
2017-01-12 15:38:28,570 TRACE [org.infinispan.remoting.inboundhandler.NonTotalOrderPerCacheInboundInvocationHandler] (jgroups-3,server2) Calling perform() on ExceptionAckCommand{commandInvocationId=CommandInvocation:server2:41, throwable=org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed while the command was executing: expected 2, got 3, topologyId=2}
2017-01-12 15:38:28,570 TRACE [org.infinispan.util.concurrent.CommandAckCollector] (jgroups-3,server2) [Collector#CommandInvocation:server2:41] completed exceptionally. TopologyId=2 (expected=2): org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed while the command was executing: expected 2, got 3
{noformat}
6. It seems the command is not retried and the server2 returns success to the Hot Rod client
{noformat}
2017-01-12 15:38:28,582 TRACE [org.infinispan.server.hotrod.HotRodEncoder] (HotRodServerWorker-7-1) Encode msg EmptyResponse{version=25, messageId=47, cacheName='', clientIntel=3, operation=PUT_ALL, status=Success, topologyId=1}
{noformat}
7. The client receives a clean response, without any exception, but the key was not actually written
{noformat}
15:38:28,604 TRACE (HotRod-client-async-pool-29:[]) [Codec21] Received response for messageId=47
{noformat}
)
> PutAll operation in the Hot Rod client only partially completed during topology changes
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-6599
> URL: https://issues.jboss.org/browse/ISPN-6599
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 8.2.1.Final, 9.0.0.Alpha1
> Reporter: Gustavo Fernandes
> Assignee: Dan Berindei
> Fix For: 9.0.0.Alpha2, 8.2.2.Final, 9.0.0.Final
>
> Attachments: reproducer.zip
>
>
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
9 years, 2 months
[JBoss JIRA] (ISPN-6599) PutAll operation in the Hot Rod client only partially completed during topology changes
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-6599?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes commented on ISPN-6599:
-----------------------------------------
I did some debugging with the issue and it seems there's an issue with the way the Triangle handles async acks.
The reproducer does a series of PutMap against a 3 node server, and while it's doing it, one of the servers is killed. One of the entries (key B0x034b0000003b) is not written as a result of it.
Here's the sequence of events for a particular request
1. The Hot Rod client does a put Map
{noformat}
15:38:28,486 TRACE (HotRod-client-async-pool-29:[]) [Codec21] Wrote header for messageId=47 to TcpTransport{socket=Socket[addr=/127.0.0.1,port=13222,localport=52364], serverAddress=/127.0.0.1:13222, id =2} . Operation code: 0x2d. Flags: 0x6. Topology id: 1
15:38:28,486 TRACE (HotRod-client-async-pool-29:[]) [TcpTransport] Flushed socket: Socket[addr=/127.0.0.1,port=13222,localport=52364]
{noformat}
2. Server2 receives the request:
{noformat}
2017-01-12 15:38:28,502 TRACE [org.infinispan.server.hotrod.HotRodDecoder] (HotRodServerWorker-7-1) Decoded header HotRodHeader{op=PUT_ALL, version=25, messageId=47, cacheName='', flag=6, clientIntel=3, topologyId=1}
2017-01-12 15:38:28,503 TRACE [org.infinispan.interceptors.impl.InvocationContextInterceptor] (HotRodServerHandler-10-15) Invoked with command PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]}, flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=562954248388620}}, isForwarded=false} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@2550fdca]
2017-01-12 15:38:28,503 TRACE [org.infinispan.statetransfer.StateTransferInterceptor] (HotRodServerHandler-10-15) handleNonTxWriteCommand for command PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]}, flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=562954248388620}}, isForwarded=false}, topology id -1
{noformat}
3. server2 forwards the command to server1 (which is the owner of the key), and waits for async acks:
{noformat}
2017-01-12 15:38:28,503 TRACE [org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor] (HotRodServerHandler-10-15) Are (server2) we the lock owners for key '[B0x034b0000003b'? false
2017-01-12 15:38:28,505 TRACE [org.infinispan.util.concurrent.CommandAckCollector] (HotRodServerHandler-10-15) Created new collector for CommandInvocation:server2:41. PrimarySegments=[server1]. BackupSegments
2017-01-12 15:38:28,508 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (HotRodServerHandler-10-15) sendTo: destination=server1, command=SingleRpcCommand{cacheName='default', command=PutMapCommand{map={[B0x034b0000003b=[B0x030409000000226f..[127]}, flags=[IGNORE_RETURN_VALUES], metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=NumericVersion{version=562954248388620}}, isForwarded=false}}, order=NONE
2017-01-12 15:38:28,508 TRACE [org.jgroups.protocols.UNICAST3] (HotRodServerHandler-10-15) server2 --> DATA(server1: #57, conn_id=2)
2017-01-12 15:38:28,509 TRACE [org.jgroups.protocols.TCP] (HotRodServerHandler-10-15) server2: sending msg to server1, src=server2, headers are RequestCorrelator: corr_id=200, type=REQ, req_id=0, rsp_expected=false, FORK: cluster:clustered, UNICAST3: DATA, seqno=57, conn_id=2, TP: [cluster_name=cluster]
2017-01-12 15:38:28,509 TRACE [org.infinispan.util.concurrent.CommandAckCollector] (HotRodServerHandler-10-15) [Collector#CommandInvocation:server2:41] Waiting for acks asynchronously.
{noformat}
4. Server1 receives the command, but since topology is outdated, sends back an {{ExceptionAckCommand}} to server2
{noformat}
2017-01-12 15:38:28,509 TRACE [org.jgroups.protocols.UNICAST3] (jgroups-4,server1) server1 <-- DATA(server2: #57, conn_id=2)
2017-01-12 15:38:28,510 TRACE [org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (jgroups-4,server1) Lock key=[B0x034b0000003b for owner=CommandInvocation:server2:41. timeout=30000 (MILLISECONDS)
2017-01-12 15:38:28,510 TRACE [org.infinispan.interceptors.TriangleAckInterceptor] (remote-thread--p2-t12) Sending exception ack for command CommandInvocation:server2:41. Originator=server2.
2017-01-12 15:38:28,510 TRACE [org.infinispan.remoting.rpc.RpcManagerImpl] (remote-thread--p2-t12) server1 invoking ExceptionAckCommand{commandInvocationId=CommandInvocation:server2:41, throwable=org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed while the command was executing: expected 2, got 3, topologyId=2} to server2 ordered by NONE
2017-01-12 15:38:28,508 TRACE [org.jgroups.protocols.UNICAST3] (jgroups-17,server1) server1: delivering server2#56
{noformat}
5. Server2 processes the {{ExceptionAckCommand}}
{noformat}
2017-01-12 15:38:28,570 TRACE [org.infinispan.remoting.inboundhandler.NonTotalOrderPerCacheInboundInvocationHandler] (jgroups-3,server2) Calling perform() on ExceptionAckCommand{commandInvocationId=CommandInvocation:server2:41, throwable=org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed while the command was executing: expected 2, got 3, topologyId=2}
2017-01-12 15:38:28,570 TRACE [org.infinispan.util.concurrent.CommandAckCollector] (jgroups-3,server2) [Collector#CommandInvocation:server2:41] completed exceptionally. TopologyId=2 (expected=2): org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed while the command was executing: expected 2, got 3
{noformat}
6. It seems the command is not retried and the server2 returns success to the Hot Rod client
{noformat}
2017-01-12 15:38:28,582 TRACE [org.infinispan.server.hotrod.HotRodEncoder] (HotRodServerWorker-7-1) Encode msg EmptyResponse{version=25, messageId=47, cacheName='', clientIntel=3, operation=PUT_ALL, status=Success, topologyId=1}
{noformat}
7. The client receives a clean response, without any exception, but the key was not actually written
{noformat}
15:38:28,604 TRACE (HotRod-client-async-pool-29:[]) [Codec21] Received response for messageId=47
{noformat}
> PutAll operation in the Hot Rod client only partially completed during topology changes
> ---------------------------------------------------------------------------------------
>
> Key: ISPN-6599
> URL: https://issues.jboss.org/browse/ISPN-6599
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 8.2.1.Final, 9.0.0.Alpha1
> Reporter: Gustavo Fernandes
> Assignee: Dan Berindei
> Fix For: 9.0.0.Alpha2, 8.2.2.Final, 9.0.0.Final
>
> Attachments: reproducer.zip
>
>
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
9 years, 2 months