[JBoss JIRA] (ISPN-8889) Data race in NonTxInvocationContext
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-8889?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-8889:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/6501
> Data race in NonTxInvocationContext
> -----------------------------------
>
> Key: ISPN-8889
> URL: https://issues.jboss.org/browse/ISPN-8889
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.1.6.Final, 9.2.0.CR2
> Environment: Java 8 (Oracle JDK 8), Solaris SPARC
> Reporter: Peter Levart
> Assignee: Dan Berindei
> Priority: Major
> Labels: data-race
> Attachments: DataRacer.java, DataRacer.java
>
>
> Got the following exceptions starting up an Infinispan node while joining the cluster:
> {noformat}
> 17:10:59.012 [remote-thread--p2-t8] ERROR org.infinispan.interceptors.impl.InvocationContextInterceptor - ISPN000136: Error executing command PutMapCommand, writing keys [11906696627, 11906696626, 11906696625, 11906696624, 11906696631, 11906696630, 11906696629, 11906696628...<9992 other elements>]
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
> at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1832) ~[?:1.8.0_162]
> at java.util.HashMap$TreeNode.treeify(HashMap.java:1949) ~[?:1.8.0_162]
> at java.util.HashMap$TreeNode.split(HashMap.java:2175) ~[?:1.8.0_162]
> at java.util.HashMap.resize(HashMap.java:714) ~[?:1.8.0_162]
> at java.util.HashMap.putVal(HashMap.java:663) ~[?:1.8.0_162]
> at java.util.HashMap.put(HashMap.java:612) ~[?:1.8.0_162]
> at org.infinispan.context.impl.NonTxInvocationContext.putLookedUpEntry(NonTxInvocationContext.java:48) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.container.EntryFactoryImpl.wrapExternalEntry(EntryFactoryImpl.java:143) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.wrapRemoteEntry(BaseDistributionInterceptor.java:222) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.lambda$remoteGet$1(BaseDistributionInterceptor.java:192) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656) ~[?:1.8.0_162]
> at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632) ~[?:1.8.0_162]
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_162]
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) ~[?:1.8.0_162]
> at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:66) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onResponse(StaggeredRequest.java:50) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:53) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1302) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1205) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$200(JGroupsTransport.java:123) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.receive(JGroupsTransport.java:1340) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.jgroups.JChannel.up(JChannel.java:819) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:893) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FRAG3.up(FRAG3.java:171) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:343) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:343) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:864) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1002) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:728) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:383) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:600) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:119) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:199) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:252) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.MERGE3.up(MERGE3.java:276) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.Discovery.up(Discovery.java:267) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1248) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_162]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_162]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
> {noformat}
> Immediately suspected that plain HashMap is being modified from multiple threads without proper synchronization. But since I'm new to Infinispan, I can't know whether the usage of plain HashMap in NonTxInvocationContext is intentional and Infinispan code is relying on some external synchronization to be performed by other parts of system that calls NonTxInvocationContext.[lookupEntry|removeLookedUpEntry|putLookedUpEntry|getLookedUpEntries]. In any way, if this external synchronization is taking place, it has a bug and allows calling these methods concurrently without proper synchronization.
> I did some experiments to prove that and also to obtain stack traces of at least two of involved threads, but trivial things that I tried instrumenting those methods all involved some kind of synchronization and the symptom disappeared. I had to be extra smart to obtain the stack traces of two threads without inter-thread synchronization so that I could detect the data race. Here's how I instrumented NonTxInvocationContext code:
> {noformat}
> Index: core/src/main/java/org/infinispan/context/impl/NonTxInvocationContext.java
> IDEA additional info:
> Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
> <+>UTF-8
> ===================================================================
> --- core/src/main/java/org/infinispan/context/impl/NonTxInvocationContext.java (revision 6014e751c20daba1f00e23168281e02afb209905)
> +++ core/src/main/java/org/infinispan/context/impl/NonTxInvocationContext.java (date 1519744397000)
> @@ -22,6 +22,7 @@
> private Set<Object> lockedKeys;
> private Object lockOwner;
>
> + private final DataRacer dataRacer = new DataRacer();
>
> public NonTxInvocationContext(int numEntries, Address origin) {
> super(origin);
> @@ -35,22 +36,26 @@
>
> @Override
> public CacheEntry lookupEntry(Object k) {
> + dataRacer.detect();
> return lookedUpEntries.get(k);
> }
>
> @Override
> public void removeLookedUpEntry(Object key) {
> + dataRacer.detectAndWrite();
> lookedUpEntries.remove(key);
> }
>
> @Override
> public void putLookedUpEntry(Object key, CacheEntry e) {
> + dataRacer.detectAndWrite();
> lookedUpEntries.put(key, e);
> }
>
> @Override
> @SuppressWarnings("unchecked")
> public Map<Object, CacheEntry> getLookedUpEntries() {
> + dataRacer.detect();
> return (Map<Object, CacheEntry>)
> (lookedUpEntries == null ?
> Collections.emptyMap() : lookedUpEntries);
> {noformat}
> Attached, you will find the DataRacer source. While running with such patched Infinispan, I got some of the following:
> {noformat}
> java.lang.IllegalThreadStateException: Thread[remote-thread--p2-t4,5,main]: data race detected with writer of state: 271f7814a414 - see suppressed for writer thread stack trace
> at org.infinispan.context.impl.DataRacer.detect(DataRacer.java:37) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.context.impl.NonTxInvocationContext.lookupEntry(NonTxInvocationContext.java:39) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.distribution.NonTxDistributionInterceptor.addRemoteGet(NonTxDistributionInterceptor.java:405) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.distribution.NonTxDistributionInterceptor.handleRemoteReadWriteManyCommand(NonTxDistributionInterceptor.java:385) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.distribution.NonTxDistributionInterceptor.handleReadWriteManyCommand(NonTxDistributionInterceptor.java:303) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.distribution.NonTxDistributionInterceptor.visitPutMapCommand(NonTxDistributionInterceptor.java:143) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:80) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.BaseAsyncInterceptor.invokeNextThenAccept(BaseAsyncInterceptor.java:98) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.impl.EntryWrappingInterceptor.setSkipRemoteGetsAndInvokeNextForManyEntriesCommand(EntryWrappingInterceptor.java:614) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.impl.EntryWrappingInterceptor.visitPutMapCommand(EntryWrappingInterceptor.java:385) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:80) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.BaseAsyncInterceptor.invokeNext(BaseAsyncInterceptor.java:54) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor.handleWriteManyCommand(NonTransactionalLockingInterceptor.java:53) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.locking.AbstractLockingInterceptor.visitPutMapCommand(AbstractLockingInterceptor.java:179) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:80) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.BaseAsyncInterceptor.invokeNext(BaseAsyncInterceptor.java:54) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.statetransfer.StateTransferInterceptor.handleNonTxWriteCommand(StateTransferInterceptor.java:306) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.statetransfer.StateTransferInterceptor.handleWriteCommand(StateTransferInterceptor.java:252) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.statetransfer.StateTransferInterceptor.visitPutMapCommand(StateTransferInterceptor.java:102) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:80) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.BaseAsyncInterceptor.invokeNext(BaseAsyncInterceptor.java:54) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.impl.CacheMgmtInterceptor.visitPutMapCommand(CacheMgmtInterceptor.java:154) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:80) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.BaseAsyncInterceptor.invokeNextAndExceptionally(BaseAsyncInterceptor.java:123) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.impl.InvocationContextInterceptor.visitCommand(InvocationContextInterceptor.java:90) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.BaseAsyncInterceptor.invokeNext(BaseAsyncInterceptor.java:56) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.DDAsyncInterceptor.handleDefault(DDAsyncInterceptor.java:54) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.DDAsyncInterceptor.visitPutMapCommand(DDAsyncInterceptor.java:90) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:80) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.DDAsyncInterceptor.visitCommand(DDAsyncInterceptor.java:50) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invokeAsync(AsyncInterceptorChainImpl.java:234) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.commands.remote.BaseRpcInvokingCommand.processVisitableCommandAsync(BaseRpcInvokingCommand.java:63) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.commands.remote.SingleRpcCommand.invokeAsync(SingleRpcCommand.java:57) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.inboundhandler.BasePerCacheInboundInvocationHandler.invokeCommand(BasePerCacheInboundInvocationHandler.java:94) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.invoke(BaseBlockingRunnable.java:99) [infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.runAsync(BaseBlockingRunnable.java:71) [infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.run(BaseBlockingRunnable.java:40) [infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_162]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_162]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
> Suppressed: org.infinispan.context.impl.DataRacer$StackTrace: Thread[jgroups-13,tbd2-40757,5,main], state: 271f7814a414
> at org.infinispan.context.impl.DataRacer.write(DataRacer.java:59) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.context.impl.DataRacer.detectAndWrite(DataRacer.java:49) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.context.impl.NonTxInvocationContext.putLookedUpEntry(NonTxInvocationContext.java:51) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.container.EntryFactoryImpl.wrapExternalEntry(EntryFactoryImpl.java:143) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.wrapRemoteEntry(BaseDistributionInterceptor.java:222) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.lambda$remoteGet$1(BaseDistributionInterceptor.java:192) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656) ~[?:1.8.0_162]
> at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632) ~[?:1.8.0_162]
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[?:1.8.0_162]
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) ~[?:1.8.0_162]
> at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:66) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.impl.MultiTargetRequest.onResponse(MultiTargetRequest.java:102) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.StaggeredRequest.onResponse(StaggeredRequest.java:50) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:53) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1302) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1205) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$200(JGroupsTransport.java:123) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.receive(JGroupsTransport.java:1340) ~[infinispan-core-9.2.0-SNAPSHOT.jar:9.2.0-SNAPSHOT]
> at org.jgroups.JChannel.up(JChannel.java:819) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:893) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FRAG3.up(FRAG3.java:171) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:343) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:343) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:864) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1002) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:728) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:383) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:600) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:119) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:199) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:252) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.MERGE3.up(MERGE3.java:276) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.Discovery.up(Discovery.java:267) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1248) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87) ~[jgroups-4.0.10.Final.jar:4.0.10.Final]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_162]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_162]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
> {noformat}
> So here you have it. Either using plain HashMap in NonTxInvocationContext is wrong and ConcurrentHashMap should be used instead or some external synchronization has a bug and is inadequate. I hope this helps to fix it.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-9817) OOM Error on ExposedByteArrayOutputStream
by Rakesh Vende (Jira)
Rakesh Vende created ISPN-9817:
----------------------------------
Summary: OOM Error on ExposedByteArrayOutputStream
Key: ISPN-9817
URL: https://issues.jboss.org/browse/ISPN-9817
Project: Infinispan
Issue Type: Bug
Affects Versions: 7.2.4.Final
Reporter: Rakesh Vende
Fix For: 7.2.4.Final
Attachments: 11.jpg
Titile - OOM Error on ExposedByteArrayOutputStream
Data -
1. Replication Mode is Async
2. queue-size="500"
3. queue-flush-interval="10000"
Details -
1. Application threads frequently calling put method on replicated cache results in calling flush method of ReplicationQueueImpl.java
2. This cause application threads to wait for every 500th put call to complete the cache replication from the queue
3. This becomes kind of sync replication which blocks application threads.
4. To avoid this situation, we can increase the queue size large enough, which, apparently, does not have any side effect as queue is linked blocking queue and application threads will only get blocked when queue becomes full.
5. However this puts pressure on aysnc queue, which has to replicate entire queue at once.
_replicationQueue-thread--p4-t1 tid=119 [RUNNABLE] [DAEMON] <--- OutOfMemoryError happened in this thread
java.lang.OutOfMemoryError.<init>() OutOfMemoryError.java:48
org.infinispan.commons.io.ExposedByteArrayOutputStream.write(byte[], int, int) ExposedByteArrayOutputStream.java:71
_
6. This out of memory happens when JVM fails to allocate continuations chunk of memory in the form of array of 1 or 2 GB
Summary - If we set queue size to normal or low level, application threads result in calling flush which turns out to be sync replication which blocks other application threads. And, if I increase the queue size to maximum enough so as to avoid sync flush then replication queue throws OOM
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-9745) Release fails because compatibility bundle modules don't have source jars
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-9745?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-9745:
-------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 10.0.0.Alpha1
9.4.2.Final
(was: 10.0.0.Beta1)
(was: 9.4.5.Final)
Resolution: Done
> Release fails because compatibility bundle modules don't have source jars
> -------------------------------------------------------------------------
>
> Key: ISPN-9745
> URL: https://issues.jboss.org/browse/ISPN-9745
> Project: Infinispan
> Issue Type: Bug
> Components: Build
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 10.0.0.Alpha1, 9.4.2.Final
>
>
> Modules {{client/marshaller/protostuff/protostuff-marshaller-compatibility-bundle}} and {{client/marshaller/protostuff/protostuff-marshaller-compatibility-bundle}} are uber-jars and don't have any sources.
> Prior to the ISPN-9697 changes, a {{DEPENDENCIES.txt}} and {{LICENSES.txt}} was generated in the source directory, and a test-sources jar was created. But after those files were removed, a test-sources jar is no longer generated, and Nexus rejects the artifacts:
> {noformat}
> 11:19:37.492 [ERROR]
> 11:19:37.492 [ERROR] Nexus Staging Rules Failure Report
> 11:19:37.492 [ERROR] ==================================
> 11:19:37.492 [ERROR]
> 11:19:37.493 [ERROR] Repository "jboss_releases_staging_profile-14551" failures
> 11:19:37.493 [ERROR] Rule "sources-staging" failures
> 11:19:37.493 [ERROR] * Missing: no sources jar found in folder '/org/infinispan/infinispan-marshaller-kryo-bundle/9.4.2.Final'
> 11:19:37.493 [ERROR] * Missing: no sources jar found in folder '/org/infinispan/infinispan-marshaller-protostuff-bundle/9.4.2.Final'
> {noformat}
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-9816) Handle non segmented container/store for publisher more efficiently
by William Burns (Jira)
[ https://issues.jboss.org/browse/ISPN-9816?page=com.atlassian.jira.plugin.... ]
William Burns updated ISPN-9816:
--------------------------------
Fix Version/s: 10.0.0.Final
> Handle non segmented container/store for publisher more efficiently
> -------------------------------------------------------------------
>
> Key: ISPN-9816
> URL: https://issues.jboss.org/browse/ISPN-9816
> Project: Infinispan
> Issue Type: Sub-task
> Components: Publisher
> Reporter: William Burns
> Assignee: William Burns
> Priority: Major
> Fix For: 10.0.0.Final
>
>
> The new Publisher is designed to take into account segmented data container and segmented stores. However if a store/data container is not segmented, the handling can cause performance issues, although it would still behave better memory and rehash based. The tradeoff is probably fine for data container, however stores performance drop would be massive. We need to process all segments in at least the non segmented store case to retain our old performance.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-9816) Handle non segmented container/store for publisher more efficiently
by William Burns (Jira)
William Burns created ISPN-9816:
-----------------------------------
Summary: Handle non segmented container/store for publisher more efficiently
Key: ISPN-9816
URL: https://issues.jboss.org/browse/ISPN-9816
Project: Infinispan
Issue Type: Sub-task
Components: Publisher
Reporter: William Burns
Assignee: William Burns
The new Publisher is designed to take into account segmented data container and segmented stores. However if a store/data container is not segmented, the handling can cause performance issues, although it would still behave better memory and rehash based. The tradeoff is probably fine for data container, however stores performance drop would be massive. We need to process all segments in at least the non segmented store case to retain our old performance.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months