[JBoss JIRA] (ISPN-2710) Huge amount of OOB threads during performance test
by Robert Stupp (JIRA)
Robert Stupp created ISPN-2710:
----------------------------------
Summary: Huge amount of OOB threads during performance test
Key: ISPN-2710
URL: https://issues.jboss.org/browse/ISPN-2710
Project: Infinispan
Issue Type: Feature Request
Affects Versions: 5.2.0.CR1
Reporter: Robert Stupp
Assignee: Mircea Markus
While running our performance test (as described in ISPN-2240), two of the four servers are running at 80 to 100% CPU, while the others run at just 10%.
Before that phenomenon, a huge number of threads had been created (all named {{OOB-xxxx}}).
The performance test just reads cached data - there was no cache put operation at that time.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
13 years, 2 months
[JBoss JIRA] (ISPN-2709) Lib dir in distribution archive does not contain the proper versions for some dependencies
by Adrian Nistor (JIRA)
Adrian Nistor created ISPN-2709:
-----------------------------------
Summary: Lib dir in distribution archive does not contain the proper versions for some dependencies
Key: ISPN-2709
URL: https://issues.jboss.org/browse/ISPN-2709
Project: Infinispan
Issue Type: Bug
Affects Versions: 5.2.0.CR1
Reporter: Adrian Nistor
Assignee: Adrian Nistor
Fix For: 5.2.0.Final
Not all the jars referenced by the runtime-classpath.txt files of the modules are actually present in the lib dir. In some cases the jar is present but not in the needed version, and some of the jars that are there are not actually used.
This happens because the set of dependencies for runtime-classpath.txt is computed per module, while the lib dir in the distro is created by the assembly plugin after 'merging' the dependencies of all modules, which means that only the highest version of each dependency is included. Also, the Maven dependency plugin is known to miss some dependencies.
To avoid the version problem we could globally define a single version for each of these dependencies in the parent pom's dependencyManagement section and also explicitly declare the dependency in the respective modules.
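As a sketch of that approach, the parent pom could pin each such dependency once in dependencyManagement (the coordinates below are purely illustrative, not a prescribed change):

```xml
<!-- Hypothetical parent-pom fragment: pin one version globally so every
     module (and the distro assembly) resolves the same jar. The jgroups
     coordinates are only an example. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.jgroups</groupId>
      <artifactId>jgroups</artifactId>
      <version>3.2.5.Final</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Modules would then declare the dependency without a version and inherit the managed one.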
[JBoss JIRA] (ISPN-2240) Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
by Robert Stupp (JIRA)
[ https://issues.jboss.org/browse/ISPN-2240?page=com.atlassian.jira.plugin.... ]
Robert Stupp commented on ISPN-2240:
------------------------------------
A quick performance test:
A simple read test shows no measurable benefit or penalty for the vendor's "earthenware" products compared to Infinispan when using simple (non-TX, non-locking) distributed caches.
There is, however, a slight benefit for Infinispan regarding CPU usage.
> Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-2240
> URL: https://issues.jboss.org/browse/ISPN-2240
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.1.6.FINAL, 5.1.x
> Reporter: Robert Stupp
> Assignee: Mircea Markus
> Priority: Critical
> Fix For: 5.2.0.Final
>
> Attachments: ISPN-2240_fix_TimeoutExceptions.patch, somehow.zip
>
>
> Hi,
> I've encountered a lot of TimeoutExceptions just running a load test against an Infinispan cluster.
> I tracked down the reason and found out that the code in org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock() causes these superfluous TimeoutExceptions.
> A small test case is attached (it just prints out timeouts, too-late timeouts and "paints" a lot of dots to the console - more dots per second means better throughput ;-)).
> In a short test I extended the class ReentrantPerEntryLockContainer and changed the implementation of releaseLock() as follows:
> {noformat}
> public void releaseLock(Object lockOwner, Object key) {
>     ReentrantLock l = locks.get(key);
>     if (l != null) {
>         if (!l.isHeldByCurrentThread())
>             throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
>         while (l.isHeldByCurrentThread())
>             unlock(l, lockOwner);
>         if (!l.hasQueuedThreads())
>             locks.remove(key);
>     } else
>         throw new IllegalStateException("No lock for [" + key + ']');
> }
> {noformat}
> The main improvement is that locks are not removed from the concurrent map as long as other threads are waiting on that lock.
> If the lock is removed from the map while other threads are waiting for it, they may run into timeouts and force TimeoutExceptions to the client.
> The above method "paints more dots per second", i.e. it gives better throughput for concurrent accesses to the same key.
> The re-implemented method should also fix some replication timeout exceptions.
> Please, please add this to 5.1.7, if possible.
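For reference, the patched behaviour can be exercised outside Infinispan with a minimal per-key lock container. Class and method names below are illustrative, not the actual Infinispan API:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch of the improved per-key lock container described above.
class PerKeyLocks {
    private final ConcurrentHashMap<Object, ReentrantLock> locks = new ConcurrentHashMap<>();

    void acquire(Object key) {
        // computeIfAbsent keeps a single lock instance per key
        locks.computeIfAbsent(key, k -> new ReentrantLock()).lock();
    }

    void release(Object key) {
        ReentrantLock l = locks.get(key);
        if (l == null)
            throw new IllegalStateException("No lock for [" + key + ']');
        if (!l.isHeldByCurrentThread())
            throw new IllegalStateException("Lock for [" + key + "] not held by current thread");
        while (l.isHeldByCurrentThread())
            l.unlock();
        // Key point of the patch: only remove the lock from the map when no
        // other thread is queued on it, so waiters never end up blocking on
        // a stale lock instance and timing out.
        if (!l.hasQueuedThreads())
            locks.remove(key);
    }

    boolean contains(Object key) {
        return locks.containsKey(key);
    }
}
```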
[JBoss JIRA] (ISPN-2612) Problem broadcasting CH_UPDATE command
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-2612?page=com.atlassian.jira.plugin.... ]
Work on ISPN-2612 started by Dan Berindei.
> Problem broadcasting CH_UPDATE command
> --------------------------------------
>
> Key: ISPN-2612
> URL: https://issues.jboss.org/browse/ISPN-2612
> Project: Infinispan
> Issue Type: Bug
> Components: RPC
> Affects Versions: 5.2.0.Beta5
> Reporter: Michal Linhard
> Assignee: Dan Berindei
> Priority: Blocker
> Fix For: 5.2.0.CR2
>
> Attachments: session-cluster.xml, test.zip
>
>
> Infinispan 5.2.0.Beta5
> JGroups 3.2.4.Final
> Steps to reproduce (I'm using two virtual interfaces, test1 and test2):
> 1. Start org.jboss.qa.jdg.Test with -Djgroups.udp.bind_addr=test1 -Djava.net.preferIPv4Stack=true
> 2. wait 10 sec
> 3. Start org.jboss.qa.jdg.Test with -Djgroups.udp.bind_addr=test2 -Djava.net.preferIPv4Stack=true
> After 5 seconds there should be this timeout exception:
> {code}
> 19:42:14,146 WARN [org.infinispan.topology.CacheTopologyControlCommand] (OOB-2,mlinhard-work-37329) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=___defaultcache, type=REBALANCE_CONFIRM, sender=mlinhard-work-47337, joinInfo=null, topologyId=1, currentCH=null, pendingCH=null, throwable=null, viewId=1}
> java.util.concurrent.ExecutionException: org.infinispan.CacheException: org.jgroups.TimeoutException: TimeoutException
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
> at java.util.concurrent.FutureTask.get(FutureTask.java:91)
> at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:563)
> at org.infinispan.topology.ClusterTopologyManagerImpl.broadcastConsistentHashUpdate(ClusterTopologyManagerImpl.java:349)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleRebalanceCompleted(ClusterTopologyManagerImpl.java:213)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:160)
> at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:137)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:252)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:219)
> at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:483)
> at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390)
> at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248)
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598)
> at org.jgroups.JChannel.up(JChannel.java:703)
> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)
> at org.jgroups.protocols.RSVP.up(RSVP.java:172)
> at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244)
> at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:736)
> at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:414)
> at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:606)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:143)
> at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:187)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288)
> at org.jgroups.protocols.MERGE2.up(MERGE2.java:205)
> at org.jgroups.protocols.Discovery.up(Discovery.java:359)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1287)
> at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1850)
> at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1823)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: TimeoutException
> at org.infinispan.util.Util.rewrapAsCacheException(Util.java:532)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommands(CommandAwareRpcDispatcher.java:152)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:518)
> at org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:545)
> at org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:542)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> ... 3 more
> Caused by: org.jgroups.TimeoutException: TimeoutException
> at org.jgroups.util.Promise._getResultWithTimeout(Promise.java:145)
> at org.jgroups.util.Promise.getResultWithTimeout(Promise.java:40)
> at org.jgroups.util.AckCollector.waitForAllAcks(AckCollector.java:93)
> at org.jgroups.protocols.RSVP$Entry.block(RSVP.java:287)
> at org.jgroups.protocols.RSVP.down(RSVP.java:118)
> at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:1025)
> at org.jgroups.JChannel.down(JChannel.java:718)
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:616)
> at org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:173)
> at org.jgroups.blocks.GroupRequest.sendRequest(GroupRequest.java:360)
> at org.jgroups.blocks.GroupRequest.sendRequest(GroupRequest.java:103)
> at org.jgroups.blocks.Request.execute(Request.java:83)
> at org.jgroups.blocks.MessageDispatcher.cast(MessageDispatcher.java:335)
> at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:249)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processCalls(CommandAwareRpcDispatcher.java:330)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommands(CommandAwareRpcDispatcher.java:145)
> ... 8 more
> {code}
> Analysis:
> These are the messages sent after view change:
> {code}
> test1 test2
> <--- JOIN ----
> ---- REBALANCE_START --->
> <--- StateRequestCommand ----
> ---- StateResponseCommand --->
> <--- REBALANCE_CONFIRM ----
> ---- CH_UPDATE --->
> {code}
> The last CH_UPDATE message is broadcast and test2 successfully processes it, but test1 stays in a waiting state because, for some reason, it also awaits a response from itself - the local variable {{entry}} in the method RSVP.down
> (https://github.com/belaban/JGroups/blob/master/src/org/jgroups/protocols/...)
> contained the local address.
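The hang can be illustrated with a toy ack collector (illustrative names, not the JGroups API): if the sender's own address is wrongly included in the set of members it expects an ack from, the collector never completes, because the sender does not ack itself.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified sketch of the suspected RSVP bug described above.
class RsvpEntrySketch {
    private final Set<String> pendingAcks = new HashSet<>();

    RsvpEntrySketch(Set<String> expectedResponders) {
        pendingAcks.addAll(expectedResponders);
    }

    // a member confirms it processed the broadcast
    void ack(String member) {
        pendingAcks.remove(member);
    }

    // RSVP.down blocks until this becomes true (or times out)
    boolean done() {
        return pendingAcks.isEmpty();
    }
}
```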
[JBoss JIRA] (ISPN-2240) Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
by Robert Stupp (JIRA)
[ https://issues.jboss.org/browse/ISPN-2240?page=com.atlassian.jira.plugin.... ]
Robert Stupp edited comment on ISPN-2240 at 1/14/13 11:24 AM:
--------------------------------------------------------------
A first result from our test:
* works if we do not use the {{Cache.putAll}} method (just a lot of {{Cache.put()}} operations)
* still produces {{TimeoutException}}s if we use {{Cache.putAll}} - but the "original" requests (those originating the {{putAll}}) do not fail:
{noformat}
2013-01-14 17:08:46,224 WARN [OPERATION] (http-threads - 23 - client:10.80.191.33:9180) FatalException:
org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [...] for requestor [Thread[OOB-48,det-servicelayer-as-51-65400,5,Thread Pools]]! Lock held by [Thread[frw-main-executor-47,5,main]]
at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:217) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLockNoCheck(LockManagerImpl.java:200) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.locking.AbstractLockingInterceptor.lockKey(AbstractLockingInterceptor.java:114) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor.visitPutMapCommand(NonTransactionalLockingInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.AbstractVisitor.visitPutMapCommand(AbstractVisitor.java:82) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.statetransfer.StateTransferInterceptor.handleTopologyAffectedCommand(StateTransferInterceptor.java:216) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.statetransfer.StateTransferInterceptor.handleWriteCommand(StateTransferInterceptor.java:194) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.statetransfer.StateTransferInterceptor.visitPutMapCommand(StateTransferInterceptor.java:141) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutMapCommand(CacheMgmtInterceptor.java:113) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:128) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:92) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.AbstractVisitor.visitPutMapCommand(AbstractVisitor.java:82) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:343) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.remote.BaseRpcInvokingCommand.processVisitableCommand(BaseRpcInvokingCommand.java:61) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.remote.SingleRpcCommand.perform(SingleRpcCommand.java:70) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:101) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:122) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:86) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:245) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:218) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:483) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.JChannel.up(JChannel.java:703) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.RSVP.up(RSVP.java:188) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FRAG2.unfragment(FRAG2.java:302) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FRAG2.up(FRAG2.java:162) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FlowControl.up(FlowControl.java:418) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FlowControl.up(FlowControl.java:400) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:736) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:414) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:645) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.MERGE3.up(MERGE3.java:290) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.Discovery.up(Discovery.java:359) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.TP.passMessageUp(TP.java:1287) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1850) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1823) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_31]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_31]
at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_31]
Caused by: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [...] for requestor [Thread[OOB-48,det-servicelayer-as-51-65400,5,Thread Pools]]! Lock held by [Thread[frw-main-executor-47,5,main]]
... 55 more
{noformat}
Configuration is:
* JBoss AS 7.1.1
* Infinispan 5.2.0.CR1 (deployed inside EAR lib)
* 4 servers - each part of the Infinispan cluster
* 4 load test clients, each requesting the same data set, which is split into sets of N entries (N = 1 or 50 or 100, etc)
* Each request is randomly distributed to one of the backend servers
* There is some possibility that 2 or more concurrent requests ask for the same set of entries, which are not yet cached and are then added to the cache concurrently
* Cache instances are obtained from {{DefaultCacheManager.getCache()}}
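The contention pattern in that last bullet can be sketched with plain java.util.concurrent locks (hypothetical names; the short timings stand in for the 10-second Infinispan lock timeout): each putAll-style batch locks every key in its map, so one slow batch can hold a key long enough for another batch's per-key tryLock to give up.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch of two concurrent "putAll" batches racing for the
// same key; not Infinispan code.
class BatchLockSketch {
    // try to lock all keys of a batch, each within timeoutMs
    static boolean tryLockAll(ReentrantLock[] keyLocks, long timeoutMs) {
        for (ReentrantLock l : keyLocks) {
            try {
                if (!l.tryLock(timeoutMs, TimeUnit.MILLISECONDS))
                    return false; // analogous to Infinispan's TimeoutException
            } catch (InterruptedException e) {
                return false;
            }
        }
        return true;
    }

    // Second batch times out while the first batch still holds the shared key.
    static boolean secondBatchTimesOut() {
        ReentrantLock sharedKey = new ReentrantLock();
        Thread firstBatch = new Thread(() -> {
            sharedKey.lock();
            try {
                Thread.sleep(500); // "slow" batch keeps the key locked
            } catch (InterruptedException ignored) {
            } finally {
                sharedKey.unlock();
            }
        });
        try {
            firstBatch.start();
            Thread.sleep(100); // let the first batch grab the key
            boolean acquired = tryLockAll(new ReentrantLock[]{sharedKey}, 100);
            firstBatch.join();
            return !acquired;
        } catch (InterruptedException e) {
            return false;
        }
    }
}
```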
> Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-2240
> URL: https://issues.jboss.org/browse/ISPN-2240
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.1.6.FINAL, 5.1.x
> Reporter: Robert Stupp
> Assignee: Mircea Markus
> Priority: Critical
> Fix For: 5.2.0.Final
>
> Attachments: ISPN-2240_fix_TimeoutExceptions.patch, somehow.zip
>
>
> Hi,
> I've encountered a lot of TimeoutExceptions just running a load test against an Infinispan cluster.
> I tracked down the reason and found out that the code in org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock() causes these superfluous TimeoutExceptions.
> A small test case is attached (it just prints out timeouts, too-late timeouts and "paints" a lot of dots to the console - more dots per second means better throughput ;-)).
> In a short test I extended the class ReentrantPerEntryLockContainer and changed the implementation of releaseLock() as follows:
> {noformat}
> public void releaseLock(Object lockOwner, Object key) {
>     ReentrantLock l = locks.get(key);
>     if (l != null) {
>         if (!l.isHeldByCurrentThread())
>             throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
>         while (l.isHeldByCurrentThread())
>             unlock(l, lockOwner);
>         if (!l.hasQueuedThreads())
>             locks.remove(key);
>     } else
>         throw new IllegalStateException("No lock for [" + key + ']');
> }
> {noformat}
> The main improvement is that locks are not removed from the concurrent map as long as other threads are waiting on that lock.
> If the lock is removed from the map while other threads are waiting for it, they may run into timeouts and force TimeoutExceptions to the client.
> The above method "paints more dots per second", i.e. it gives better throughput for concurrent accesses to the same key.
> The re-implemented method should also fix some replication timeout exceptions.
> Please, please add this to 5.1.7, if possible.
[JBoss JIRA] (ISPN-2240) Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
by Robert Stupp (JIRA)
[ https://issues.jboss.org/browse/ISPN-2240?page=com.atlassian.jira.plugin.... ]
Robert Stupp commented on ISPN-2240:
------------------------------------
A first result from our test:
* works if we do not use the {{Cache.putAll}} method (just a lot of {{Cache.put()}} operations)
* still produces {{TimeoutException}}s if we use {{Cache.putAll}} - but the "original" requests (those originating the {{putAll}}) do not fail:
{noformat}
2013-01-14 17:08:46,224 WARN [OPERATION] (http-threads - 23 - client:10.80.191.33:9180) FatalException:
org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [...] for requestor [Thread[OOB-48,det-servicelayer-as-51-65400,5,Thread Pools]]! Lock held by [Thread[frw-main-executor-47,5,main]]
at org.infinispan.util.concurrent.locks.LockManagerImpl.lock(LockManagerImpl.java:217) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.util.concurrent.locks.LockManagerImpl.acquireLockNoCheck(LockManagerImpl.java:200) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.locking.AbstractLockingInterceptor.lockKey(AbstractLockingInterceptor.java:114) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.locking.NonTransactionalLockingInterceptor.visitPutMapCommand(NonTransactionalLockingInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.AbstractVisitor.visitPutMapCommand(AbstractVisitor.java:82) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.statetransfer.StateTransferInterceptor.handleTopologyAffectedCommand(StateTransferInterceptor.java:216) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.statetransfer.StateTransferInterceptor.handleWriteCommand(StateTransferInterceptor.java:194) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.statetransfer.StateTransferInterceptor.visitPutMapCommand(StateTransferInterceptor.java:141) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.CacheMgmtInterceptor.visitPutMapCommand(CacheMgmtInterceptor.java:113) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:128) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:92) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.AbstractVisitor.visitPutMapCommand(AbstractVisitor.java:82) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.write.PutMapCommand.acceptVisitor(PutMapCommand.java:68) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:343) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.remote.BaseRpcInvokingCommand.processVisitableCommand(BaseRpcInvokingCommand.java:61) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.commands.remote.SingleRpcCommand.perform(SingleRpcCommand.java:70) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:101) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.InboundInvocationHandlerImpl.handleWithWaitForBlocks(InboundInvocationHandlerImpl.java:122) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.InboundInvocationHandlerImpl.handle(InboundInvocationHandlerImpl.java:86) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:245) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:218) [infinispan-core-5.2.0.CR1.jar:5.2.0.CR1]
at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:483) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.JChannel.up(JChannel.java:703) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.RSVP.up(RSVP.java:188) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FRAG2.unfragment(FRAG2.java:302) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FRAG2.up(FRAG2.java:162) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FlowControl.up(FlowControl.java:418) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FlowControl.up(FlowControl.java:400) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:736) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:414) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:645) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.MERGE3.up(MERGE3.java:290) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.Discovery.up(Discovery.java:359) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.TP.passMessageUp(TP.java:1287) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1850) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1823) [jgroups-3.2.5.Final.jar:3.2.5.Final]
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_31]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_31]
at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_31]
Caused by: org.infinispan.util.concurrent.TimeoutException: Unable to acquire lock after [10 seconds] on key [...] for requestor [Thread[OOB-48,det-servicelayer-as-51-65400,5,Thread Pools]]! Lock held by [Thread[frw-main-executor-47,5,main]]
... 55 more
{noformat}
Configuration is:
* JBoss AS 7.1.1
* Infinispan 5.2.0.CR1 (deployed inside EAR lib)
* 4 servers - each part of the infinispan cluster
* 4 load test clients, each requesting the same data set, which is split into sets of N entries (N = 1, 50, 100, etc.)
* Each request is randomly distributed to one of the backend servers
* There is some possibility that 2 or more concurrent requests ask for the same set of entries, which are not yet cached and are then added to the cache concurrently
* Cache instances are obtained via {{DefaultCacheManager.getCache()}}
> Per-key lock container leads to superfluous TimeoutExceptions on concurrent access to same key
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-2240
> URL: https://issues.jboss.org/browse/ISPN-2240
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.1.6.FINAL, 5.1.x
> Reporter: Robert Stupp
> Assignee: Mircea Markus
> Priority: Critical
> Fix For: 5.2.0.Final
>
> Attachments: ISPN-2240_fix_TimeoutExceptions.patch, somehow.zip
>
>
> Hi,
> I've encountered a lot of TimeoutExceptions just running a load test against an infinispan cluster.
> I tracked down the reason and found that the code in org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock() causes these superfluous TimeoutExceptions.
> A small test case is attached; it just prints out timeouts and too-late timeouts, and "paints" a lot of dots to the console (more dots per second on the console means better throughput ;-).
> In a short test I extended the class ReentrantPerEntryLockContainer and changed the implementation of releaseLock() as follows:
> {noformat}
> public void releaseLock(Object lockOwner, Object key) {
>     ReentrantLock l = locks.get(key);
>     if (l != null) {
>         if (!l.isHeldByCurrentThread())
>             throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
>         while (l.isHeldByCurrentThread())
>             unlock(l, lockOwner);
>         if (!l.hasQueuedThreads())
>             locks.remove(key);
>     }
>     else
>         throw new IllegalStateException("No lock for [" + key + ']');
> }
> {noformat}
> The main improvement is that a lock is not removed from the concurrent map as long as other threads are waiting on it.
> If the lock is removed from the map while other threads are waiting for it, they may run into timeouts and force TimeoutExceptions onto the client.
> The above method "paints more dots per second", i.e. it gives better throughput for concurrent accesses to the same key.
> The re-implemented method should also fix some replication timeout exceptions.
> Please, please add this to 5.1.7, if possible.
[JBoss JIRA] (ISPN-2578) Two PrepareCommands in parallel cause ConcurrentModificationException
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-2578?page=com.atlassian.jira.plugin.... ]
Adrian Nistor commented on ISPN-2578:
-------------------------------------
This issue is now possible because of NBST command forwarding. Previously it was impossible to have two commands for the same TX executing in parallel, so thread safety was not taken into consideration. This now leads to many issues, and the one described here is just one instance, so maybe we should try to solve the general case: avoid two TX commands for the same global tx id entering the interceptor chain simultaneously, possibly by synchronizing on the cache tx object prior to this. Or maybe just take the simple route of using synchronized collections for the internals of the tx object? Not sure which one is best, but I would certainly not apply a local fix for this ConcurrentModificationException.
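The first option above (serializing all commands for the same global tx id before they enter the interceptor chain) could be sketched, hypothetically and independently of Infinispan's actual classes, as a small helper that takes a per-transaction monitor around command execution; all names here are illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of the "general case" fix: commands for the same global
// transaction id are serialized by synchronizing on a per-tx monitor before
// running. Class and method names are illustrative, not Infinispan API.
class PerTxSerializer {
    private final ConcurrentHashMap<String, Object> monitors = new ConcurrentHashMap<>();

    <T> T invokeSerially(String globalTxId, Supplier<T> command) {
        Object monitor = monitors.computeIfAbsent(globalTxId, id -> new Object());
        synchronized (monitor) {
            // Two forwarded commands for the same tx can no longer interleave here.
            return command.get();
        }
    }

    void txCompleted(String globalTxId) {
        monitors.remove(globalTxId); // drop the monitor once the tx finishes
    }
}
```

Under such a scheme the two forwarded prepares (and the resulting rollbacks) for the same tx would both pass through invokeSerially, so one command's iteration over the tx's locked keys could never race with another command's removal of them.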
> Two PrepareCommands in parallel cause ConcurrentModificationException
> ---------------------------------------------------------------------
>
> Key: ISPN-2578
> URL: https://issues.jboss.org/browse/ISPN-2578
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.0.Beta5
> Reporter: Radim Vansa
> Assignee: Mircea Markus
> Priority: Blocker
> Fix For: 5.2.0.CR2
>
>
> Situation:
> 1) Node A broadcasts PrepareCommand to nodes B, C
> 2) Node A leaves cluster, causing new topology to be installed
> 3) The command arrives at B and C with a lower topology than the current one
> 4) Both B and C forward the command to node D
> 5) D executes the two commands in parallel and finds out that A has left, therefore executing RollbackCommand
> In {{AbstractTxLockingInterceptor.visitRollbackCommand}} we call {{LockManagerImpl.unlockAll}} which iterates over the keys and unlocks them. As these two prepares aren't synchronized over the {{lockedKeys}} set, one may unlock and remove these keys while the other is iterating through them, causing {{ConcurrentModificationException}}.
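The failure mode described above can be reproduced in miniature: a plain HashSet's fail-fast iterator throws ConcurrentModificationException as soon as the set is structurally modified mid-iteration. In the bug the iterator and the remover are two parallel rollback threads; a single thread is enough to show the mechanism (class and method names below are illustrative only):

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;

// Single-threaded illustration of the race described above: removing entries
// from an unsynchronized Set while it is being iterated trips the fail-fast
// iterator and throws ConcurrentModificationException.
class LockedKeysCmeDemo {
    static boolean triggerCme() {
        Set<String> lockedKeys = new HashSet<>();
        lockedKeys.add("k1");
        lockedKeys.add("k2");
        try {
            for (String k : lockedKeys) {
                lockedKeys.remove(k); // unlockAll-style removal during iteration
            }
        } catch (ConcurrentModificationException e) {
            return true; // the fail-fast iterator detected the modification
        }
        return false;
    }
}
```

With two parallel rollbacks the same structural modification happens from another thread, which is exactly why either serializing the commands or using a synchronized/concurrent collection for the locked-key set avoids the exception.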