[jboss-jira] [JBoss JIRA] (WFLY-6877) Fine session attributes (i.e. replication-granularity = ATTRIBUTE) cause org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds

Sammy Chu (JIRA) issues at jboss.org
Sun Jul 24 23:35:00 EDT 2016


     [ https://issues.jboss.org/browse/WFLY-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammy Chu updated WFLY-6877:
----------------------------
    Description: 
Our system using the following infinispan configuration for web session cache (i.e. an async, non-transactional, batch, optimistic locking dustrubutable cache):
{code:xml}
<cache-container name="web" default-cache="dist" module="org.wildfly.clustering.web.infinispan">
	<transport lock-timeout="60000"/>
	<distributed-cache name="dist" mode="ASYNC" l1-lifespan="0" owners="2">
		<locking isolation="REPEATABLE_READ"/>
		<transaction locking="OPTIMISTIC" mode="BATCH"/>
		<file-store/>
	</distributed-cache>
</cache-container>
{code}

When we perform a load test for one of our business use case in our system, Wildfly always throw the infamous locking for 15 seconds timeout exception:
{noformat}
2016-07-21 13:01:52,975 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (default task-33) {batch38003 at e.com} ISPN000136: Error executing command PrepareCommand, writing keys [4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY]: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->com.ecv.component.profile.User and requestor GlobalTransaction:<10.1.3.29>:492:local. Lock is held by GlobalTransaction:<10.1.3.29>:458:local
	at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.lock(DefaultLockManager.java:236)
	at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$CompositeLockPromise.lock(DefaultLockManager.java:305)
	at org.infinispan.interceptors.locking.AbstractLockingInterceptor.lockAllAndRecord(AbstractLockingInterceptor.java:199)
	at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.checkPendingAndLockAllKeys(AbstractTxLockingInterceptor.java:199)
	at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAllOrRegisterBackupLock(AbstractTxLockingInterceptor.java:165)
	at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.visitPrepareCommand(OptimisticLockingInterceptor.java:69)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.interceptors.TxInterceptor.invokeNextInterceptorAndVerifyTransaction(TxInterceptor.java:157)
	at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:144)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:113)
	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
	at org.infinispan.statetransfer.TransactionSynchronizerInterceptor.visitPrepareCommand(TransactionSynchronizerInterceptor.java:42)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.statetransfer.StateTransferInterceptor.handleTxCommand(StateTransferInterceptor.java:238)
	at org.infinispan.statetransfer.StateTransferInterceptor.visitPrepareCommand(StateTransferInterceptor.java:87)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:107)
	at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:76)
	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:113)
	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:336)
	at org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:157)
	at org.infinispan.transaction.xa.TransactionXaAdapter.commit(TransactionXaAdapter.java:114)
	at org.infinispan.transaction.tm.DummyTransaction.finishResource(DummyTransaction.java:389)
	at org.infinispan.transaction.tm.DummyTransaction.commitResources(DummyTransaction.java:435)
	at org.infinispan.transaction.tm.DummyTransaction.runCommit(DummyTransaction.java:314)
	at org.infinispan.transaction.tm.DummyTransaction.commit(DummyTransaction.java:105)
	at org.infinispan.transaction.tm.DummyBaseTransactionManager.commit(DummyBaseTransactionManager.java:73)
	at org.wildfly.clustering.ee.infinispan.ActiveTransactionBatch.close(ActiveTransactionBatch.java:48)
	at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:77)
	at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:768)
	at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:557)
	at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:331)
	at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:263)
	at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
	at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:174)
	at io.undertow.server.Connectors.executeRootHandler(Connectors.java:202)
	at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:793)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
{noformat}

After tracing using the trace log, we found the time difference between a "Lock all keys" log statement and the "Release locks for keys" log statement takes for about 15~20 seconds, so if another concurrent request would like to lock the same key within that interval, it will result in the timeout exception, example as below:
{noformat}
2016-07-21 13:01:37,958 TRACE [org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (default task-37) Lock all keys=[4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl] for owner=GlobalTransaction:<10.1.3.29>:458:local. timeout=15000 (MILLISECONDS)
2016-07-21 13:01:53,364 TRACE [org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (default task-37) Release locks for keys=[4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl]. owner=GlobalTransaction:<10.1.3.29>:458:local
{noformat}

We think that is strange as those session keys are POJO without complex nested objects, and we only perform read operation on those objects. Although we know objects without marked *@org.wildfly.clustering.web.annotation.Immutable* will perform cache sync, we believe that should not took for more than 1 second to commit the cache but it takes for more than 15 seconds...

After future tracing, we found the following log message on another cluster node:
{noformat}
2016-07-21 13:01:53,410 TRACE [org.infinispan.commands.tx.PrepareCommand] (Incoming-10,ee,10.1.3.30) {} Invoking remotely originated prepare: PrepareCommand {modifications=[PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, value=[B at f9f09c2, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, value=[B at f72eca9, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, value=[B at 6f108f88, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl, value=org.wildfly.clustering.web.infinispan.session.SimpleSessionAccessMetaData at 126bb5d4, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, value=[B at 1a574fb5, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, value=[B at 3141bcb, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ.menuName, value=[B at ae3a302, flags=[FORCE_SYNCHRONOUS], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, ...], onePhaseCommit=true, retried=false, gtx=GlobalTransaction:<10.1.3.29>:458:remote, cacheName='app.war', topologyId=2} with invocation context: org.infinispan.context.impl.RemoteTxInvocationContext at 3d21e5e9
{noformat}
In general, it synchronized XXX key {color:red}2{color} times, YYY key {color:red}2272{color} times, ZZZ key {color:red}3{color} times!!!

The root cause should be this line: https://github.com/wildfly/wildfly/blob/10.0.0.Final/clustering/web/infinispan/src/main/java/org/wildfly/clustering/web/infinispan/session/fine/FineSessionAttributes.java#L76 where a new CacheEntryMutator.mutate() will trigger a new PutKeyValueCommand, but that is not necessary for transactional/ batched cache in general as the MarshalledValue is lazily serialized, the actual value is pointing to the same instance.

Attached the suggested change, which solved our issue. Basically use a Set to track the mutated attribute names, to ensure the command is only issued once. Tested on non-transactional, batched cache (non-transactional, non-batch cache may not work as I did not think about that...)

  was:
Our system using the following infinispan configuration for web session cache (i.e. an async, non-transactional, batch, optimistic locking dustrubutable cache):
{code:xml}
<cache-container name="web" default-cache="dist" module="org.wildfly.clustering.web.infinispan">
	<transport lock-timeout="60000"/>
	<distributed-cache name="dist" mode="ASYNC" l1-lifespan="0" owners="2">
		<locking isolation="REPEATABLE_READ"/>
		<transaction locking="OPTIMISTIC" mode="BATCH"/>
		<file-store/>
	</distributed-cache>
</cache-container>
{code}

When we perform a load test for one of our business use case in our system, Wildfly always throw the infamous locking for 15 seconds timeout exception:
{noformat}
2016-07-14 16:44:11,585 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (default task-45) ISPN000136: Error executing command PrepareCommand, writing keys [UN3iNlvQSwNtgsNO83cjL4hy8OpV9PYAT_PuUFml->XXX, UN3iNlvQSwNtgsNO83cjL4hy8OpV9PYAT_PuUFml->YYY, UN3iNlvQSwNtgsNO83cjL4hy8OpV9PYAT_PuUFml->ZZZ]: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key UN3iNlvQSwNtgsNO83cjL4hy8OpV9PYAT_PuUFml->ZZZ and requestor GlobalTransaction:<10.1.3.29>:22896:local. Lock is held by GlobalTransaction:<10.1.3.29>:21594:local
	at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.lock(DefaultLockManager.java:236)
	at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$CompositeLockPromise.lock(DefaultLockManager.java:305)
	at org.infinispan.interceptors.locking.AbstractLockingInterceptor.lockAllAndRecord(AbstractLockingInterceptor.java:199)
	at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.checkPendingAndLockAllKeys(AbstractTxLockingInterceptor.java:199)
	at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAllOrRegisterBackupLock(AbstractTxLockingInterceptor.java:165)
	at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.visitPrepareCommand(OptimisticLockingInterceptor.java:69)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.interceptors.TxInterceptor.invokeNextInterceptorAndVerifyTransaction(TxInterceptor.java:157)
	at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:144)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:113)
	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
	at org.infinispan.statetransfer.TransactionSynchronizerInterceptor.visitPrepareCommand(TransactionSynchronizerInterceptor.java:42)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.statetransfer.StateTransferInterceptor.handleTxCommand(StateTransferInterceptor.java:238)
	at org.infinispan.statetransfer.StateTransferInterceptor.visitPrepareCommand(StateTransferInterceptor.java:87)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:107)
	at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:76)
	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
	at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:113)
	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
	at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:336)
	at org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:157)
	at org.infinispan.transaction.xa.TransactionXaAdapter.commit(TransactionXaAdapter.java:114)
	at org.infinispan.transaction.tm.DummyTransaction.finishResource(DummyTransaction.java:389)
	at org.infinispan.transaction.tm.DummyTransaction.commitResources(DummyTransaction.java:435)
	at org.infinispan.transaction.tm.DummyTransaction.runCommit(DummyTransaction.java:314)
	at org.infinispan.transaction.tm.DummyTransaction.commit(DummyTransaction.java:105)
	at org.infinispan.transaction.tm.DummyBaseTransactionManager.commit(DummyBaseTransactionManager.java:73)
	at org.wildfly.clustering.ee.infinispan.ActiveTransactionBatch.close(ActiveTransactionBatch.java:48)
	at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:77)
	at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:768)
	at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:557)
	at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:331)
	at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:263)
	at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
	at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:174)
	at io.undertow.server.Connectors.executeRootHandler(Connectors.java:202)
	at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:793)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
{noformat}

After tracing using the trace log, we found the time difference between a "Lock all keys" log statement and the "Release locks for keys" log statement takes for about 15~20 seconds, so if another concurrent request would like to lock the same key within that interval, it will result in the timeout exception, example as below:
{noformat}
2016-07-21 13:01:37,958 TRACE [org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (default task-37) Lock all keys=[4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl] for owner=GlobalTransaction:<10.1.3.29>:458:local. timeout=15000 (MILLISECONDS)
2016-07-21 13:01:53,364 TRACE [org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (default task-37) Release locks for keys=[4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl]. owner=GlobalTransaction:<10.1.3.29>:458:local
{noformat}

We think that is strange as those session keys are POJO without complex nested objects, and we only perform read operation on those objects. Although we know objects without marked *@org.wildfly.clustering.web.annotation.Immutable* will perform cache sync, we believe that should not took for more than 1 second to commit the cache but it takes for more than 15 seconds...

After future tracing, we found the following log message on another cluster node:
{noformat}
2016-07-21 13:01:53,410 TRACE [org.infinispan.commands.tx.PrepareCommand] (Incoming-10,ee,10.1.3.30) {} Invoking remotely originated prepare: PrepareCommand {modifications=[PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, value=[B at f9f09c2, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, value=[B at f72eca9, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, value=[B at 6f108f88, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl, value=org.wildfly.clustering.web.infinispan.session.SimpleSessionAccessMetaData at 126bb5d4, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, value=[B at 1a574fb5, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, value=[B at 3141bcb, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ.menuName, value=[B at ae3a302, flags=[FORCE_SYNCHRONOUS], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, ...], onePhaseCommit=true, retried=false, gtx=GlobalTransaction:<10.1.3.29>:458:remote, cacheName='app.war', topologyId=2} with invocation context: org.infinispan.context.impl.RemoteTxInvocationContext at 3d21e5e9
{noformat}
In general, it synchronized XXX key {color:red}2{color} times, YYY key {color:red}2272{color} times, ZZZ key {color:red}3{color} times!!!

The root cause should be this line: https://github.com/wildfly/wildfly/blob/10.0.0.Final/clustering/web/infinispan/src/main/java/org/wildfly/clustering/web/infinispan/session/fine/FineSessionAttributes.java#L76 where a new CacheEntryMutator.mutate() will trigger a new PutKeyValueCommand, but that is not necessary for transactional/ batched cache in general as the MarshalledValue is lazily serialized, the actual value is pointing to the same instance.

Attached the suggested change, which solved our issue. Basically use a Set to track the mutated attribute names, to ensure the command is only issued once. Tested on non-transactional, batched cache (non-transactional, non-batch cache may not work as I did not think about that...)



> Fine session attributes (i.e. replication-granularity = ATTRIBUTE) cause org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: WFLY-6877
>                 URL: https://issues.jboss.org/browse/WFLY-6877
>             Project: WildFly
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 10.0.0.Final
>         Environment: Wildfly 10.0.0-Final
> JDK 1.8.0_66-b17
> Linux
>            Reporter: Sammy Chu
>            Assignee: Paul Ferraro
>            Priority: Critical
>         Attachments: FineSessionAttributes.java
>
>
> Our system using the following infinispan configuration for web session cache (i.e. an async, non-transactional, batch, optimistic locking dustrubutable cache):
> {code:xml}
> <cache-container name="web" default-cache="dist" module="org.wildfly.clustering.web.infinispan">
> 	<transport lock-timeout="60000"/>
> 	<distributed-cache name="dist" mode="ASYNC" l1-lifespan="0" owners="2">
> 		<locking isolation="REPEATABLE_READ"/>
> 		<transaction locking="OPTIMISTIC" mode="BATCH"/>
> 		<file-store/>
> 	</distributed-cache>
> </cache-container>
> {code}
> When we perform a load test for one of our business use case in our system, Wildfly always throw the infamous locking for 15 seconds timeout exception:
> {noformat}
> 2016-07-21 13:01:52,975 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (default task-33) {batch38003 at e.com} ISPN000136: Error executing command PrepareCommand, writing keys [4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY]: org.infinispan.util.concurrent.TimeoutException: ISPN000299: Unable to acquire lock after 15 seconds for key 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->com.ecv.component.profile.User and requestor GlobalTransaction:<10.1.3.29>:492:local. Lock is held by GlobalTransaction:<10.1.3.29>:458:local
> 	at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$KeyAwareExtendedLockPromise.lock(DefaultLockManager.java:236)
> 	at org.infinispan.util.concurrent.locks.impl.DefaultLockManager$CompositeLockPromise.lock(DefaultLockManager.java:305)
> 	at org.infinispan.interceptors.locking.AbstractLockingInterceptor.lockAllAndRecord(AbstractLockingInterceptor.java:199)
> 	at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.checkPendingAndLockAllKeys(AbstractTxLockingInterceptor.java:199)
> 	at org.infinispan.interceptors.locking.AbstractTxLockingInterceptor.lockAllOrRegisterBackupLock(AbstractTxLockingInterceptor.java:165)
> 	at org.infinispan.interceptors.locking.OptimisticLockingInterceptor.visitPrepareCommand(OptimisticLockingInterceptor.java:69)
> 	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
> 	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
> 	at org.infinispan.interceptors.TxInterceptor.invokeNextInterceptorAndVerifyTransaction(TxInterceptor.java:157)
> 	at org.infinispan.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:144)
> 	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
> 	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
> 	at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:113)
> 	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
> 	at org.infinispan.statetransfer.TransactionSynchronizerInterceptor.visitPrepareCommand(TransactionSynchronizerInterceptor.java:42)
> 	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
> 	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
> 	at org.infinispan.statetransfer.StateTransferInterceptor.handleTxCommand(StateTransferInterceptor.java:238)
> 	at org.infinispan.statetransfer.StateTransferInterceptor.visitPrepareCommand(StateTransferInterceptor.java:87)
> 	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
> 	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
> 	at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:107)
> 	at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:76)
> 	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
> 	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
> 	at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:99)
> 	at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:113)
> 	at org.infinispan.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:112)
> 	at org.infinispan.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:173)
> 	at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:336)
> 	at org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:157)
> 	at org.infinispan.transaction.xa.TransactionXaAdapter.commit(TransactionXaAdapter.java:114)
> 	at org.infinispan.transaction.tm.DummyTransaction.finishResource(DummyTransaction.java:389)
> 	at org.infinispan.transaction.tm.DummyTransaction.commitResources(DummyTransaction.java:435)
> 	at org.infinispan.transaction.tm.DummyTransaction.runCommit(DummyTransaction.java:314)
> 	at org.infinispan.transaction.tm.DummyTransaction.commit(DummyTransaction.java:105)
> 	at org.infinispan.transaction.tm.DummyBaseTransactionManager.commit(DummyBaseTransactionManager.java:73)
> 	at org.wildfly.clustering.ee.infinispan.ActiveTransactionBatch.close(ActiveTransactionBatch.java:48)
> 	at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:77)
> 	at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:768)
> 	at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:557)
> 	at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:331)
> 	at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:263)
> 	at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
> 	at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:174)
> 	at io.undertow.server.Connectors.executeRootHandler(Connectors.java:202)
> 	at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:793)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {noformat}
> After tracing using the trace log, we found the time difference between a "Lock all keys" log statement and the "Release locks for keys" log statement takes for about 15~20 seconds, so if another concurrent request would like to lock the same key within that interval, it will result in the timeout exception, example as below:
> {noformat}
> 2016-07-21 13:01:37,958 TRACE [org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (default task-37) Lock all keys=[4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl] for owner=GlobalTransaction:<10.1.3.29>:458:local. timeout=15000 (MILLISECONDS)
> 2016-07-21 13:01:53,364 TRACE [org.infinispan.util.concurrent.locks.impl.DefaultLockManager] (default task-37) Release locks for keys=[4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, 4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl]. owner=GlobalTransaction:<10.1.3.29>:458:local
> {noformat}
> We think that is strange as those session keys are POJO without complex nested objects, and we only perform read operation on those objects. Although we know objects without marked *@org.wildfly.clustering.web.annotation.Immutable* will perform cache sync, we believe that should not took for more than 1 second to commit the cache but it takes for more than 15 seconds...
> After future tracing, we found the following log message on another cluster node:
> {noformat}
> 2016-07-21 13:01:53,410 TRACE [org.infinispan.commands.tx.PrepareCommand] (Incoming-10,ee,10.1.3.30) {} Invoking remotely originated prepare: PrepareCommand {modifications=[PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, value=[B at f9f09c2, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, value=[B at f72eca9, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ, value=[B at 6f108f88, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl, value=org.wildfly.clustering.web.infinispan.session.SimpleSessionAccessMetaData at 126bb5d4, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->XXX, value=[B at 1a574fb5, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->YYY, value=[B at 3141bcb, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, PutKeyValueCommand{key=4V3S9Lw0pGPAK3wt8bYEt3BK06iimxk88geCH2Gl->ZZZ.menuName, value=[B at ae3a302, flags=[FORCE_SYNCHRONOUS], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}, ...], onePhaseCommit=true, retried=false, gtx=GlobalTransaction:<10.1.3.29>:458:remote, cacheName='app.war', topologyId=2} with invocation context: org.infinispan.context.impl.RemoteTxInvocationContext at 3d21e5e9
> {noformat}
> In general, it synchronized XXX key {color:red}2{color} times, YYY key {color:red}2272{color} times, ZZZ key {color:red}3{color} times!!!
> The root cause should be this line: https://github.com/wildfly/wildfly/blob/10.0.0.Final/clustering/web/infinispan/src/main/java/org/wildfly/clustering/web/infinispan/session/fine/FineSessionAttributes.java#L76 where a new CacheEntryMutator.mutate() will trigger a new PutKeyValueCommand, but that is not necessary for transactional/ batched cache in general as the MarshalledValue is lazily serialized, the actual value is pointing to the same instance.
> Attached the suggested change, which solved our issue. Basically use a Set to track the mutated attribute names, to ensure the command is only issued once. Tested on non-transactional, batched cache (non-transactional, non-batch cache may not work as I did not think about that...)



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)



More information about the jboss-jira mailing list