[jboss-user] [JBoss Cache Users] - TimeoutException while running in cluster
sirocchj
do-not-reply at jboss.com
Wed Nov 4 12:12:51 EST 2009
Hi all,
apologies if this was answered already but I have not yet found the solution.
I am running a JBoss cluster, 2 machines, Java 5 with: JBossCache 3.2.0.GA, JBoss 5.1.0.GA and JBossESB 4.6 and essentially performing the following:
| try {
| Node<Object, Object> seriesNode = _addSeries(seriesUuid);
| _sanityCheck(info, seriesNode);
| _storeInCache(seriesNode, message, info);
| if (_isSequenceComplete(seriesNode))
| result = _createAggregatedMessage(seriesNode);
| } catch (Exception e) {
| _logger.error("Exception intercepted while processing message: " + e.getLocalizedMessage());
| e.printStackTrace();
| }
|
relevant code is also the following:
| private Node<Object, Object> _addSeries(final String seriesUuid) throws Exception {
| final Fqn<String> seriesFQN = Fqn.fromRelativeElements(FQN_BASE, seriesUuid);
| try {
| txManager.begin();
| Node<Object, Object> seriesNode = rootNode.addChild(seriesFQN);
| txManager.commit();
| _logger.info("Added series " + seriesUuid + " to cache!");
| return seriesNode;
| } catch (Exception e) {
| _logger.warn("Could not add series " + seriesUuid + " to cache!");
| throw e;
| }
| }
|
| private void _sanityCheck(MessageDetails info, Node<Object, Object> seriesNode) throws Exception {
| try {
| if (info.getMessageTerminationCondition() instanceof Boolean && Boolean.TRUE.equals(info.getMessageTerminationCondition())) {
| txManager.begin();
| seriesNode.put("total", info.getMessageNumber());
| txManager.commit();
| _logger.info("Series total set (" + info.getMessageNumber() + ")");
| }
| } catch (Exception e) {
| _logger.warn("Could not save termination condition in cache!");
| throw e;
| }
| }
|
| private void _storeMessageInCache(final Node<Object, Object> seriesNode, final Message message, final MessageDetails info) throws Exception {
| try {
| txManager.begin();
| Node<Object, Object> messageNode = seriesNode.addChild(Fqn.fromString(info.getMessageNumber().toString()));
| messageNode.put("payload", Util.serialize(message));
| txManager.commit();
| _logger.info("\n******************\nStored message number " + info.getMessageNumber() + ": " + info + "\n******************\n");
| } catch (Exception e) {
| _logger.warn("Could not store message " + info.getMessageNumber());
| throw e;
| }
| }
|
I'm omitting the rest as it is not relevant, as far as I can see.
Also, these are the properties I use:
| <property name="transactionManagerLookupClass">org.jboss.cache.transaction.GenericTransactionManagerLookup</property>
| <property name="clusterName">${jboss.partition.name:DefaultPartition}-MyCache</property>
| <property name="multiplexerStack">${jboss.default.jgroups.stack:udp}</property>
| <property name="fetchInMemoryState">true</property>
| <property name="isolationLevel">READ_COMMITTED</property>
| <property name="writeSkewCheck">true</property>
| <property name="useLockStriping">true</property>
| <property name="lockParentForChildInsertRemove">false</property>
| <property name="cacheMode">REPL_SYNC</property>
| <property name="syncReplTimeout">35000</property>
| <property name="stateRetrievalTimeout">60000</property>
| <property name="lockAcquisitionTimeout">30000</property>
| <property name="useRegionBasedMarshalling">true</property>
| <property name="serializationExecutorPoolSize">0</property>
| <property name="listenerAsyncPoolSize">0</property>
| <property name="cacheLoaderConfig">
| <bean class="org.jboss.cache.config.CacheLoaderConfig">
| <property name="passivation">false</property>
| <property name="shared">true</property>
| <property name="preload">/</property>
| <property name="individualCacheLoaderConfigs">
| <list>
| <bean class="org.jboss.cache.loader.JDBCCacheLoaderConfig">
| <property name="properties">
| cache.jdbc.datasource=java:/pojoCacheDS
| cache.jdbc.table.drop=false
| cache.jdbc.table.name=jbosscache
| cache.jdbc.table.create=true
| cache.jdbc.table.primarykey=jbosscache_pk
| cache.jdbc.fqn.column=fqn
| cache.jdbc.fqn.type=VARCHAR(255)
| cache.jdbc.node.column=node
| cache.jdbc.node.type=BLOB
| cache.jdbc.parent.column=parent
| </property>
| <property name="async">false</property>
| <property name="fetchPersistentState">true</property>
| <property name="purgeOnStartup">false</property>
| <property name="ignoreModifications">false</property>
| </bean>
| </list>
| </property>
| </bean>
| </property>
|
So, when this is run with 3 threads per machine, 2 machines in cluster, 2000 messages coming in asynchronously, at some stage, sometimes, I see in my stacktrace:
| 15:15:48,386 WARN [TxInterceptor] Caught exception, will now set transaction to roll back
| org.jboss.cache.lock.TimeoutException: Unable to acquire lock on Fqn [/aggregator/b41f62ab-4bbf-4314-9f79-341710c4957a/1996] after [30000] milliseconds for requestor [GlobalTransaction:<10.3.0.181:53730>:1329]! Lock held by [GlobalTransaction:<10.1.0.30:58999>:2852]
| at org.jboss.cache.mvcc.MVCCNodeHelper.acquireLock(MVCCNodeHelper.java:159)
| at org.jboss.cache.mvcc.MVCCNodeHelper.wrapNodeForWriting(MVCCNodeHelper.java:253)
| at org.jboss.cache.mvcc.MVCCNodeHelper.wrapNodeForWriting(MVCCNodeHelper.java:186)
| at org.jboss.cache.interceptors.CacheLoaderInterceptor.loadIfNeeded(CacheLoaderInterceptor.java:300)
| at org.jboss.cache.interceptors.CacheLoaderInterceptor.visitPutDataMapCommand(CacheLoaderInterceptor.java:124)
| at org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCommand.java:104)
| at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
| at org.jboss.cache.interceptors.ReplicationInterceptor.handleCrudMethod(ReplicationInterceptor.java:150)
| at org.jboss.cache.interceptors.ReplicationInterceptor.visitPutDataMapCommand(ReplicationInterceptor.java:113)
| at org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCommand.java:104)
| at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
| at org.jboss.cache.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:131)
| at org.jboss.cache.commands.AbstractVisitor.visitPutDataMapCommand(AbstractVisitor.java:60)
| at org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCommand.java:104)
| at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
| at org.jboss.cache.interceptors.TxInterceptor.replayModifications(TxInterceptor.java:501)
| at org.jboss.cache.interceptors.TxInterceptor.handleRemotePrepare(TxInterceptor.java:388)
| at org.jboss.cache.interceptors.TxInterceptor.visitPrepareCommand(TxInterceptor.java:134)
| at org.jboss.cache.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:68)
| at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
| at org.jboss.cache.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:131)
| at org.jboss.cache.commands.AbstractVisitor.visitPrepareCommand(AbstractVisitor.java:140)
| at org.jboss.cache.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:68)
| at org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:116)
| at org.jboss.cache.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:178)
| at org.jboss.cache.interceptors.InvocationContextInterceptor.visitPrepareCommand(InvocationContextInterceptor.java:106)
| at org.jboss.cache.commands.tx.PrepareCommand.acceptVisitor(PrepareCommand.java:68)
| at org.jboss.cache.interceptors.InterceptorChain.invokeRemote(InterceptorChain.java:316)
| at org.jboss.cache.commands.remote.ReplicateCommand.processSingleCommand(ReplicateCommand.java:139)
| at org.jboss.cache.commands.remote.ReplicateCommand.perform(ReplicateCommand.java:115)
| at org.jboss.cache.marshall.CommandAwareRpcDispatcher.executeCommand(CommandAwareRpcDispatcher.java:319)
| at org.jboss.cache.marshall.InactiveRegionAwareRpcDispatcher.handle(InactiveRegionAwareRpcDispatcher.java:95)
| at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:637)
| at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:545)
| at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:368)
| at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:775)
| at org.jgroups.JChannel.up(JChannel.java:1339)
| at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:462)
| at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:501)
| at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:310)
| at org.jgroups.protocols.FRAG2.up(FRAG2.java:188)
| at org.jgroups.protocols.FC.up(FC.java:473)
| at org.jgroups.protocols.pbcast.GMS.up(GMS.java:824)
| at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:233)
| at org.jgroups.protocols.UNICAST.up(UNICAST.java:299)
| at org.jgroups.protocols.pbcast.NAKACK.handleMessage(NAKACK.java:884)
| at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:716)
| at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
| at org.jgroups.protocols.FD.up(FD.java:284)
| at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
| at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
| at org.jgroups.protocols.Discovery.up(Discovery.java:264)
| at org.jgroups.protocols.PING.up(PING.java:273)
| at org.jgroups.protocols.TP.passMessageUp(TP.java:1285)
| at org.jgroups.protocols.TP.access$100(TP.java:49)
| at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1838)
| at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1812)
| at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651)
| at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676)
| at java.lang.Thread.run(Thread.java:613)
|
| 15:15:48,392 WARN [ClusteredTest] Could not store message 1996
|
10.3.0.181 definitively wants to perform a _storeMessageInCache for Fqn /aggregator/b41f62ab-4bbf-4314-9f79-341710c4957a/1996 while 10.1.0.30 knows nothing about that Fqn but owns some lock...
Btw, /aggregator/b41f62ab-4bbf-4314-9f79-341710c4957a/1996 IS in the DB...
I wonder if this has anything to do with https://jira.jboss.org/jira/browse/JBCACHE-1540 and, if so, if you would recommend the proposed solution (i.e.: has anyone verified the implications of that workaround?)
Kind Regards
\j
View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4263990#4263990
Reply to the post : http://www.jboss.org/index.html?module=bb&op=posting&mode=reply&p=4263990
More information about the jboss-user
mailing list