March 2013 - infinispan-issues - Jboss List Archives

[JBoss JIRA] (ISPN-2904) Race condition in cache startup causes state transfer timeout

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2904?page=com.atlassian.jira.plugin.... ]

Mircea Markus resolved ISPN-2904.
---------------------------------
    Resolution: Out of Date

as per Paul:
I'm pretty sure this bug is a manifestation of https://issues.jboss.org/browse/AS7-5904
The fix is here: https://github.com/jbossas/jboss-as/commit/f2642412dc5ebe34d9cd2221c96ab4...

> Race condition in cache startup causes state transfer timeout
> -------------------------------------------------------------
>
> Key: ISPN-2904
> URL: https://issues.jboss.org/browse/ISPN-2904
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.1.7.Final
> Reporter: Dennis Reed
> Assignee: Mircea Markus
>
> When starting multiple caches at the same time (as EAP domain mode deployment does), one cache can timeout during state transfer and abort startup.
> This is caused by a race condition where the master node accepts requests while it can't process them because it's still starting.
> Because of this, the other node's REQUEST_JOIN is ignored, and it finally times out.
> [node1]
> 10:47:23,390 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 65) dests=[master:server-two/web], command=CacheViewControlCommand{cache=repl, type=REQUEST_JOIN, sender=master:server-one/web, newViewId=0, newMembers=null, oldViewId=0, oldMembers=null}, mode=SYNCHRONOUS_IGNORE_LEAVERS, timeout=60000
> 10:47:23,396 TRACE [org.jgroups.protocols.TCP] (ServerService Thread Pool -- 65) sending msg to master:server-two/web, src=master:server-one/web, headers are RequestCorrelator: id=200, type=REQ, id=7, rsp_expected=true, RSVP: REQ(4), UNICAST2: DATA, seqno=27, TCP: [channel_name=web]
> ...
> 10:48:23,404 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 65) MSC000001: Failed to start service jboss.infinispan.web.repl: org.jboss.msc.service.StartException in service jboss.infinispan.web.repl: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.BaseStateTransferManagerImpl.waitForJoinToComplete() throws java.lang.InterruptedException on object of type ReplicatedStateTransferManagerImpl
> [node2]
> 10:47:23,352 TRACE [org.infinispan.factories.GlobalComponentRegistry] (MSC service thread 1-6) Registering component Component{instance=org.infinispan.marshall.jboss.ExternalizerTable@3f9c437d, name=org.infinispan.marshall.jboss.ExternalizerTable} under name org.infinispan.marshall.jboss.ExternalizerTable
> ...
> 10:47:23,397 TRACE [org.jgroups.protocols.TCP] (OOB-19,null) received [dst: master:server-two/web, src: master:server-one/web (4 headers), size=54 bytes, flags=OOB|DONT_BUNDLE|RSVP], headers are RequestCorrelator: id=200, type=REQ, id=7, rsp_expected=true, RSVP: REQ(4), UNICAST2: DATA, seqno=27, TCP: [channel_name=web]
> 10:47:23,398 TRACE [org.jgroups.blocks.RequestCorrelator] (OOB-19,null) calling (org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher) with request 7
> 10:47:23,398 TRACE [org.infinispan.marshall.jboss.ExternalizerTable] (OOB-19,null) Either the marshaller has stopped or hasn't started. Read externalizers are not properly populated: {}
> 10:47:23,398 TRACE [org.infinispan.marshall.jboss.ExternalizerTable] (OOB-19,null) Cache manager is shutting down and type (id=74) cannot be resolved (thread not interrupted)
> 10:47:23,400 TRACE [org.jgroups.blocks.RequestCorrelator] (OOB-19,null) sending rsp for 7 to master:server-one/web
> ...
> 10:47:23,522 TRACE [org.infinispan.factories.GlobalComponentRegistry] (ServerService Thread Pool -- 64) Invoking start method public void org.infinispan.marshall.jboss.ExternalizerTable.start() on component org.infinispan.marshall.jboss.ExternalizerTable

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
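The race described in ISPN-2904 (a node accepting requests before its components have started) can be illustrated with a generic startup gate. The following is only a minimal sketch in plain Java; the class `StartupGate` and its methods are hypothetical names, not Infinispan or JBoss AS code, and this is not necessarily what the linked AS7-5904 commit does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not Infinispan or JBoss AS code.
final class StartupGate {
    private final CountDownLatch started = new CountDownLatch(1);

    /** Called once by the startup thread after all components are ready. */
    void markStarted() {
        started.countDown();
    }

    /**
     * Called by a request-handling thread before it processes a command.
     * Returns true if startup completed within the timeout. A false result
     * means the caller should answer with an explicit "not ready" error
     * instead of silently ignoring the request.
     */
    boolean awaitStarted(long timeoutMillis) {
        try {
            return started.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }
}
```

With a gate like this, a join request that arrives early either waits until the node is ready or fails fast with a clear error, rather than being dropped and leaving the sender to hit its 60-second timeout.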


[JBoss JIRA] (ISPN-2898) Web Cache fails to start when multiple apps are trying to create it

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2898?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-2898:
--------------------------------
    Assignee: Dan Berindei  (was: Mircea Markus)

> Web Cache fails to start when multiple apps are trying to create it
> -------------------------------------------------------------------
>
> Key: ISPN-2898
> URL: https://issues.jboss.org/browse/ISPN-2898
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.1.7.Final
> Reporter: Shay Matasaro
> Assignee: Dan Berindei
>
> When multiple apps are trying to create or connect to the web cache, the first one gets a lock on the cache manager and prevents the others from starting it in a timely manner.


[JBoss JIRA] (ISPN-2892) View installation loop when restarting cache on multiple nodes

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2892?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-2892:
--------------------------------
    Assignee: Dan Berindei  (was: Mircea Markus)

> View installation loop when restarting cache on multiple nodes
> --------------------------------------------------------------
>
> Key: ISPN-2892
> URL: https://issues.jboss.org/browse/ISPN-2892
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.1.7.Final
> Reporter: Dennis Reed
> Assignee: Dan Berindei
>
> Restarting a cache on multiple nodes at the same time can cause the following error:
> ERROR [org.infinispan.cacheviews.CacheViewsManagerImpl] (CacheViewInstaller-19,node1/web) ISPN000172: Failed to prepare view CacheView{viewId=18, members=[node2/web]} for cache default-host/test, rolling back to view CacheView{viewId=17, members=[node1/web, node2/web]}: java.util.concurrent.ExecutionException: org.infinispan.CacheException: java.lang.IllegalStateException: default-host/test: Received cache view prepare request after the local node has already shut down
> After the initial error, the following error began repeating every second for a few minutes until BaseStateTransferManagerImpl.waitForJoinToComplete() timed out and the cache failed to start:
> ERROR [org.infinispan.cacheviews.CacheViewsManagerImpl] (CacheViewInstaller-19,node1/web) ISPN000172: Failed to prepare view CacheView{viewId=21, members=[node2/web]} for cache default-host/test, rolling back to view CacheView{viewId=20, members=[]}: java.util.concurrent.ExecutionException: org.infinispan.CacheException: java.lang.IllegalStateException: Cannot prepare new view CacheView{viewId=21, members=[node2/web]} on cache default-host/test, we are currently preparing view CacheView{viewId=18, members=[node2/web]}


[JBoss JIRA] (ISPN-2910) Divide by zero exception on shutdown

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2910?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-2910:
--------------------------------
    Assignee: Dan Berindei  (was: Mircea Markus)

> Divide by zero exception on shutdown
> ------------------------------------
>
> Key: ISPN-2910
> URL: https://issues.jboss.org/browse/ISPN-2910
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.1.Final
> Reporter: Paul Ferraro
> Assignee: Dan Berindei
>
> 19:40:50,671 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,shared=udp) ISPN000094: Received new cluster view: [perf20/web|13] [perf20/web, perf21/web]
> 19:40:50,754 ERROR [org.infinispan.topology.ClusterTopologyManagerImpl] (notification-thread-0) ISPN000196: Failed to recover cluster state after the current node became the coordinator: java.lang.ArithmeticException: / by zero
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.addPrimaryOwners(DefaultConsistentHashFactory.java:130)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.rebalanceBuilder(DefaultConsistentHashFactory.java:124)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.updateMembers(DefaultConsistentHashFactory.java:86)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.updateMembers(DefaultConsistentHashFactory.java:45)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheStatusAfterMerge(ClusterTopologyManagerImpl.java:306)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.handleNewView(ClusterTopologyManagerImpl.java:236)
>     at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.handleViewChange(ClusterTopologyManagerImpl.java:597)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.6.0_38]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [rt.jar:1.6.0_38]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_38]
>     at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_38]
>     at org.infinispan.notifications.AbstractListenerImpl$ListenerInvocation$1.run(AbstractListenerImpl.java:212)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_38]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_38]
>     at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38]


[JBoss JIRA] (ISPN-2910) Divide by zero exception on shutdown

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2910?page=com.atlassian.jira.plugin.... ]

Mircea Markus commented on ISPN-2910:
-------------------------------------

seems to happen when there are no nodes in the cluster.

> Divide by zero exception on shutdown
> ------------------------------------
>
> Key: ISPN-2910
> URL: https://issues.jboss.org/browse/ISPN-2910
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.1.Final
> Reporter: Paul Ferraro
> Assignee: Dan Berindei
>
> 19:40:50,671 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,shared=udp) ISPN000094: Received new cluster view: [perf20/web|13] [perf20/web, perf21/web]
> 19:40:50,754 ERROR [org.infinispan.topology.ClusterTopologyManagerImpl] (notification-thread-0) ISPN000196: Failed to recover cluster state after the current node became the coordinator: java.lang.ArithmeticException: / by zero
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.addPrimaryOwners(DefaultConsistentHashFactory.java:130)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.rebalanceBuilder(DefaultConsistentHashFactory.java:124)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.updateMembers(DefaultConsistentHashFactory.java:86)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.updateMembers(DefaultConsistentHashFactory.java:45)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheStatusAfterMerge(ClusterTopologyManagerImpl.java:306)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.handleNewView(ClusterTopologyManagerImpl.java:236)
>     at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.handleViewChange(ClusterTopologyManagerImpl.java:597)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.6.0_38]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [rt.jar:1.6.0_38]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_38]
>     at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_38]
>     at org.infinispan.notifications.AbstractListenerImpl$ListenerInvocation$1.run(AbstractListenerImpl.java:212)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_38]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_38]
>     at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38]
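The comment on ISPN-2910 suggests the `/ by zero` arises when the member list is empty. A guard of the following shape would avoid that kind of failure. This is a hedged sketch only: `SegmentMath` and `minSegmentsPerMember` are hypothetical names, and the body is not the actual `DefaultConsistentHashFactory` code or its eventual fix.

```java
import java.util.List;

// Hypothetical illustration; not the actual DefaultConsistentHashFactory code.
final class SegmentMath {
    /**
     * Returns the minimum number of segments each member should own,
     * or 0 when there are no members, instead of dividing by zero.
     */
    static int minSegmentsPerMember(int numSegments, List<String> members) {
        if (members.isEmpty()) {
            return 0; // nothing to rebalance on an empty member list
        }
        return numSegments / members.size(); // integer division, rounds down
    }
}
```

Whether returning 0 or skipping the rebalance entirely is the right behaviour depends on the caller; the point is only that the empty-cluster case needs an explicit branch.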


[JBoss JIRA] (ISPN-2910) Divide by zero exception on shutdown

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2910?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-2910:
--------------------------------
    Fix Version/s: 5.3.0.Alpha1
                   5.3.0.Final

> Divide by zero exception on shutdown
> ------------------------------------
>
> Key: ISPN-2910
> URL: https://issues.jboss.org/browse/ISPN-2910
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.1.Final
> Reporter: Paul Ferraro
> Assignee: Dan Berindei
> Fix For: 5.3.0.Alpha1, 5.3.0.Final
>
> 19:40:50,671 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,shared=udp) ISPN000094: Received new cluster view: [perf20/web|13] [perf20/web, perf21/web]
> 19:40:50,754 ERROR [org.infinispan.topology.ClusterTopologyManagerImpl] (notification-thread-0) ISPN000196: Failed to recover cluster state after the current node became the coordinator: java.lang.ArithmeticException: / by zero
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.addPrimaryOwners(DefaultConsistentHashFactory.java:130)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.rebalanceBuilder(DefaultConsistentHashFactory.java:124)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.updateMembers(DefaultConsistentHashFactory.java:86)
>     at org.infinispan.distribution.ch.DefaultConsistentHashFactory.updateMembers(DefaultConsistentHashFactory.java:45)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheStatusAfterMerge(ClusterTopologyManagerImpl.java:306)
>     at org.infinispan.topology.ClusterTopologyManagerImpl.handleNewView(ClusterTopologyManagerImpl.java:236)
>     at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener.handleViewChange(ClusterTopologyManagerImpl.java:597)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.6.0_38]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [rt.jar:1.6.0_38]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [rt.jar:1.6.0_38]
>     at java.lang.reflect.Method.invoke(Method.java:597) [rt.jar:1.6.0_38]
>     at org.infinispan.notifications.AbstractListenerImpl$ListenerInvocation$1.run(AbstractListenerImpl.java:212)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_38]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_38]
>     at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_38]


[JBoss JIRA] (ISPN-2913) putForExternalRead leaves locks

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2913?page=com.atlassian.jira.plugin.... ]

Mircea Markus commented on ISPN-2913:
-------------------------------------

{quote}
In TxDistributionInterceptor.remoteGetAndStoreInL1 locks are acquired. Without a transaction these locks are never released.
{quote}
This is a transactional cache, so this call should always be executing in the scope of a transaction. How did you make the invocation outside of the scope of a transaction?
Also, can you please try it on a release newer than 5.2.1: 5.2.5.Final.

> putForExternalRead leaves locks
> -------------------------------
>
> Key: ISPN-2913
> URL: https://issues.jboss.org/browse/ISPN-2913
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.2.1.Final
> Reporter: Sebastian Tusk
> Assignee: Mircea Markus
> Priority: Critical
>
> In TxDistributionInterceptor.remoteGetAndStoreInL1 locks are acquired. Without a transaction these locks are never released. The cache setup is Dist, Async, L1, 2 Nodes, 1 Owner, Optimistic Locking.
> In AbstractTxLockingInterceptor.visitGetKeyValueCommand locks are released explicitly if outside of transactions. I fixed this problem by doing the same in OptimisticLockingInterceptor.visitPutKeyValueCommand. It is very likely that this doesn't fix all problems. For instance OptimisticLockingInterceptor.visitPutMapCommand or PessimisticLockingInterceptor.
> Cache Config:
> <namedCache name="entity">
>     <jmxStatistics enabled="true" />
>
>     <clustering mode="dist">
>         <stateTransfer fetchInMemoryState="false" timeout="20000" />
>         <async />
>         <l1 enabled="true" />
>         <hash numOwners="1"/>
>     </clustering>
>     <locking isolationLevel="READ_COMMITTED"
>              lockAcquisitionTimeout="15000" useLockStriping="false" />
>
>     <eviction maxEntries="10000" strategy="LRU" />
>     <expiration maxIdle="100000" wakeUpInterval="5000"/>
>     <storeAsBinary storeKeysAsBinary="true" storeValuesAsBinary="false" enabled="false" />
>
>     <transaction transactionMode="TRANSACTIONAL" autoCommit="false" lockingMode="OPTIMISTIC"/>
> </namedCache>
> Fixed OptimisticLockingInterceptor.visitPutKeyValueCommand:
> @Override
> public Object visitPutKeyValueCommand(InvocationContext ctx, PutKeyValueCommand command) throws Throwable {
>     try {
>         if (command.isConditional()) markKeyAsRead(ctx, command);
>         return invokeNextInterceptor(ctx, command);
>     } catch (Throwable te) {
>         throw cleanLocksAndRethrow(ctx, te);
>     } finally {
>         //with putForExternalRead the value might be put into L1 without a transaction
>         //we need to release any locks for these cases
>         if (!ctx.isInTxScope()) lockManager.unlockAll(ctx);
>     }
> }


[JBoss JIRA] (ISPN-2919) Remote vs local get statistic

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2919?page=com.atlassian.jira.plugin.... ]

Mircea Markus updated ISPN-2919:
--------------------------------
    Fix Version/s: 6.0.0.Final
                       (was: 5.3.0.Final)

> Remote vs local get statistic
> -----------------------------
>
> Key: ISPN-2919
> URL: https://issues.jboss.org/browse/ISPN-2919
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Cache, JMX, reporting and management
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Fix For: 6.0.0.Final
>
> {quote}
> Are there any stats in JDG which show the number of remote GETs versus local GETs etc ?
> I don't think so; just logs. Probably a useful metric for JMX though.
> {quote}
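The metric requested in ISPN-2919 amounts to two counters, one for GETs served from local data and one for GETs that had to go to another node. A minimal sketch in plain Java, assuming a hypothetical `GetStats` class; it does not use Infinispan's JMX or statistics API, and how the values would be exposed over JMX is left out.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration; not part of Infinispan's statistics API.
final class GetStats {
    private final LongAdder localGets = new LongAdder();
    private final LongAdder remoteGets = new LongAdder();

    /** Records one GET; servedLocally is true when no remote call was needed. */
    void recordGet(boolean servedLocally) {
        if (servedLocally) {
            localGets.increment();
        } else {
            remoteGets.increment();
        }
    }

    long localGets() {
        return localGets.sum();
    }

    long remoteGets() {
        return remoteGets.sum();
    }

    /** Fraction of GETs that were remote, or 0.0 if nothing was recorded. */
    double remoteRatio() {
        long total = localGets.sum() + remoteGets.sum();
        return total == 0 ? 0.0 : (double) remoteGets.sum() / total;
    }
}
```

`LongAdder` is chosen here because the counters would be updated from many request threads and read only occasionally.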


[JBoss JIRA] (ISPN-2923) provides an optimal way to search and retrieve the first non null value

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2923?page=com.atlassian.jira.plugin.... ]

Mircea Markus commented on ISPN-2923:
-------------------------------------

Infinispan doesn't allow inserting keys with null values. How did you end up with null values in the cache?
I think this makes sense as a general requirement, e.g. stop a map/reduce task if enough values have been found.

> provides an optimal way to search and retrieve the first non null value
> -----------------------------------------------------------------------
>
> Key: ISPN-2923
> URL: https://issues.jboss.org/browse/ISPN-2923
> Project: Infinispan
> Issue Type: Feature Request
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 5.2.5.Final
> Reporter: Mathieu Lachance
> Assignee: Vladimir Blagojevic
>
> It would be nice if Infinispan could provide a solution to the common problem of searching for and retrieving the first non-null value.
> My attempt was to use the map reduce framework, but it would still scan all the keys and values even after the result has been found. Maybe Infinispan could provide a way to "stop" the map-reduce operation on all nodes when asked, e.g. when the first non-null value has been found?
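The early-termination behaviour asked for in ISPN-2923 can be shown on a single node with an ordinary loop that stops at the first non-null result. This is a local sketch only, assuming a hypothetical helper `FirstMatch.firstNonNull`; it does not use Infinispan's map/reduce API and does not address stopping work on other nodes.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical illustration; a local scan, not a distributed map/reduce task.
final class FirstMatch {
    /**
     * Applies mapper to each value in iteration order and returns the first
     * non-null result. Values after the first hit are never visited.
     */
    static <K, V, R> Optional<R> firstNonNull(Map<K, V> entries,
                                              Function<? super V, ? extends R> mapper) {
        for (V value : entries.values()) {
            R result = mapper.apply(value);
            if (result != null) {
                return Optional.of(result); // stop scanning at the first hit
            }
        }
        return Optional.empty();
    }
}
```

In a distributed setting each node could run such a scan over its own entries, but cancelling the scans still running elsewhere needs coordination that this sketch does not attempt.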


[JBoss JIRA] (ISPN-2925) Define cache size using bytes or percent

by Mircea Markus (JIRA)

[ https://issues.jboss.org/browse/ISPN-2925?page=com.atlassian.jira.plugin.... ]

Mircea Markus resolved ISPN-2925.
---------------------------------
    Resolution: Duplicate Issue

duplicate of ISPN-863

> Define cache size using bytes or percent
> ----------------------------------------
>
> Key: ISPN-2925
> URL: https://issues.jboss.org/browse/ISPN-2925
> Project: Infinispan
> Issue Type: Feature Request
> Components: Configuration
> Reporter: Edoardo Schepis
> Assignee: Mircea Markus
>
> Can you add the functionality to define the size of a cache not just based on the number of keys, but also on the bytes or heap percentage.
> It would help a lot since keys are not necessarily a valid metric for a sort of capacity planning of a cache.
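To make the request in ISPN-2925 concrete, here is a small LRU cache bounded by total value bytes rather than by entry count. It is a sketch under simplifying assumptions: `ByteBoundedCache` is a hypothetical class, only the length of each `byte[]` value is counted (keys and per-entry heap overhead are ignored), and it is unrelated to how Infinispan or ISPN-863 implement size-based eviction.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; counts value bytes only, not real heap usage.
final class ByteBoundedCache {
    private final long maxBytes;
    private long currentBytes;
    // accessOrder=true keeps entries ordered from least to most recently used
    private final LinkedHashMap<String, byte[]> entries =
            new LinkedHashMap<>(16, 0.75f, true);

    ByteBoundedCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized void put(String key, byte[] value) {
        byte[] old = entries.put(key, value);
        if (old != null) {
            currentBytes -= old.length;
        }
        currentBytes += value.length;
        // Evict least recently used entries until the byte budget is met.
        // A single value larger than maxBytes ends up evicting itself too.
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            currentBytes -= it.next().getValue().length;
            it.remove();
        }
    }

    synchronized byte[] get(String key) {
        return entries.get(key); // also marks the entry as recently used
    }

    synchronized long sizeInBytes() {
        return currentBytes;
    }
}
```

A heap-percentage limit could be layered on top by deriving `maxBytes` from `Runtime.getRuntime().maxMemory()`, though accurate accounting of object sizes on the JVM is considerably harder than this sketch suggests.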

