[JBoss JIRA] (ISPN-5106) Deadlock on GlobalComponentRegistry when starting a cluster
by Jakub Markos (JIRA)
[ https://issues.jboss.org/browse/ISPN-5106?page=com.atlassian.jira.plugin.... ]
Jakub Markos updated ISPN-5106:
-------------------------------
Attachment: dumps_and_logs.zip
> Deadlock on GlobalComponentRegistry when starting a cluster
> -----------------------------------------------------------
>
> Key: ISPN-5106
> URL: https://issues.jboss.org/browse/ISPN-5106
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Reporter: Jakub Markos
> Attachments: dumps_and_logs.zip
>
>
> We have a test that starts 4 server nodes, and sometimes they fail to complete startup. This happens with the current snapshot.
> Here are the relevant parts of the thread dumps; node02:
> {code}
> "remote-thread--p3-t1" daemon prio=10 tid=0x00007f7a00002800 nid=0x487f waiting for monitor entry [0x00007f796bbfa000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.infinispan.factories.AbstractComponentRegistry.getOrCreateComponent(AbstractComponentRegistry.java:262)
> - waiting to lock <0x000000060365b6b8> (a org.infinispan.factories.GlobalComponentRegistry)
> at org.infinispan.factories.AbstractComponentRegistry.invokeInjectionMethod(AbstractComponentRegistry.java:227)
> at org.infinispan.factories.AbstractComponentRegistry.wireDependencies(AbstractComponentRegistry.java:132)
> at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler$2.run(GlobalInboundInvocationHandler.java:156)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Locked ownable synchronizers:
> - <0x0000000615af46d0> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "MSC service thread 1-16" prio=10 tid=0x00007f79ec071800 nid=0x4839 waiting on condition [0x00007f7a40239000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000614d47e60> (a java.util.concurrent.FutureTask)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:422)
> at java.util.concurrent.FutureTask.get(FutureTask.java:199)
> at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:432)
> at org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:385)
> at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:368)
> at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:359)
> at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:281)
> - locked <0x000000060420d4a8> (a java.lang.Object)
> at org.infinispan.topology.ClusterTopologyManagerImpl.start(ClusterTopologyManagerImpl.java:103)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
> at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
> at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
> at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
> - locked <0x000000060365b6b8> (a org.infinispan.factories.GlobalComponentRegistry)
> at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:221)
> - locked <0x000000060365b6b8> (a org.infinispan.factories.GlobalComponentRegistry)
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:580)
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:546)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:423)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:437)
> at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:89)
> at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:80)
> at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:116)
> at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:113)
> at org.infinispan.security.Security.doPrivileged(Security.java:76)
> at org.infinispan.server.infinispan.SecurityActions.doPrivileged(SecurityActions.java:60)
> at org.infinispan.server.infinispan.SecurityActions.startCache(SecurityActions.java:121)
> at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:79)
> at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1948)
> at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1881)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Locked ownable synchronizers:
> - <0x0000000653444750> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {code}
> and node03:
> {code}
> "remote-thread--p3-t1" daemon prio=10 tid=0x00007f016c079000 nid=0x1a43 waiting for monitor entry [0x00007f0114396000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.infinispan.factories.AbstractComponentRegistry.getOrCreateComponent(AbstractComponentRegistry.java:262)
> - waiting to lock <0x0000000609c2bf50> (a org.infinispan.factories.GlobalComponentRegistry)
> at org.infinispan.factories.AbstractComponentRegistry.invokeInjectionMethod(AbstractComponentRegistry.java:227)
> at org.infinispan.factories.AbstractComponentRegistry.wireDependencies(AbstractComponentRegistry.java:132)
> at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler$2.run(GlobalInboundInvocationHandler.java:156)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Locked ownable synchronizers:
> - <0x0000000615a05750> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "MSC service thread 1-16" prio=10 tid=0x00007f015c071800 nid=0x19ff waiting on condition [0x00007f01b0558000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000615025bb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> at org.jgroups.util.CondVar.waitFor(CondVar.java:64)
> at org.jgroups.blocks.Request.waitForResults(Request.java:195)
> at org.jgroups.blocks.Request.responsesComplete(Request.java:181)
> at org.jgroups.blocks.Request.execute(Request.java:89)
> at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:409)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:374)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:188)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:562)
> at org.infinispan.topology.ClusterTopologyManagerImpl.start(ClusterTopologyManagerImpl.java:112)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
> at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
> at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
> at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
> - locked <0x0000000609c2bf50> (a org.infinispan.factories.GlobalComponentRegistry)
> at org.infinispan.factories.GlobalComponentRegistry.start(GlobalComponentRegistry.java:221)
> - locked <0x0000000609c2bf50> (a org.infinispan.factories.GlobalComponentRegistry)
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:580)
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:546)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:423)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:437)
> at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:89)
> at org.jboss.as.clustering.infinispan.DefaultEmbeddedCacheManager.getCache(DefaultEmbeddedCacheManager.java:80)
> at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:116)
> at org.infinispan.server.infinispan.SecurityActions$4.run(SecurityActions.java:113)
> at org.infinispan.security.Security.doPrivileged(Security.java:76)
> at org.infinispan.server.infinispan.SecurityActions.doPrivileged(SecurityActions.java:60)
> at org.infinispan.server.infinispan.SecurityActions.startCache(SecurityActions.java:121)
> at org.jboss.as.clustering.infinispan.subsystem.CacheService.start(CacheService.java:79)
> at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1948)
> at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1881)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> Locked ownable synchronizers:
> - <0x00000006534e9628> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {code}
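The cycle in the two dumps is symmetric. On each node, the MSC service thread locks the GlobalComponentRegistry monitor (e.g. <0x000000060365b6b8> on node02) in GlobalComponentRegistry.start() and then, still holding it, blocks in ClusterTopologyManagerImpl waiting for responses from the other nodes; on each of those nodes, the remote-thread that would handle the request is itself waiting to lock its own node's registry monitor, held by its own MSC thread in the same state, so no node ever replies. A minimal sketch of the lock pattern, collapsed onto a single JVM with hypothetical names (not Infinispan code):
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RegistryDeadlockSketch {
    // Stands in for the GlobalComponentRegistry monitor seen in the dumps.
    static final Object registry = new Object();

    public static void main(String[] args) throws Exception {
        ExecutorService remoteThread = Executors.newSingleThreadExecutor();
        synchronized (registry) {                 // "MSC service thread": locked <registry>
            Future<?> responses = remoteThread.submit(() -> {
                synchronized (registry) {         // "remote-thread": waiting to lock <registry>
                    // wireDependencies()/getOrCreateComponent() would run here
                }
            });
            responses.get();                      // executeOnClusterSync(): parks forever
        }
    }
}
{code}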
[JBoss JIRA] (ISPN-5016) Specify and document cache consistency guarantees
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-5016?page=com.atlassian.jira.plugin.... ]
Radim Vansa commented on ISPN-5016:
-----------------------------------
{quote}If a node is suspected because of a Full GC, it might go from the initial JGroups view straight to the merge view. If that happens, its topology will be the largest one, and it will not be wiped, neither will it receive new data. Instead, it will keep the (possibly stale) entries it had before the Full GC.{quote}
Sorry, this is not very clear to me. Why would it have the largest topology ID, and why wouldn't it propagate its state to all the other members (wiping their data)?
{quote}If at least half of the nodes in the stable topology leave in quick succession{quote}
What counts as quick succession? Does that depend on JGroups view installation, some timeout, or the duration of the rebalance?
{quote}And if some of the nodes in the Available partition’s consistent hash are not really accessible after the merge, the cache might stay Degraded.{quote}
'the cache'? We should rather be talking about partitions, not the whole cache. I understand the first part of the special case, but not the latter.
{quote}While a partition is in Degraded mode, attempting to read or write a key will yield an AvailabilityException{quote}
Unless all owners of the key are in the degraded partition.
{quote}If a node joins and becomes a backup owner after a write command was sent from the primary owner to the backups, but before the primary owner updates its own data container, it may not receive the value neither as a write command nor via state transfer.{quote}
Sounds like a bug to me - is there a JIRA that could be linked? We could tolerate inconsistencies when a node crashes (if we can't fix it), but a join or a graceful leave should keep the cluster consistent.
{quote}When a write to the store fails, it will fail the write operation.{quote}
Does this hold for write-behind, too?
{quote}With write-behind or asynchronous replication enabled, store write failures are hidden from the user (unless the originator is the primary owner, with async replication).{quote}
When originator == primary and write-behind is enabled, how can the failure to write to the store be propagated to the user? I thought that the user thread initiates both the write to the store and the async replication, and then returns.
{quote}When the partitions merge back, there is no effort to replicate the values from one partition to another.{quote}
Why is that different from non-tx mode, where partitions with a non-highest topology id are wiped? Moreover, for optimistic tx you write {quote}Same as with pessimistic and non-transactional caches.{quote} - which version, then?
{quote}If the primary owners of the keys written by the transaction are all in the local transaction, {quote}
local partition?
{quote}If one partition stays available, its entries will replace all the other partitions' entries on merge, undoing partial commits in those partitions.{quote}
Do I understand correctly that a degraded partition may commit a transaction, and that this transaction will later be ignored (the data will be overwritten with the previous values)? Why is this behaviour desired?
{quote}Transactions already prepared, however, will commit successfully even in minority partitions{quote}
Is that true even if the originator is not in this minority partition?
{quote}When a transaction needs to acquire more than one key lock with the same primary node, they are always acquired in the same order, so this will not cause a deadlock.{quote}
If the keys have the same hashCode, though, they can be locked in a different order. See ISPN-2491.
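For reference, a toy illustration of the tie problem (plain Java; the hashCode-based sort is an assumed scheme inferred from this comment, not the actual Infinispan lock manager):
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

public class LockOrderSketch {
    // Sorting by hashCode is a total order only when hashCodes are distinct;
    // ties keep their arrival order, so two transactions can lock the same
    // two keys in opposite orders.
    static List<Object> lockOrder(Collection<?> keys) {
        List<Object> ordered = new ArrayList<>(keys);
        ordered.sort(Comparator.comparingInt(Object::hashCode));
        return ordered;
    }

    public static void main(String[] args) {
        String k1 = "Aa", k2 = "BB";  // classic pair of distinct Strings with equal hashCodes
        System.out.println(k1.hashCode() == k2.hashCode());   // true
        System.out.println(lockOrder(Arrays.asList(k1, k2))); // [Aa, BB]
        System.out.println(lockOrder(Arrays.asList(k2, k1))); // [BB, Aa]
    }
}
{code}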
{quote}The commit is always synchronous on the originator, so a transaction T3 started on node A after T1 finished will see T1’s updates.{quote}
Will it see all of T1's updates, or just the updates to those entries owned by A?
{quote}The write to the attached cache store(s) is performed during the one-phase prepare command or the commit command, depending on the configuration.{quote}
What configuration, exactly?
> Specify and document cache consistency guarantees
> -------------------------------------------------
>
> Key: ISPN-5016
> URL: https://issues.jboss.org/browse/ISPN-5016
> Project: Infinispan
> Issue Type: Task
> Components: Documentation-Core
> Affects Versions: 7.0.2.Final
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
>
> We can't simply use the consistency model defined by the Java specification and broaden it to the whole cache (maybe the expression "can't" is too strong, but we definitely don't want to do that in some cases).
> By consistency guarantees/model I mean mostly the order in which writes are allowed to be observed; we can't boil it down to simply causal, PRAM, or any other consistency model, as writes can be observed as non-atomic in Infinispan.
> The Infinispan documentation is quite scarce on this; the only trace I've found is in the Glossary [2]: "Infinispan has traditionally followed ACID principles as far as possible, however an eventually consistent mode embracing BASE is on the roadmap."
[JBoss JIRA] (ISPN-5046) PartitionHandling: split during commit can leave the cache inconsistent after merge
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-5046?page=com.atlassian.jira.plugin.... ]
Radim Vansa commented on ISPN-5046:
-----------------------------------
Is it planned to fix this JIRA? If I understand the comments correctly, Dan has a solution, but in BZ there are comments saying this is a 'feature'.
> PartitionHandling: split during commit can leave the cache inconsistent after merge
> -----------------------------------------------------------------------------------
>
> Key: ISPN-5046
> URL: https://issues.jboss.org/browse/ISPN-5046
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.1.0.Beta1
>
>
> Say we have a cluster ABCD; a transaction T was started on A, with B as the primary owner and C the backup owner. B and C both acknowledge the prepare, and the network splits into AB and CD right before A sends the commit command. Eventually A suspects C and D, but the commit still succeeds on B before C and D are suspected. SuspectExceptions are ignored for commit commands, so the user won't see any error.
> However, C will eventually suspect A and B. When the CD cache topology is installed, it will roll back transaction T. After the merge, both partitions are in degraded mode, so we assume that they both have the latest data, and the key is never updated on C.
> From C's point of view, this is very similar to ISPN-3421. The fix should also be similar: we could delay the transaction rollback on C until we get a confirmation from B that T was not committed there. Since B is inaccessible, C will eventually get a SuspectException and the CD cache topology, at which point the cache is in degraded mode and C can wait for a merge. On merge, it should check the status of the transaction on B again, and either commit or roll back based on what B did.
> We also need to suspend the cleanup of completed transactions while the cache is in degraded mode; otherwise C might not find T on B after the merge.
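A toy model of that timeline (hypothetical classes, not the Infinispan transaction code) showing how B and C end up disagreeing after the merge:
{code}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SplitCommitSketch {
    static class Owner {
        final String name;
        String value = "old";   // committed value
        String prepared;        // value acknowledged in the prepare phase
        Owner(String name) { this.name = name; }
        void prepare(String v) { prepared = v; }
        void commit()   { if (prepared != null) { value = prepared; prepared = null; } }
        void rollback() { prepared = null; }
    }

    public static void main(String[] args) {
        Owner b = new Owner("B"), c = new Owner("C");
        b.prepare("new");
        c.prepare("new");                                       // both owners ack the prepare
        Set<Owner> reachableFromA = new HashSet<>(List.of(b));  // split: AB | CD
        for (Owner o : List.of(b, c)) {
            if (reachableFromA.contains(o)) {
                o.commit();              // commit reaches B only; C's SuspectException is ignored
            }
        }
        c.rollback();                    // the CD topology rolls T back on C
        // After the merge each partition keeps its own data, so the owners disagree:
        System.out.println(b.name + "=" + b.value + ", " + c.name + "=" + c.value); // B=new, C=old
    }
}
{code}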
[JBoss JIRA] (ISPN-5046) PartitionHandling: split during commit can leave the cache inconsistent after merge
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5046?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5046:
-----------------------------------------------
Misha H. Ali <mhusnain@redhat.com> changed the Status of [bug 1176750|https://bugzilla.redhat.com/show_bug.cgi?id=1176750] from ASSIGNED to ON_QA
> PartitionHandling: split during commit can leave the cache inconsistent after merge
> -----------------------------------------------------------------------------------
>
> Key: ISPN-5046
> URL: https://issues.jboss.org/browse/ISPN-5046
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.1.0.Beta1
>
>
> Say we have a cluster ABCD; a transaction T was started on A, with B as the primary owner and C the backup owner. B and C both acknowledge the prepare, and the network splits into AB and CD right before A sends the commit command. Eventually A suspects C and D, but the commit still succeeds on B before C and D are suspected. SuspectExceptions are ignored for commit commands, so the user won't see any error.
> However, C will eventually suspect A and B. When the CD cache topology is installed, it will roll back transaction T. After the merge, both partitions are in degraded mode, so we assume that they both have the latest data, and the key is never updated on C.
> From C's point of view, this is very similar to ISPN-3421. The fix should also be similar: we could delay the transaction rollback on C until we get a confirmation from B that T was not committed there. Since B is inaccessible, C will eventually get a SuspectException and the CD cache topology, at which point the cache is in degraded mode and C can wait for a merge. On merge, it should check the status of the transaction on B again, and either commit or roll back based on what B did.
> We also need to suspend the cleanup of completed transactions while the cache is in degraded mode; otherwise C might not find T on B after the merge.
[JBoss JIRA] (ISPN-5046) PartitionHandling: split during commit can leave the cache inconsistent after merge
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5046?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5046:
-----------------------------------------------
Misha H. Ali <mhusnain@redhat.com> changed the Status of [bug 1176750|https://bugzilla.redhat.com/show_bug.cgi?id=1176750] from NEW to ASSIGNED
> PartitionHandling: split during commit can leave the cache inconsistent after merge
> -----------------------------------------------------------------------------------
>
> Key: ISPN-5046
> URL: https://issues.jboss.org/browse/ISPN-5046
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.1.0.Beta1
>
>
> Say we have a cluster ABCD; a transaction T was started on A, with B as the primary owner and C the backup owner. B and C both acknowledge the prepare, and the network splits into AB and CD right before A sends the commit command. Eventually A suspects C and D, but the commit still succeeds on B before C and D are suspected. SuspectExceptions are ignored for commit commands, so the user won't see any error.
> However, C will eventually suspect A and B. When the CD cache topology is installed, it will roll back transaction T. After the merge, both partitions are in degraded mode, so we assume that they both have the latest data, and the key is never updated on C.
> From C's point of view, this is very similar to ISPN-3421. The fix should also be similar: we could delay the transaction rollback on C until we get a confirmation from B that T was not committed there. Since B is inaccessible, C will eventually get a SuspectException and the CD cache topology, at which point the cache is in degraded mode and C can wait for a merge. On merge, it should check the status of the transaction on B again, and either commit or roll back based on what B did.
> We also need to suspend the cleanup of completed transactions while the cache is in degraded mode; otherwise C might not find T on B after the merge.
[JBoss JIRA] (ISPN-5046) PartitionHandling: split during commit can leave the cache inconsistent after merge
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5046?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-5046:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1171073, https://bugzilla.redhat.com/show_bug.cgi?id=1176750
> PartitionHandling: split during commit can leave the cache inconsistent after merge
> -----------------------------------------------------------------------------------
>
> Key: ISPN-5046
> URL: https://issues.jboss.org/browse/ISPN-5046
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.2.Final, 7.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.1.0.Beta1
>
>
> Say we have a cluster ABCD; a transaction T was started on A, with B as the primary owner and C the backup owner. B and C both acknowledge the prepare, and the network splits into AB and CD right before A sends the commit command. Eventually A suspects C and D, but the commit still succeeds on B before C and D are suspected. SuspectExceptions are ignored for commit commands, so the user won't see any error.
> However, C will eventually suspect A and B. When the CD cache topology is installed, it will roll back transaction T. After the merge, both partitions are in degraded mode, so we assume that they both have the latest data, and the key is never updated on C.
> From C's point of view, this is very similar to ISPN-3421. The fix should also be similar: we could delay the transaction rollback on C until we get a confirmation from B that T was not committed there. Since B is inaccessible, C will eventually get a SuspectException and the CD cache topology, at which point the cache is in degraded mode and C can wait for a merge. On merge, it should check the status of the transaction on B again, and either commit or roll back based on what B did.
> We also need to suspend the cleanup of completed transactions while the cache is in degraded mode; otherwise C might not find T on B after the merge.
[JBoss JIRA] (ISPN-5105) Improve distexec demo logging and modify pom
by Alan Field (JIRA)
Alan Field created ISPN-5105:
--------------------------------
Summary: Improve distexec demo logging and modify pom
Key: ISPN-5105
URL: https://issues.jboss.org/browse/ISPN-5105
Project: Infinispan
Issue Type: Enhancement
Components: Demos and Tutorials
Affects Versions: 7.0.2.Final
Reporter: Alan Field
Assignee: Alan Field
Add timestamps to the logging messages to show progress.
The project should depend on infinispan-core and not on infinispan-embedded.
[JBoss JIRA] (ISPN-5016) Specify and document cache consistency guarantees
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-5016?page=com.atlassian.jira.plugin.... ]
Radim Vansa commented on ISPN-5016:
-----------------------------------
{quote}>> Overlapping operations can happen in any order
> What does 'happen' mean? I guess that eventually the state will be observed with those operations in a certain order, but does it mean that all threads will observe the writes in the same order? I think that the JMM does not guarantee anything like that, and we should put emphasis on this if it's the case for Infinispan, too.
I think the JMM does guarantee that, if you consider each cache entry to be a volatile variable.
{quote}
OK, my bad, I was not applying the ordering rules properly...
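For the record, the volatile reading of that guarantee in plain Java (my own illustration, not from the thread): writes to a single volatile field are totally ordered by the JMM's synchronization order, and no thread can observe them in a different order.
{code}
public class VolatileOrderSketch {
    static volatile int entry; // stands in for one cache entry

    public static void main(String[] args) throws InterruptedException {
        Thread w1 = new Thread(() -> entry = 1);
        Thread w2 = new Thread(() -> entry = 2);
        w1.start(); w2.start();
        w1.join(); w2.join();
        // The two writes fall somewhere in the synchronization order; the later
        // one wins, and every reader observes the writes in that same order.
        System.out.println(entry); // 1 or 2, but consistent across all readers
    }
}
{code}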
> Specify and document cache consistency guarantees
> -------------------------------------------------
>
> Key: ISPN-5016
> URL: https://issues.jboss.org/browse/ISPN-5016
> Project: Infinispan
> Issue Type: Task
> Components: Documentation-Core
> Affects Versions: 7.0.2.Final
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
>
> We can't simply use the consistency model defined by the Java specification and broaden it to the whole cache (maybe the expression "can't" is too strong, but we definitely don't want to do that in some cases).
> By consistency guarantees/model I mean mostly the order in which writes are allowed to be observed; we can't boil it down to simply causal, PRAM, or any other consistency model, as writes can be observed as non-atomic in Infinispan.
> The Infinispan documentation is quite scarce on this; the only trace I've found is in the Glossary [2]: "Infinispan has traditionally followed ACID principles as far as possible, however an eventually consistent mode embracing BASE is on the roadmap."
[JBoss JIRA] (ISPN-5016) Specify and document cache consistency guarantees
by Radim Vansa (JIRA)
[ https://issues.jboss.org/browse/ISPN-5016?page=com.atlassian.jira.plugin.... ]
Radim Vansa commented on ISPN-5016:
-----------------------------------
{quote}Because data is stored in the shared store after the invalidation command was executed on all the nodes, it’s also possible for a node to read the old value from the shared store after it has executed the invalidation.{quote}
I don't understand the reason very well; could you elaborate a bit more?
{quote}Invalidations triggered by write operations are synchronous and the operation will fail with a SuspectException if another node crashes. The entry will not be updated on the originator, but invalidation will still be performed on the non-crashed nodes.{quote}
I thought SuspectExceptions were silently handled, aren't they?
{quote}For multi-key operations like putAll(), the operation is sent to all the primary owners of the affected keys. When there is a single primary owner for all the affected keys, the operation succeeds and its effects are visible atomically. When there are multiple primary owners, the operation is not atomic: overlapping read operations will see random subsets of the updated values.{quote}
As reads don't acquire any locks, I think that even an update with a single primary owner can be seen as non-atomic.
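To make the multi-owner case from the quote concrete, a plain-Java stand-in for two primary owners (not the Infinispan API):
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PutAllSketch {
    // Each map stands in for the data container of one primary owner.
    static final Map<String, String> ownerA = new ConcurrentHashMap<>(Map.of("k1", "old"));
    static final Map<String, String> ownerB = new ConcurrentHashMap<>(Map.of("k2", "old"));

    public static void main(String[] args) {
        // putAll({k1=new, k2=new}) is applied per primary owner, not atomically:
        ownerA.put("k1", "new");
        // A read of both keys at this point sees the subset {k1=new, k2=old}.
        System.out.println(ownerA.get("k1") + " " + ownerB.get("k2"));
        ownerB.put("k2", "new");
    }
}
{code}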
> Specify and document cache consistency guarantees
> -------------------------------------------------
>
> Key: ISPN-5016
> URL: https://issues.jboss.org/browse/ISPN-5016
> Project: Infinispan
> Issue Type: Task
> Components: Documentation-Core
> Affects Versions: 7.0.2.Final
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
>
> We can't simply use the consistency model defined by the Java specification and broaden it to the whole cache (maybe the expression "can't" is too strong, but we definitely don't want to do that in some cases).
> By consistency guarantees/model I mean mostly the order in which writes are allowed to be observed; we can't boil it down to simply causal, PRAM, or any other consistency model, as writes can be observed as non-atomic in Infinispan.
> The Infinispan documentation is quite scarce on this; the only trace I've found is in the Glossary [2]: "Infinispan has traditionally followed ACID principles as far as possible, however an eventually consistent mode embracing BASE is on the roadmap."