[JBoss JIRA] (ISPN-8453) Commit should fail if cache is in degraded mode
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-8453?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-8453:
-------------------------------
Fix Version/s: 9.2.0.CR1 (was: 9.2.0.Beta2)
> Commit should fail if cache is in degraded mode
> -----------------------------------------------
>
> Key: ISPN-8453
> URL: https://issues.jboss.org/browse/ISPN-8453
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.1.9.Final, 8.2.8.Final, 9.1.2.Final, 9.2.0.Alpha2
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.2.0.CR1
>
>
> When the originator receives a {{CacheNotFoundResponse}} and the cache is in degraded mode, the transaction is marked as partially completed, but the commit completes successfully.
> I believe that is not correct, because the originator could crash after the commit but before the merge, and in that case the transaction will not be applied on all the owners. The transaction manager will ignore any commit exception in {{NON_XA}}/{{useSynchronization}} mode, but at least in {{FULL_XA}}/{{NON_DURABLE_XA}} mode we can signal to the user that the transaction may be lost.
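The behaviour requested above can be sketched as follows. This is only an illustration under stated assumptions, not Infinispan's implementation: the `AvailabilityMode` and `Response` enums, `CommitFailedException`, and the `commit` method are hypothetical stand-ins defined locally for the example.

```java
import java.util.Arrays;
import java.util.List;

public class DegradedCommitSketch {
    // Hypothetical stand-ins for the cache availability and the owner responses.
    enum AvailabilityMode { AVAILABLE, DEGRADED_MODE }
    enum Response { OK, CACHE_NOT_FOUND }

    // Hypothetical exception used to signal that the transaction may be lost.
    static class CommitFailedException extends Exception {
        CommitFailedException(String message) { super(message); }
    }

    // Fails the commit when an owner did not apply it and the cache is degraded,
    // instead of only marking the transaction as partially completed.
    static void commit(AvailabilityMode mode, List<Response> responses) throws CommitFailedException {
        boolean ownerMissing = responses.contains(Response.CACHE_NOT_FOUND);
        if (ownerMissing && mode == AvailabilityMode.DEGRADED_MODE) {
            throw new CommitFailedException("Commit not applied on all owners while in degraded mode");
        }
    }

    // Convenience wrapper so callers can observe the outcome without a try/catch.
    static boolean commitSucceeds(AvailabilityMode mode, List<Response> responses) {
        try {
            commit(mode, responses);
            return true;
        } catch (CommitFailedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<Response> partial = Arrays.asList(Response.OK, Response.CACHE_NOT_FOUND);
        System.out.println(commitSucceeds(AvailabilityMode.AVAILABLE, partial));
        System.out.println(commitSucceeds(AvailabilityMode.DEGRADED_MODE, partial));
    }
}
```

Whether the caller ever sees such an exception still depends on the transaction mode, as the description notes.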
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-8615) ClusteredLockImplTest.testTryLockWithTimeoutAfterLockWithSmallTimeout random failures
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-8615?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-8615:
-------------------------------
Fix Version/s: 9.2.0.CR1 (was: 9.2.0.Beta2)
> ClusteredLockImplTest.testTryLockWithTimeoutAfterLockWithSmallTimeout random failures
> -------------------------------------------------------------------------------------
>
> Key: ISPN-8615
> URL: https://issues.jboss.org/browse/ISPN-8615
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.2.0.Beta1
> Reporter: Dan Berindei
> Assignee: Katia Aresti
> Labels: testsuite_stability
> Fix For: 9.2.0.CR1
>
>
> {noformat}
> java.lang.AssertionError:
> at org.infinispan.lock.impl.lock.ClusteredLockImplTest.testTryLockWithTimeoutAfterLockWithSmallTimeout(ClusteredLockImplTest.java:94)
> {noformat}
> It happens rarely in CI, but I can reproduce it every time if I change the timeout to 100 ms. IMO the difference between {{testTryLockWithTimeoutAfterLockWithSmallTimeout}} and {{testTryLockWithTimeoutAfterLockWithBigTimeout}} should be that the former waits for {{tryLock(smalltimeout, unit)}} to time out before unlocking, while the latter waits a short time before unlocking and checks that {{tryLock(bigtimeout, unit)}} still succeeds.
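The two test shapes proposed above might look roughly like this. It is a sketch only: a `java.util.concurrent.Semaphore` with one permit stands in for the clustered lock, and the helper names and timeout values are assumptions made for the example, not the actual test code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class TryLockTimeoutSketch {
    // Runs a timed lock attempt on another thread, mimicking an asynchronous tryLock.
    static CompletableFuture<Boolean> tryLockAsync(Semaphore lock, long timeout, TimeUnit unit) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return lock.tryAcquire(timeout, unit);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        });
    }

    // Small timeout: wait for the attempt to time out BEFORE unlocking,
    // so the result cannot depend on how fast the unlock happens.
    static boolean smallTimeoutCase() {
        Semaphore lock = new Semaphore(1);
        lock.acquireUninterruptibly();
        CompletableFuture<Boolean> attempt = tryLockAsync(lock, 50, TimeUnit.MILLISECONDS);
        boolean acquired = attempt.join();
        lock.release();
        return acquired;
    }

    // Big timeout: hold the lock for a short time, unlock, and expect the
    // pending attempt to succeed well within its timeout.
    static boolean bigTimeoutCase() {
        Semaphore lock = new Semaphore(1);
        lock.acquireUninterruptibly();
        CompletableFuture<Boolean> attempt = tryLockAsync(lock, 10, TimeUnit.SECONDS);
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        lock.release();
        return attempt.join();
    }

    public static void main(String[] args) {
        System.out.println("small timeout acquired: " + smallTimeoutCase());
        System.out.println("big timeout acquired: " + bigTimeoutCase());
    }
}
```

Structured this way, neither case races the unlock against the timeout, which is the source of the intermittent failure described above.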
--
[JBoss JIRA] (ISPN-8602) ExpirationSingleFileStoreDistListenerFunctionalTest random failures
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-8602?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-8602:
-------------------------------
Fix Version/s: 9.2.0.CR1 (was: 9.2.0.Beta2)
> ExpirationSingleFileStoreDistListenerFunctionalTest random failures
> -------------------------------------------------------------------
>
> Key: ISPN-8602
> URL: https://issues.jboss.org/browse/ISPN-8602
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.2.0.Beta1
> Reporter: Dan Berindei
> Assignee: William Burns
> Labels: testsuite_stability
> Fix For: 9.2.0.CR1
>
>
> Various {{ExpirationSingleFileStoreDistListenerFunctionalTest}} tests are failing in master:
> {noformat}
> [ERROR] testExpirationOfStoreWhenDataNotInMemory[null](org.infinispan.expiration.impl.ExpirationSingleFileStoreDistListenerFunctionalTest) Time elapsed: 0.078 s <<< FAILURE!
> java.lang.AssertionError: expected:<0> but was:<7>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
> at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252)
> at org.infinispan.expiration.impl.ExpirationStoreListenerFunctionalTest.testExpirationOfStoreWhenDataNotInMemory(ExpirationStoreListenerFunctionalTest.java:55)
> {noformat}
> {noformat}
> [ERROR] testSimpleExpirationLifespan[null](org.infinispan.expiration.impl.ExpirationSingleFileStoreDistListenerFunctionalTest) Time elapsed: 0.045 s <<< FAILURE!
> java.lang.AssertionError: expected:<0> but was:<1>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
> at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252)
> at org.infinispan.expiration.impl.ExpirationFunctionalTest.testSimpleExpirationLifespan(ExpirationFunctionalTest.java:78)
> at org.infinispan.expiration.impl.ExpirationStoreListenerFunctionalTest.testSimpleExpirationLifespan(ExpirationStoreListenerFunctionalTest.java:37)
> {noformat}
> {noformat}
> [ERROR] testSimpleExpirationMaxIdle[null](org.infinispan.expiration.impl.ExpirationSingleFileStoreDistListenerFunctionalTest) Time elapsed: 0 s <<< FAILURE!
> java.lang.AssertionError: expected:<0> but was:<1>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
> at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:245)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:252)
> at org.infinispan.expiration.impl.ExpirationFunctionalTest.testSimpleExpirationMaxIdle(ExpirationFunctionalTest.java:86)
> at org.infinispan.expiration.impl.ExpirationStoreListenerFunctionalTest.testSimpleExpirationMaxIdle(ExpirationStoreListenerFunctionalTest.java:44)
> {noformat}
--
[JBoss JIRA] (ISPN-8587) Coordinator crash in 2-node cluster can lead to invalid cache topology
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-8587?page=com.atlassian.jira.plugin.... ]
Ryan Emerson updated ISPN-8587:
-------------------------------
Fix Version/s: 9.2.0.CR1 (was: 9.2.0.Beta2)
> Coordinator crash in 2-node cluster can lead to invalid cache topology
> ----------------------------------------------------------------------
>
> Key: ISPN-8587
> URL: https://issues.jboss.org/browse/ISPN-8587
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.0.Beta1, 9.1.3.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.2.0.CR1, 9.1.4.Final
>
>
> After the coordinator changes, {{PreferAvailabilityStrategy}} first broadcasts a cache topology with the {{currentCH}} of the "maximum" topology. In the 2nd step it broadcasts a topology that removes all the topology members no longer in the cluster, and in the 3rd step it queues a rebalance with the remaining members.
> If the cluster had only 2 nodes, {{A}} (the coordinator) and {{B}}, and {{B}} had not finished joining the cache, the maximum topology has {{A}} as the only member. That means step 2 tries to remove all members, and in the process removes the cache topology from {{ClusterCacheStatus}}. When step 3 tries to rebalance with {{B}} as the only member, it re-initializes {{ClusterCacheStatus}} with topology id 1, and because {{LocalTopologyManager}} already has a higher topology id it will never confirm the rebalance.
> This sometimes happens in {{CacheManagerTest.testRestartReusingConfiguration}}. Like most other tests, it waits for the cache to finish joining before killing a node. But it only waits for the test cache, not for the {{CONFIG}} cache (which has {{awaitInitialTransfer(false)}}). Also, most of the time {{A}} either finishes the rebalance or re-initializes {{ClusterCacheStatus}} and sends a topology update with {{B}} as the only member before leaving. The test only fails if {{B}} doesn't receive or ignores one or more topology updates.
> {noformat}
> 10:37:50,674 INFO (remote-thread-Test-NodeA-p2265-t6:[]) [CLUSTER] ISPN000310: Starting cluster-wide rebalance for cache org.infinispan.CONFIG, topology CacheTopology{id=2, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-37820: 134, Test-NodeB-59687: 122]}, unionCH=null, phase=READ_OLD_WRITE_ALL, actualMembers=[Test-NodeA-37820, Test-NodeB-59687], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73, 96c95d15-440a-4dc7-915d-5d36ac4257bb]}
> 10:37:51,037 DEBUG (remote-thread-Test-NodeA-p2265-t6:[]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache org.infinispan.CONFIG, topology = CacheTopology{id=3, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-37820: 134, Test-NodeB-59687: 122]}, unionCH=null, phase=READ_ALL_WRITE_ALL, actualMembers=[Test-NodeA-37820, Test-NodeB-59687], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73, 96c95d15-440a-4dc7-915d-5d36ac4257bb]}, availability mode = AVAILABLE
> 10:37:51,097 DEBUG (remote-thread-Test-NodeA-p2265-t5:[]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache org.infinispan.CONFIG, topology = CacheTopology{id=4, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-37820: 134, Test-NodeB-59687: 122]}, unionCH=null, phase=READ_NEW_WRITE_ALL, actualMembers=[Test-NodeA-37820, Test-NodeB-59687], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73, 96c95d15-440a-4dc7-915d-5d36ac4257bb]}, availability mode = AVAILABLE
> 10:37:51,203 DEBUG (testng-Test:[]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache org.infinispan.CONFIG, topology = CacheTopology{id=5, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeB-59687: 256]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[Test-NodeB-59687], persistentUUIDs=[96c95d15-440a-4dc7-915d-5d36ac4257bb]}, availability mode = AVAILABLE
> 10:37:51,207 INFO (jgroups-7,Test-NodeB-59687:[]) [CLUSTER] ISPN000094: Received new cluster view for channel ISPN: [Test-NodeB-59687|2] (1) [Test-NodeB-59687]
> *** Here topology updates are ignored
> 10:37:51,340 DEBUG (transport-thread-Test-NodeB-p2311-t5:[Topology-org.infinispan.CONFIG]) [LocalTopologyManagerImpl] Ignoring topology 4 for cache org.infinispan.CONFIG from old coordinator Test-NodeA-37820
> 10:37:51,340 DEBUG (transport-thread-Test-NodeB-p2311-t5:[Topology-org.infinispan.CONFIG]) [LocalTopologyManagerImpl] Ignoring topology 5 for cache org.infinispan.CONFIG from old coordinator Test-NodeA-37820
> 10:37:51,340 DEBUG (transport-thread-Test-NodeB-p2311-t6:[Merge-2]) [ClusterCacheStatus] Recovered 1 partition(s) for cache org.infinispan.CONFIG: [CacheTopology{id=3, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeA-37820: 134, Test-NodeB-59687: 122]}, unionCH=null, phase=READ_ALL_WRITE_ALL, actualMembers=[Test-NodeA-37820, Test-NodeB-59687], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73, 96c95d15-440a-4dc7-915d-5d36ac4257bb]}]
> 10:37:51,340 DEBUG (transport-thread-Test-NodeB-p2311-t6:[Merge-2]) [ClusterCacheStatus] Updating topologies after merge for cache org.infinispan.CONFIG, current topology = CacheTopology{id=4, rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[Test-NodeA-37820, Test-NodeB-59687], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73, 96c95d15-440a-4dc7-915d-5d36ac4257bb]}, stable topology = CacheTopology{id=1, rebalanceId=1, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[Test-NodeA-37820], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73]}, availability mode = null, resolveConflicts = false
> 10:37:51,340 DEBUG (transport-thread-Test-NodeB-p2311-t6:[Merge-2]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache org.infinispan.CONFIG, topology = CacheTopology{id=4, rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[Test-NodeA-37820, Test-NodeB-59687], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73, 96c95d15-440a-4dc7-915d-5d36ac4257bb]}, availability mode = null
> 10:37:51,340 DEBUG (transport-thread-Test-NodeB-p2311-t6:[Merge-2]) [ClusterTopologyManagerImpl] Updating cluster-wide stable topology for cache org.infinispan.CONFIG, topology = CacheTopology{id=1, rebalanceId=1, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[Test-NodeA-37820], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73]}
> 10:37:51,340 FATAL (transport-thread-Test-NodeB-p2311-t6:[Merge-2]) [CLUSTER] [Context=org.infinispan.CONFIG]ISPN000313: Lost data because of abrupt leavers [Test-NodeA-37820]
> 10:37:51,340 DEBUG (transport-thread-Test-NodeB-p2311-t6:[Merge-2]) [ClusterCacheStatus] Queueing rebalance for cache org.infinispan.CONFIG with members [Test-NodeB-59687]
> 10:37:51,341 DEBUG (transport-thread-Test-NodeB-p2311-t6:[Topology-org.infinispan.CONFIG]) [LocalTopologyManagerImpl] Updating local topology for cache org.infinispan.CONFIG: CacheTopology{id=4, rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeA-37820: 256]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[Test-NodeA-37820, Test-NodeB-59687], persistentUUIDs=[d56ec014-ebb3-4be9-9ce2-91c2982ccb73, 96c95d15-440a-4dc7-915d-5d36ac4257bb]}
> *** The topology is re-initialized, without sending a topology update
> 10:37:51,378 DEBUG (transport-thread-Test-NodeB-p2311-t1:[Merge-2]) [ClusterCacheStatus] Queueing rebalance for cache ___defaultcache with members [Test-NodeB-59687]
> 10:37:51,547 INFO (jgroups-7,Test-NodeB-59687:[]) [CLUSTER] ISPN000094: Received new cluster view for channel ISPN: [Test-NodeB-59687|3] (2) [Test-NodeB-59687, Test-NodeA-12100]
> 10:37:51,962 DEBUG (testng-Test:[]) [LocalTopologyManagerImpl] Node Test-NodeA-12100 joining cache org.infinispan.CONFIG
> 10:37:51,964 DEBUG (remote-thread-Test-NodeB-p2309-t6:[]) [ClusterCacheStatus] Queueing rebalance for cache org.infinispan.CONFIG with members [Test-NodeB-59687, Test-NodeA-12100]
> *** Rebalance start is sent with wrong topology id
> 10:37:51,964 INFO (remote-thread-Test-NodeB-p2309-t6:[]) [CLUSTER] ISPN000310: Starting cluster-wide rebalance for cache org.infinispan.CONFIG, topology CacheTopology{id=2, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeB-59687: 256]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeB-59687: 129, Test-NodeA-12100: 127]}, unionCH=null, phase=READ_OLD_WRITE_ALL, actualMembers=[Test-NodeB-59687, Test-NodeA-12100], persistentUUIDs=[96c95d15-440a-4dc7-915d-5d36ac4257bb, 538b5324-cda9-49df-9786-7c6d6458332e]}
> 10:37:51,965 DEBUG (transport-thread-Test-NodeB-p2311-t4:[Topology-org.infinispan.CONFIG]) [LocalTopologyManagerImpl] Ignoring old rebalance for cache org.infinispan.CONFIG, current topology is 4: CacheTopology{id=2, rebalanceId=2, currentCH=ReplicatedConsistentHash{ns = 256, owners = (1)[Test-NodeB-59687: 256]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeB-59687: 129, Test-NodeA-12100: 127]}, unionCH=null, phase=READ_OLD_WRITE_ALL, actualMembers=[Test-NodeB-59687, Test-NodeA-12100], persistentUUIDs=[96c95d15-440a-4dc7-915d-5d36ac4257bb, 538b5324-cda9-49df-9786-7c6d6458332e]}
> {noformat}
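The failure mode in the description and log above can be reduced to a small model. This is a sketch under assumptions, not Infinispan code: `LocalTopology` and `ClusterStatus` are hypothetical classes, and the exact comparison used to reject old topologies is assumed to be "id not greater than the current one".

```java
public class TopologyIdSketch {
    // Hypothetical model of the node-local view: it only accepts newer topology ids.
    static class LocalTopology {
        private int currentId;

        LocalTopology(int currentId) { this.currentId = currentId; }

        // Returns true if the rebalance was accepted, false if it was ignored as old.
        boolean handleRebalance(int topologyId) {
            if (topologyId <= currentId) {
                return false;
            }
            currentId = topologyId;
            return true;
        }
    }

    // Hypothetical model of the coordinator-side status for one cache.
    static class ClusterStatus {
        private int topologyId;

        ClusterStatus(int topologyId) { this.topologyId = topologyId; }

        // Models losing the topology and starting again from id 1.
        void reinitialize() { topologyId = 1; }

        // Each rebalance start uses the next topology id.
        int startRebalance() { return ++topologyId; }
    }

    // Replays the sequence from the log: local view at id 4, status reset to 1,
    // rebalance started with id 2.
    static boolean rebalanceAcceptedAfterReinit() {
        LocalTopology local = new LocalTopology(4);
        ClusterStatus status = new ClusterStatus(4);
        status.reinitialize();
        int rebalanceId = status.startRebalance();
        return local.handleRebalance(rebalanceId);
    }

    public static void main(String[] args) {
        System.out.println("rebalance accepted: " + rebalanceAcceptedAfterReinit());
    }
}
```

In this model the rebalance with id 2 is rejected because the local view is already at id 4, which mirrors the "Ignoring old rebalance" line in the log.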
--