[infinispan-issues] [JBoss JIRA] (ISPN-2688) BaseDistributionInterceptor and TxDistributionInterceptor do not detect properly if the key needs to be fetched remotely

Adrian Nistor (JIRA) jira-events at lists.jboss.org
Wed Jan 23 11:27:47 EST 2013


     [ https://issues.jboss.org/browse/ISPN-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Nistor updated ISPN-2688:
--------------------------------

        Summary: BaseDistributionInterceptor and TxDistributionInterceptor do not detect properly if the key needs to be fetched remotely  (was: StateTransferLargeObjectTest.testForFailure still failing randomly)
    Description: 
BaseDistributionInterceptor.shouldFetchFromRemote() and TxDistributionInterceptor.remoteGetAndStoreInL1() can mistakenly decide not to fetch a key remotely because they check whether the key is present in the data container. If state transfer is in progress for the key, it may be present _now_ even though it was absent when the local lookup ran. In that case the value should be fetched remotely instead of returning the null result of the local lookup.

This makes StateTransferLargeObjectTest.testForFailure fail randomly.
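
The race can be sketched outside Infinispan as follows. This is a minimal illustration, not the actual interceptor code: the class RemoteFetchDecision, its methods and the stateTransferInProgress parameter are all hypothetical names, and a plain map stands in for the data container. It shows how a presence check that runs after the local lookup gives the wrong answer once a state transfer put lands in between.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the interceptor decision; not Infinispan code.
final class RemoteFetchDecision {
    // A plain map plays the role of the local data container.
    private final Map<String, String> dataContainer = new ConcurrentHashMap<>();

    // Simulates a state transfer put being applied on this node.
    void applyStateTransfer(String key, String value) {
        dataContainer.put(key, value);
    }

    // Simulates the local lookup done by the get command.
    String localGet(String key) {
        return dataContainer.get(key);
    }

    // Flawed check: re-reads the container after the local lookup has
    // already run, so a put that arrived in between hides the miss.
    boolean shouldFetchRemoteFlawed(String key, String localResult) {
        return localResult == null && !dataContainer.containsKey(key);
    }

    // One possible alternative (an assumption of this sketch): decide only
    // from what the local lookup observed plus the state transfer status.
    boolean shouldFetchRemote(String localResult, boolean stateTransferInProgress) {
        return localResult == null && stateTransferInProgress;
    }
}

final class RemoteFetchDemo {
    public static void main(String[] args) {
        RemoteFetchDecision node = new RemoteFetchDecision();

        String local = node.localGet("10");        // miss: entry not transferred yet
        node.applyStateTransfer("10", "big");      // the put lands after the lookup

        // The flawed check now sees the key and skips the remote get.
        System.out.println("flawed: " + node.shouldFetchRemoteFlawed("10", local));
        // The alternative still asks for a remote get.
        System.out.println("safer:  " + node.shouldFetchRemote(local, true));
    }
}
```

With this ordering the flawed check returns false, so the caller would return the null from the local lookup, which matches the sequence of "Entry not found" followed by "Not doing a remote get" in the log below.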

The failure occurs because state transfer has not finished, yet the distribution interceptor does not fetch the key remotely:

{noformat}
14:06:45,872 TRACE (asyncTransportThread-1,NodeA:___defaultcache) [StateTransferManagerImpl] Installing new cache topology CacheTopology{id=7, currentCH=DefaultConsistentHash{numSegments=60, numOwners=3, members=[NodeA-52814, NodeB-62397, NodeC-63995], owners={0: 0 1 2, 1: 0 1 2, 2: 0 1 2, 3: 0 1 2, 4: 0 1 2, 5: 0 1 2, 6: 0 1 2, 7: 0 1 2, 8: 0 1 2, 9: 0 1 2, 10: 0 1 2, 11: 0 1 2, 12: 0 2, 13: 0 2, 14: 0 2, 15: 0 1, 16: 0 1, 17: 0 1, 18: 0 1, 19: 0 1, 20: 2 0, 21: 2 0, 22: 2 0, 23: 2 0, 24: 2 0, 25: 2 0, 26: 2 0, 27: 2 0, 28: 2 0, 29: 2 0, 30: 1 0 2, 31: 1 0 2, 32: 1 0 2, 33: 1 2, 34: 1 0, 35: 1 2, 36: 1 0, 37: 1 2, 38: 1 0, 39: 1 2, 40: 1 0, 41: 1 2, 42: 1 0, 43: 1 2, 44: 1 0, 45: 1 0, 46: 1 0, 47: 1 0, 48: 1 0, 49: 1 2, 50: 2 1, 51: 2 1, 52: 2 1, 53: 2 1, 54: 2 1, 55: 2 1, 56: 2 1, 57: 2 1, 58: 2 0, 59: 2 0}, pendingCH=DefaultConsistentHash{numSegments=60, numOwners=3, members=[NodeA-52814, NodeB-62397, NodeC-63995], owners={0: 0 1 2, 1: 0 1 2, 2: 0 1 2, 3: 0 1 2, 4: 0 1 2, 5: 0 1 2, 6: 0 1 2, 7: 0 1 2, 8: 0 1 2, 9: 0 1 2, 10: 0 1 2, 11: 0 1 2, 12: 0 2 1, 13: 0 2 1, 14: 0 2 1, 15: 0 1 2, 16: 0 1 2, 17: 0 1 2, 18: 0 1 2, 19: 0 1 2, 20: 2 0 1, 21: 2 0 1, 22: 2 0 1, 23: 2 0 1, 24: 2 0 1, 25: 2 0 1, 26: 2 0 1, 27: 2 0 1, 28: 2 0 1, 29: 2 0 1, 30: 1 0 2, 31: 1 0 2, 32: 1 0 2, 33: 1 2 0, 34: 1 0 2, 35: 1 2 0, 36: 1 0 2, 37: 1 2 0, 38: 1 0 2, 39: 1 2 0, 40: 1 0 2, 41: 1 2 0, 42: 1 0 2, 43: 1 2 0, 44: 1 0 2, 45: 1 0 2, 46: 1 0 2, 47: 1 0 2, 48: 1 0 2, 49: 1 2 0, 50: 2 1 0, 51: 2 1 0, 52: 2 1 0, 53: 2 1 0, 54: 2 1 0, 55: 2 1 0, 56: 2 1 0, 57: 2 1 0, 58: 2 0 1, 59: 2 0 1}} on cache ___defaultcache
14:06:46,287 TRACE (OOB-9,ISPN,NodeA-52814:___defaultcache) [StateConsumerImpl] Received keys [0, 2, 10, 94, 103, 117, 187, 189, 288, 305, 307, 376, 481, 487, 502, 729, 771, 994] for segment 50 of cache ___defaultcache from node NodeB-62397
14:06:46,338 INFO  (testng-StateTransferLargeObjectTest:) [StateTransferLargeObjectTest] ----Running a get on 10
14:06:46,351 TRACE (OOB-9,ISPN,NodeA-52814:___defaultcache ___defaultcache) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=10, value=org.infinispan.statetransfer.BigObject@6ed3126d, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, lifespanMillis=-1, maxIdleTimeMillis=-1, successful=true} and InvocationContext [org.infinispan.context.impl.LocalTxInvocationContext@2c6dd013]
14:06:46,358 TRACE (testng-StateTransferLargeObjectTest:___defaultcache) [GetKeyValueCommand] Entry not found
14:06:46,358 TRACE (testng-StateTransferLargeObjectTest:___defaultcache) [BaseDistributionInterceptor] Not doing a remote get for key 10 since entry is mapped to current node (NodeA-52814), or is in L1.  Owners are [NodeC-63995, NodeB-62397, NodeA-52814]
14:06:46,359 ERROR (testng-StateTransferLargeObjectTest:) [UnitTestTestNGListener] Test testForFailure(org.infinispan.statetransfer.StateTransferLargeObjectTest) failed.
java.lang.AssertionError: expected object to not be null
	at org.testng.Assert.fail(Assert.java:89)
	at org.testng.Assert.assertNotNull(Assert.java:399)
	at org.testng.Assert.assertNotNull(Assert.java:384)
	at org.infinispan.statetransfer.StateTransferLargeObjectTest.assertValue(StateTransferLargeObjectTest.java:145)
	at org.infinispan.statetransfer.StateTransferLargeObjectTest.testForFailure(StateTransferLargeObjectTest.java:115)
{noformat}
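
The topology in the first log line suggests why NodeA has no local value yet. For segment 50, which the StateConsumerImpl line shows contains key 10, currentCH lists owners 2 1 and pendingCH lists 2 1 0, so NodeA (member index 0) is an owner only in the pending hash. The sketch below makes that lookup explicit. The class and method names are hypothetical, only segment 50 is copied from the log, and treating "owner in pendingCH but not in currentCH" as "still receiving state" is an assumption of the sketch rather than a statement about Infinispan internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the data is copied from the CacheTopology{id=7} log line.
final class SegmentOwnership {
    // members=[NodeA-52814, NodeB-62397, NodeC-63995], indexed 0..2
    static final List<String> MEMBERS =
            List.of("NodeA-52814", "NodeB-62397", "NodeC-63995");

    // Only segment 50 is reproduced here: "50: 2 1" and "50: 2 1 0".
    static final Map<Integer, List<Integer>> CURRENT = Map.of(50, List.of(2, 1));
    static final Map<Integer, List<Integer>> PENDING = Map.of(50, List.of(2, 1, 0));

    // Resolves the member indexes of a segment to node names.
    static List<String> owners(Map<Integer, List<Integer>> hash, int segment) {
        List<String> result = new ArrayList<>();
        for (int index : hash.get(segment)) {
            result.add(MEMBERS.get(index));
        }
        return result;
    }

    // Assumption: a node that owns the segment only in the pending hash
    // has not necessarily received all entries for it yet.
    static boolean isStillReceiving(String node, int segment) {
        return owners(PENDING, segment).contains(node)
                && !owners(CURRENT, segment).contains(node);
    }
}

final class SegmentOwnershipDemo {
    public static void main(String[] args) {
        System.out.println("current owners: " + SegmentOwnership.owners(SegmentOwnership.CURRENT, 50));
        System.out.println("pending owners: " + SegmentOwnership.owners(SegmentOwnership.PENDING, 50));
        System.out.println("NodeA still receiving: "
                + SegmentOwnership.isStillReceiving("NodeA-52814", 50));
    }
}
```

The pending owners resolve to NodeC-63995, NodeB-62397 and NodeA-52814, the same list the BaseDistributionInterceptor message reports, while the current owners do not include NodeA.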

  was:
The failure appears because the state transfer has not finished, yet the distribution interceptor doesn't go remotely for the key:

{noformat}
14:06:45,872 TRACE (asyncTransportThread-1,NodeA:___defaultcache) [StateTransferManagerImpl] Installing new cache topology CacheTopology{id=7, currentCH=DefaultConsistentHash{numSegments=60, numOwners=3, members=[NodeA-52814, NodeB-62397, NodeC-63995], owners={0: 0 1 2, 1: 0 1 2, 2: 0 1 2, 3: 0 1 2, 4: 0 1 2, 5: 0 1 2, 6: 0 1 2, 7: 0 1 2, 8: 0 1 2, 9: 0 1 2, 10: 0 1 2, 11: 0 1 2, 12: 0 2, 13: 0 2, 14: 0 2, 15: 0 1, 16: 0 1, 17: 0 1, 18: 0 1, 19: 0 1, 20: 2 0, 21: 2 0, 22: 2 0, 23: 2 0, 24: 2 0, 25: 2 0, 26: 2 0, 27: 2 0, 28: 2 0, 29: 2 0, 30: 1 0 2, 31: 1 0 2, 32: 1 0 2, 33: 1 2, 34: 1 0, 35: 1 2, 36: 1 0, 37: 1 2, 38: 1 0, 39: 1 2, 40: 1 0, 41: 1 2, 42: 1 0, 43: 1 2, 44: 1 0, 45: 1 0, 46: 1 0, 47: 1 0, 48: 1 0, 49: 1 2, 50: 2 1, 51: 2 1, 52: 2 1, 53: 2 1, 54: 2 1, 55: 2 1, 56: 2 1, 57: 2 1, 58: 2 0, 59: 2 0}, pendingCH=DefaultConsistentHash{numSegments=60, numOwners=3, members=[NodeA-52814, NodeB-62397, NodeC-63995], owners={0: 0 1 2, 1: 0 1 2, 2: 0 1 2, 3: 0 1 2, 4: 0 1 2, 5: 0 1 2, 6: 0 1 2, 7: 0 1 2, 8: 0 1 2, 9: 0 1 2, 10: 0 1 2, 11: 0 1 2, 12: 0 2 1, 13: 0 2 1, 14: 0 2 1, 15: 0 1 2, 16: 0 1 2, 17: 0 1 2, 18: 0 1 2, 19: 0 1 2, 20: 2 0 1, 21: 2 0 1, 22: 2 0 1, 23: 2 0 1, 24: 2 0 1, 25: 2 0 1, 26: 2 0 1, 27: 2 0 1, 28: 2 0 1, 29: 2 0 1, 30: 1 0 2, 31: 1 0 2, 32: 1 0 2, 33: 1 2 0, 34: 1 0 2, 35: 1 2 0, 36: 1 0 2, 37: 1 2 0, 38: 1 0 2, 39: 1 2 0, 40: 1 0 2, 41: 1 2 0, 42: 1 0 2, 43: 1 2 0, 44: 1 0 2, 45: 1 0 2, 46: 1 0 2, 47: 1 0 2, 48: 1 0 2, 49: 1 2 0, 50: 2 1 0, 51: 2 1 0, 52: 2 1 0, 53: 2 1 0, 54: 2 1 0, 55: 2 1 0, 56: 2 1 0, 57: 2 1 0, 58: 2 0 1, 59: 2 0 1}} on cache ___defaultcache
14:06:46,287 TRACE (OOB-9,ISPN,NodeA-52814:___defaultcache) [StateConsumerImpl] Received keys [0, 2, 10, 94, 103, 117, 187, 189, 288, 305, 307, 376, 481, 487, 502, 729, 771, 994] for segment 50 of cache ___defaultcache from node NodeB-62397
14:06:46,338 INFO  (testng-StateTransferLargeObjectTest:) [StateTransferLargeObjectTest] ----Running a get on 10
14:06:46,351 TRACE (OOB-9,ISPN,NodeA-52814:___defaultcache ___defaultcache) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=10, value=org.infinispan.statetransfer.BigObject@6ed3126d, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, lifespanMillis=-1, maxIdleTimeMillis=-1, successful=true} and InvocationContext [org.infinispan.context.impl.LocalTxInvocationContext@2c6dd013]
14:06:46,358 TRACE (testng-StateTransferLargeObjectTest:___defaultcache) [GetKeyValueCommand] Entry not found
14:06:46,358 TRACE (testng-StateTransferLargeObjectTest:___defaultcache) [BaseDistributionInterceptor] Not doing a remote get for key 10 since entry is mapped to current node (NodeA-52814), or is in L1.  Owners are [NodeC-63995, NodeB-62397, NodeA-52814]
14:06:46,359 ERROR (testng-StateTransferLargeObjectTest:) [UnitTestTestNGListener] Test testForFailure(org.infinispan.statetransfer.StateTransferLargeObjectTest) failed.
java.lang.AssertionError: expected object to not be null
	at org.testng.Assert.fail(Assert.java:89)
	at org.testng.Assert.assertNotNull(Assert.java:399)
	at org.testng.Assert.assertNotNull(Assert.java:384)
	at org.infinispan.statetransfer.StateTransferLargeObjectTest.assertValue(StateTransferLargeObjectTest.java:145)
	at org.infinispan.statetransfer.StateTransferLargeObjectTest.testForFailure(StateTransferLargeObjectTest.java:115)
{noformat}


    
> BaseDistributionInterceptor and TxDistributionInterceptor do not detect properly if the key needs to be fetched remotely
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ISPN-2688
>                 URL: https://issues.jboss.org/browse/ISPN-2688
>             Project: Infinispan
>          Issue Type: Bug
>          Components: State transfer
>    Affects Versions: 5.2.0.Beta6
>            Reporter: Dan Berindei
>            Assignee: Mircea Markus
>            Priority: Critical
>             Fix For: 5.2.0.Final
>
>         Attachments: stlot.log.gz
>
