[infinispan-issues] [JBoss JIRA] (ISPN-7070) CacheNotifierImplInitialTransferDistTest.testIterationBeganAndSegmentNotComplete random failures
William Burns (JIRA)
issues at jboss.org
Wed Nov 15 17:45:00 EST 2017
[ https://issues.jboss.org/browse/ISPN-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490660#comment-13490660 ]
William Burns commented on ISPN-7070:
-------------------------------------
So the log shows that the REMOVE event was sent back to the node where the listener lives, but for some reason it wasn't notified....
{code}
12:54:48,687 TRACE (remote-thread-Test-NodeA-p49731-t6:[]) [ClusterEventCallable] Received cluster event(s) [ClusterEvent {type=CACHE_ENTRY_REMOVED, cache=Cache 'DistInitialTransferListener'@Test-NodeA-2391, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, retryCommand=false, origin=Test-NodeB-8079}], notifying cluster listener with id a9ff1f8a-10de-4eab-a545-93da4e734e1a
12:54:48,688 TRACE (remote-thread-Test-NodeA-p49731-t6:[]) [JGroupsTransport] Test-NodeA-2391 sending response for request 9 to Test-NodeB-8079: SuccessfulResponse(null)
{code}
This shows that the remove command was enqueued into the listener as it should have been.
Also looking at the trace I see that a large subset of segments were not completed before the test failed. I am not quite sure how this was possible though as the segments should have been all completed. This could be a timing issue with the test or a bug with the segment notifications. I will have to look further.
Also I found a bug with the test itself. This test is designed to have the operation performed after the listener has been notified of the given entry. But somehow the operation can be fired before this notification and thus only providing 10 events instead of 12. In my testing this failure occurs much more frequently and shouldn't be that hard to fix.
> CacheNotifierImplInitialTransferDistTest.testIterationBeganAndSegmentNotComplete random failures
> ------------------------------------------------------------------------------------------------
>
> Key: ISPN-7070
> URL: https://issues.jboss.org/browse/ISPN-7070
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 9.0.0.Alpha4
> Reporter: Dan Berindei
> Assignee: William Burns
> Priority: Critical
> Labels: testsuite_stability
> Fix For: 9.0.0.CR4
>
> Attachments: CacheNotifierImplInitialTransferDistTest_ISPN-5467_CompletableFuture-like_API_20161003.log, CacheNotifierImplInitialTransferDistTest_ISPN-7919_RpcManager_ResponseCollector_20171107.log.gz
>
>
> Looks like sometimes the originator listener ignores the CACHE_ENTRY_REMOVED notification:
> {noformat}
> 09:58:57,877 INFO (testng-Test:[]) [TestSuiteProgress] Test starting: org.infinispan.notifications.cachelistener.CacheNotifierImplInitialTransferDistTest.testRemoveAfterIterationBeganAndSegmentNotCompleteValueNonOwnerClustered
> 09:58:58,913 DEBUG (testng-Test:[]) [Test] Found key key-to-change-0 with primary owner != Test-NodeA-7051, segment 60
> 09:58:58,930 TRACE (remote-thread-Test-NodeA-p23726-t6:[]) [ReadCommittedEntry] Updating entry (key=key-to-change-0 removed=false valid=true changed=true created=true value=initial-value metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, providedMetadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null})
> 09:58:58,972 TRACE (remote-thread-Test-NodeC-p23887-t6:[]) [ReadCommittedEntry] Updating entry (key=key-to-change-0 removed=false valid=true changed=true created=true value=initial-value metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, providedMetadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null})
> 09:58:59,014 TRACE (ForkThread-1,Test:[]) [Test] Adding segment listener 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1 : org.infinispan.notifications.cachelistener.DistributedQueueingSegmentListener at 68007f8
> 09:58:59,014 TRACE (ForkThread-1,Test:[]) [CacheNotifierImpl] Replicating cluster listener to other nodes [Test-NodeA-7051, Test-NodeB-10318, Test-NodeC-52778] for cluster listener with id 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1
> 09:58:58,998 TRACE (testng-Test:[]) [CheckPoint] Waiting for event pre_complete_segment_invoked * 1
> 09:58:59,063 TRACE (remote-thread-Test-NodeC-p23887-t6:[]) [ClusterListenerReplicateCallable] Registered local cluster listener for remote cluster listener from origin Test-NodeA-7051 with id 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1
> 09:58:59,098 TRACE (remote-thread-Test-NodeB-p23775-t6:[]) [ClusterListenerReplicateCallable] Registered local cluster listener for remote cluster listener from origin Test-NodeA-7051 with id 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1
> 09:58:59,145 TRACE (ForkThread-1,Test:[]) [CacheNotifierImpl] Listener 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1 requests initial state for cache
> 09:58:59,373 TRACE (ForkThread-1,Test:[]) [Test] Completed segments [48, 2, 229, 118, 71, 72, 233, 60]
> 09:58:59,373 TRACE (ForkThread-1,Test:[]) [Test] Listener received event EventImpl{type=CACHE_ENTRY_CREATED, pre=false, cache=Cache 'DistInitialTransferListener'@Test-NodeA-7051, key=key-to-change-0, value=initial-value, oldValue=null, transaction=null, originLocal=true, transactionSuccessful=false, entries=null, created=false}
> 09:58:59,373 TRACE (ForkThread-1,Test:[]) [Test] Notified for key key-to-change-0
> 09:58:59,373 TRACE (ForkThread-1,Test:[]) [CheckPoint] Triggering event pre_complete_segment_invoked * 1 (available = 1, total = 1)
> 09:58:59,373 TRACE (ForkThread-1,Test:[]) [CheckPoint] Waiting for event pre_complete_segment_released * 1
> 09:58:59,373 TRACE (testng-Test:[]) [CheckPoint] Received event pre_complete_segment_invoked * 1 (available = 0, total = 1)
> 09:58:59,577 TRACE (remote-thread-Test-NodeA-p23726-t6:[]) [ReadCommittedEntry] Updating entry (key=key-to-change-0 removed=true valid=false changed=true created=false value=initial-value metadata=EmbeddedMetadata{version=null}, providedMetadata=null)
> 09:58:59,661 TRACE (remote-thread-Test-NodeC-p23887-t5:[]) [ReadCommittedEntry] Updating entry (key=key-to-change-0 removed=true valid=false changed=true created=false value=initial-value metadata=EmbeddedMetadata{version=null}, providedMetadata=null)
> 09:58:59,661 TRACE (remote-thread-Test-NodeC-p23887-t5:[]) [RequestCorrelator] Test-NodeC-52778: invoking unicast RPC [req-id=91126] on Test-NodeA-7051
> 09:58:59,684 TRACE (OOB-2,Test-NodeA-7051:[]) [RequestCorrelator] calling (org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher) with request 91126
> 09:58:59,779 TRACE (remote-thread-Test-NodeA-p23726-t6:[]) [NonTotalOrderPerCacheInboundInvocationHandler] Calling perform() on DistributedExecuteCommand [cache=Cache 'DistInitialTransferListener'@Test-NodeA-7051, keys=[], callable=org.infinispan.notifications.cachelistener.cluster.ClusterEventCallable at 640b0234]
> 09:58:59,780 TRACE (remote-thread-Test-NodeA-p23726-t6:[]) [ClusterEventCallable] Received cluster event(s) [ClusterEvent {type=CACHE_ENTRY_REMOVED, cache=Cache 'DistInitialTransferListener'@Test-NodeA-7051, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, retryCommand=false, origin=Test-NodeC-52778}], notifying cluster listener with id 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1
> 09:58:59,837 TRACE (testng-Test:[]) [CheckPoint] Triggering event pre_complete_segment_released * 999999999 (available = 999999999, total = 999999999)
> 09:58:59,837 TRACE (ForkThread-1,Test:[]) [CheckPoint] Received event pre_complete_segment_released * 1 (available = 999999998, total = 999999999)
> 09:58:59,898 TRACE (ForkThread-1,Test:[]) [CacheNotifierImpl] Listener 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1 initial state for cache completed
> 09:58:59,916 TRACE (remote-thread-Test-NodeB-p23775-t6:[]) [ClusterListenerRemoveCallable] Removing local cluster listener due to parent cluster listener was removed : 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1
> 09:58:59,921 TRACE (remote-thread-Test-NodeC-p23887-t5:[]) [ClusterListenerRemoveCallable] Removing local cluster listener due to parent cluster listener was removed : 2a41ec3f-c79f-4b7d-bd0e-8754170d8fc1
> 09:58:59,943 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.notifications.cachelistener.CacheNotifierImplInitialTransferDistTest.testRemoveAfterIterationBeganAndSegmentNotCompleteValueNonOwnerClustered
> java.lang.AssertionError: expected [12] but found [11]
> {noformat}
> {{CacheNotifierImplInitialTransferDistTest}} also had a different problem: it replaced the {{CacheNotifier}} component with a clone before adding the listener, but replaced it again with the original before removing the listener. Because the listener's uuid was only registered in the clone, {{removeListener}} couldn't find the uuid and disn't try to remove the listener on the other nodes. This meant the listener was leaked, resulting in logs like this:
> {noformat}
> 20:12:04,098 TRACE (remote-thread-Test-NodeB-p24127-t6:[]) [ReadCommittedEntry] Updating entry (key=key-to-change-0 removed=true valid=false changed=true created=false value=initial-value metadata=EmbeddedMetadata{version=null}, providedMetadata=null)
> 20:12:04,099 TRACE (remote-thread-Test-NodeB-p24127-t6:[]) [RemoteClusterListener] Passing Event to manager EventImpl{type=CACHE_ENTRY_REMOVED, pre=false, cache=Cache 'DistInitialTransferListener'@Test-NodeB-56142, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, originLocal=false, transactionSuccessful=false, entries=null, created=false} to send to Test-NodeA-38047
> 20:12:04,100 TRACE (remote-thread-Test-NodeB-p24127-t6:[]) [RemoteClusterListener] Passing Event to manager EventImpl{type=CACHE_ENTRY_REMOVED, pre=false, cache=Cache 'DistInitialTransferListener'@Test-NodeB-56142, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, originLocal=false, transactionSuccessful=false, entries=null, created=false} to send to Test-NodeA-38047
> 20:12:04,100 TRACE (remote-thread-Test-NodeB-p24127-t6:[]) [RemoteClusterListener] Passing Event to manager EventImpl{type=CACHE_ENTRY_REMOVED, pre=false, cache=Cache 'DistInitialTransferListener'@Test-NodeB-56142, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, originLocal=false, transactionSuccessful=false, entries=null, created=false} to send to Test-NodeA-38047
> 20:12:04,100 TRACE (remote-thread-Test-NodeB-p24127-t6:[]) [RemoteClusterListener] Passing Event to manager EventImpl{type=CACHE_ENTRY_REMOVED, pre=false, cache=Cache 'DistInitialTransferListener'@Test-NodeB-56142, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, originLocal=false, transactionSuccessful=false, entries=null, created=false} to send to Test-NodeA-38047
> 20:12:04,100 TRACE (remote-thread-Test-NodeB-p24127-t6:[]) [RemoteClusterListener] Passing Event to manager EventImpl{type=CACHE_ENTRY_REMOVED, pre=false, cache=Cache 'DistInitialTransferListener'@Test-NodeB-56142, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, originLocal=false, transactionSuccessful=false, entries=null, created=false} to send to Test-NodeA-38047
> 20:12:04,106 TRACE (remote-thread-Test-NodeA-p24103-t6:[]) [MultiClusterEventCallable] Received multiple cluster event(s) {86f77630-b1bb-4a21-97bb-b03efeae39d6=[ClusterEvent {type=CACHE_ENTRY_REMOVED, cache=Cache 'DistInitialTransferListener'@Test-NodeA-38047, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, retryCommand=false, origin=Test-NodeB-56142}], 91d05a50-4a35-4e37-8c5c-bffbabc6154b=[ClusterEvent {type=CACHE_ENTRY_REMOVED, cache=Cache 'DistInitialTransferListener'@Test-NodeA-38047, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, retryCommand=false, origin=Test-NodeB-56142}], ba09480f-c698-431b-8fe6-213f1bdefc7b=[ClusterEvent {type=CACHE_ENTRY_REMOVED, cache=Cache 'DistInitialTransferListener'@Test-NodeA-38047, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, retryCommand=false, origin=Test-NodeB-56142}], 992da014-0516-4f2a-8093-50ba3b1f358a=[ClusterEvent {type=CACHE_ENTRY_REMOVED, cache=Cache 'DistInitialTransferListener'@Test-NodeA-38047, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, retryCommand=false, origin=Test-NodeB-56142}], b4eeb9b3-85c4-4ff9-865e-076b7cf93faa=[ClusterEvent {type=CACHE_ENTRY_REMOVED, cache=Cache 'DistInitialTransferListener'@Test-NodeA-38047, key=key-to-change-0, value=null, oldValue=initial-value, transaction=null, retryCommand=false, origin=Test-NodeB-56142}]}
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
More information about the infinispan-issues
mailing list