[infinispan-issues] [JBoss JIRA] (ISPN-3842) Inconsistent L1 in non-tx distributed cache in certain circumstances

William Burns (JIRA) issues at jboss.org
Wed Dec 18 09:05:32 EST 2013


    [ https://issues.jboss.org/browse/ISPN-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932371#comment-12932371 ] 

William Burns commented on ISPN-3842:
-------------------------------------

Okay, looking a bit closer, I think I know what the issue is here.  The InvalidateL1Command comes in and finds no value in the data container, so it doesn't store anything in the invocation context.  Then the get completes and commits the data to the container.  The InvalidateL1Command then checks the data container, finds the value, and delegates the call to the RemoveCommand, which it extends.  Unfortunately, the RemoveCommand only marks entries for removal if they are in the context.  So in this case the InvalidateL1Command should wrap the newly found value from the data container and put it in the context so that the RemoveCommand can properly remove it.

I will write up a test which should reproduce this consistently and fix this up.
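
To make the interleaving described above concrete, here is a minimal toy model of it in plain Java. It is only a sketch: ToyNode, ToyContext, wrapIfPresent, removeViaContext and the rewrap flag are hypothetical names made up for illustration, not Infinispan classes or APIs, and the three racing steps are replayed sequentially on one thread rather than with real concurrency.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the race; none of these names are real Infinispan classes. */
public class L1InvalidationSketch {

    /** Hypothetical stand-in for node A's data container (its L1 store). */
    static final class ToyNode {
        final Map<String, String> dataContainer = new HashMap<>();
    }

    /** Hypothetical stand-in for the invocation context of one command. */
    static final class ToyContext {
        final Map<String, String> lookedUpEntries = new HashMap<>();
    }

    /** The invalidation arrives: the entry is put in the context only if it already exists. */
    static void wrapIfPresent(ToyNode node, ToyContext ctx, String key) {
        String value = node.dataContainer.get(key);
        if (value != null) {
            ctx.lookedUpEntries.put(key, value);
        }
    }

    /** Remove-style logic: only entries present in the context are removed. */
    static boolean removeViaContext(ToyNode node, ToyContext ctx, String key) {
        if (!ctx.lookedUpEntries.containsKey(key)) {
            return false; // corresponds to the "Nothing to remove" case
        }
        node.dataContainer.remove(key);
        return true;
    }

    /** The invalidation proceeds; with rewrap=true it wraps the newly found value first. */
    static boolean invalidate(ToyNode node, ToyContext ctx, String key, boolean rewrap) {
        if (!node.dataContainer.containsKey(key)) {
            return false; // nothing in L1, nothing to invalidate
        }
        if (rewrap) {
            // Assumed shape of the fix: put the newly found value into the context.
            ctx.lookedUpEntries.putIfAbsent(key, node.dataContainer.get(key));
        }
        return removeViaContext(node, ctx, key);
    }

    /** Replays the three steps in order and reports whether K is still in L1 afterwards. */
    static boolean staleAfterRace(boolean rewrap) {
        ToyNode nodeA = new ToyNode();
        ToyContext ctx = new ToyContext();
        wrapIfPresent(nodeA, ctx, "K");        // 1. invalidation arrives, L1 is still empty
        nodeA.dataContainer.put("K", "old");   // 2. the pending get commits its value to L1
        invalidate(nodeA, ctx, "K", rewrap);   // 3. invalidation finds the value and delegates
        return nodeA.dataContainer.containsKey("K");
    }

    public static void main(String[] args) {
        System.out.println("stale without re-wrap: " + staleAfterRace(false)); // prints true
        System.out.println("stale with re-wrap:    " + staleAfterRace(true));  // prints false
    }
}
```

Without the re-wrap the committed value survives the invalidation, which is the stale read the reporter describes; with it, the entry is removed. The real fix may of course look different in the actual command and interceptor code.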
                
> Inconsistent L1 in non-tx distributed cache in certain circumstances
> --------------------------------------------------------------------
>
>                 Key: ISPN-3842
>                 URL: https://issues.jboss.org/browse/ISPN-3842
>             Project: Infinispan
>          Issue Type: Bug
>    Affects Versions: 6.0.0.Final
>            Reporter: Mikolaj Gierulski
>            Assignee: William Burns
>
> In my PoC environment there are two nodes in a dist, non-tx, sync cluster with L1 enabled and numOwners=1.
> Node A reads one key (K) in a loop; K is stored on node B (in the test case, A performs about 1 000 000 reads per second).
> From time to time K is updated on node B. This causes an L1 invalidation message to be sent to A, and K is fetched from B on the next read attempt.
> But whenever I run my test, I reach a situation where updates of K no longer invalidate it on A, and A sees the old value of K.
> When this happens, I can see the following in the logs of node A:
> {noformat}
> 18:21:33.296 [remote-thread-0] TRACE o.i.i.d.L1NonTxInterceptor - L1 invalidation found a pending update for key K - need to block until finished
> 18:21:33.296 [remote-thread-0] TRACE o.i.i.d.L1NonTxInterceptor - Pending L1 update completed successfully: true - L1 invalidation can occur for key K
> 18:21:33.296 [remote-thread-0] TRACE o.i.c.write.InvalidateL1Command - Preparing to invalidate keys [K]
> 18:21:33.296 [remote-thread-0] TRACE o.i.c.write.InvalidateL1Command - Invalidating key K.
> 18:21:33.296 [remote-thread-0] TRACE o.i.commands.write.RemoveCommand - Nothing to remove since the entry is null or we have a null entry
> {noformat}
> While logs of node B show:
> {noformat}
> 18:21:33.200 [OOB-1,AGST-2012000591-33853] TRACE o.i.distribution.L1ManagerImpl - Registering requestor AGST-2012000591-25400 for key 'K'
> 18:21:33.266 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - Invalidating keys [K] on nodes [AGST-2012000591-25400]. Use multicast? false
> 18:21:33.269 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Allowing entry to commit as local node is owner
> 18:21:33.269 [OOB-2,AGST-2012000591-33853] TRACE o.i.distribution.L1ManagerImpl - Registering requestor AGST-2012000591-25400 for key 'K'
> 18:21:33.269 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Sending additional invalidation for requestors if necessary.
> 18:21:33.269 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - Invalidating keys [K] on nodes [AGST-2012000591-25400]. Use multicast? false
> 18:21:33.270 [remote-thread-1] INFO  p.c.a.ispn.WriteTask - Update task runtime millis 3
> 18:21:33.271 [OOB-1,AGST-2012000591-33853] TRACE o.i.distribution.L1ManagerImpl - Registering requestor AGST-2012000591-25400 for key 'K'
> 18:21:33.293 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - Invalidating keys [K] on nodes [AGST-2012000591-25400]. Use multicast? false
> 18:21:33.295 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Allowing entry to commit as local node is owner
> 18:21:33.295 [OOB-2,AGST-2012000591-33853] TRACE o.i.distribution.L1ManagerImpl - Registering requestor AGST-2012000591-25400 for key 'K'
> 18:21:33.295 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Sending additional invalidation for requestors if necessary.
> 18:21:33.295 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - Invalidating keys [K] on nodes [AGST-2012000591-25400]. Use multicast? false
> 18:21:33.295 [remote-thread-1] INFO  p.c.a.ispn.WriteTask - Update task runtime millis 2
> 18:21:33.476 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - No L1 caches to invalidate for keys [K]
> 18:21:33.476 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Allowing entry to commit as local node is owner
> 18:21:33.476 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Sending additional invalidation for requestors if necessary.
> 18:21:33.476 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - No L1 caches to invalidate for keys [K]
> {noformat}
> So it seems that after this:
> bq. 'L1 invalidation found a pending update for key K - need to block until finished' 
> B no longer knows that A holds K in L1, and no longer sends invalidation commands after updates.
