William Burns edited comment on ISPN-3842 at 12/18/13 9:05 AM:
---------------------------------------------------------------
Okay, looking a bit closer, I think I know what the issue is here. The InvalidateL1Command
comes in and finds no value in the data container, so it doesn't store anything in the
invocation context. The get then completes and commits the data to the container. Next,
the InvalidateL1Command checks the data container, finds the value, and delegates the
call to the RemoveCommand, which it extends. Unfortunately, the RemoveCommand only marks
entries for removal if they are already in the context. So in this case the
InvalidateL1Command should wrap the newly found value from the data container and put it
in the context so that the RemoveCommand can properly remove it.
To be honest, I thought we already had this case tested, but we must have a slightly
different variant. I will write a test that should reproduce this consistently, and then
fix this up.
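The interleaving described above can be sketched with a toy model. This is not Infinispan code: the class `L1InvalidationRaceSketch`, the two maps standing in for the data container and the invocation context, and the `wrapLateEntry` flag are all hypothetical names made up for illustration. The sketch only assumes what the comment states, namely that the remove step acts on a key only if it is already in the context.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class L1InvalidationRaceSketch {
    // Stand-in for node A's data container, where L1 entries end up.
    static final Map<String, String> dataContainer = new ConcurrentHashMap<>();

    // Stand-in for the remove step: it only removes keys already wrapped into the context.
    static boolean remove(Map<String, String> context, String key) {
        if (!context.containsKey(key)) {
            return false; // corresponds to "Nothing to remove since the entry is null"
        }
        dataContainer.remove(key);
        return true;
    }

    // Stand-in for the invalidation. concurrentGet simulates the remote get that
    // commits its value between the wrapping step and the container check.
    static boolean invalidate(String key, Runnable concurrentGet, boolean wrapLateEntry) {
        Map<String, String> context = new HashMap<>();
        String early = dataContainer.get(key);
        if (early != null) {
            context.put(key, early); // wrapping finds nothing in the racy case
        }
        concurrentGet.run(); // the get completes and commits to the container
        String late = dataContainer.get(key);
        if (late == null) {
            return false; // nothing cached, nothing to invalidate
        }
        if (wrapLateEntry) {
            context.putIfAbsent(key, late); // proposed fix: wrap the newly found value
        }
        return remove(context, key);
    }

    public static void main(String[] args) {
        // Without the fix the stale value survives the invalidation.
        boolean removed = invalidate("K", () -> dataContainer.put("K", "old"), false);
        System.out.println(removed + " " + dataContainer.containsKey("K")); // false true

        dataContainer.clear();
        // With the fix the late entry is wrapped and then removed.
        removed = invalidate("K", () -> dataContainer.put("K", "old"), true);
        System.out.println(removed + " " + dataContainer.containsKey("K")); // true false
    }
}
```

The real fix would live in the command and interceptor code and has to deal with locking and entry wrapping properly; the sketch only shows why the entry has to be in the context before the remove logic runs.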
Inconsistent L1 in non-tx distributed cache in certain circumstances
--------------------------------------------------------------------
Key: ISPN-3842
URL: https://issues.jboss.org/browse/ISPN-3842
Project: Infinispan
Issue Type: Bug
Affects Versions: 6.0.0.Final
Reporter: Mikolaj Gierulski
Assignee: William Burns
In my PoC environment there are two nodes in a dist non-tx sync cluster with L1 enabled
and numOwners=1.
Node A reads one key (K) in a loop; K is stored on node B (in the test case A performs
about 1 000 000 reads per second).
From time to time K is updated on node B. This causes an L1 invalidation message to be
sent to A, and K is fetched from B on the next read attempt.
But whenever I run my test, I end up in a situation where updates of K no longer
invalidate it on A, and A sees the old value of K.
When this happens, I can see in the logs of node A:
{noformat}
18:21:33.296 [remote-thread-0] TRACE o.i.i.d.L1NonTxInterceptor - L1 invalidation found a pending update for key K - need to block until finished
18:21:33.296 [remote-thread-0] TRACE o.i.i.d.L1NonTxInterceptor - Pending L1 update completed successfully: true - L1 invalidation can occur for key K
18:21:33.296 [remote-thread-0] TRACE o.i.c.write.InvalidateL1Command - Preparing to invalidate keys [K]
18:21:33.296 [remote-thread-0] TRACE o.i.c.write.InvalidateL1Command - Invalidating key K.
18:21:33.296 [remote-thread-0] TRACE o.i.commands.write.RemoveCommand - Nothing to remove since the entry is null or we have a null entry
{noformat}
While logs of node B show:
{noformat}
18:21:33.200 [OOB-1,AGST-2012000591-33853] TRACE o.i.distribution.L1ManagerImpl - Registering requestor AGST-2012000591-25400 for key 'K'
18:21:33.266 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - Invalidating keys [K] on nodes [AGST-2012000591-25400]. Use multicast? false
18:21:33.269 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Allowing entry to commit as local node is owner
18:21:33.269 [OOB-2,AGST-2012000591-33853] TRACE o.i.distribution.L1ManagerImpl - Registering requestor AGST-2012000591-25400 for key 'K'
18:21:33.269 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Sending additional invalidation for requestors if necessary.
18:21:33.269 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - Invalidating keys [K] on nodes [AGST-2012000591-25400]. Use multicast? false
18:21:33.270 [remote-thread-1] INFO p.c.a.ispn.WriteTask - Update task runtime millis 3
18:21:33.271 [OOB-1,AGST-2012000591-33853] TRACE o.i.distribution.L1ManagerImpl - Registering requestor AGST-2012000591-25400 for key 'K'
18:21:33.293 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - Invalidating keys [K] on nodes [AGST-2012000591-25400]. Use multicast? false
18:21:33.295 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Allowing entry to commit as local node is owner
18:21:33.295 [OOB-2,AGST-2012000591-33853] TRACE o.i.distribution.L1ManagerImpl - Registering requestor AGST-2012000591-25400 for key 'K'
18:21:33.295 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Sending additional invalidation for requestors if necessary.
18:21:33.295 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - Invalidating keys [K] on nodes [AGST-2012000591-25400]. Use multicast? false
18:21:33.295 [remote-thread-1] INFO p.c.a.ispn.WriteTask - Update task runtime millis 2
18:21:33.476 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - No L1 caches to invalidate for keys [K]
18:21:33.476 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Allowing entry to commit as local node is owner
18:21:33.476 [remote-thread-1] TRACE o.i.i.d.L1NonTxInterceptor - Sending additional invalidation for requestors if necessary.
18:21:33.476 [remote-thread-1] TRACE o.i.distribution.L1ManagerImpl - No L1 caches to invalidate for keys [K]
{noformat}
So it seems that after this:
bq. 'L1 invalidation found a pending update for key K - need to block until finished'
B no longer knows that A holds K in L1, and no longer sends invalidation commands after
updates.
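One way to read the node B log is sketched below as a toy model. It is not the real L1ManagerImpl: the class `L1RequestorSketch` and its methods are hypothetical, and the model assumes that the owner forgets a key's requestors once it has sent them an invalidation, which is what the "No L1 caches to invalidate" lines suggest. Under that assumption, if A keeps serving a stale K from its L1, it never asks B again, so B never re-registers A.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class L1RequestorSketch {
    // Owner side (node B): nodes that requested a key and may hold it in their L1.
    private final Map<String, Set<String>> requestors = new HashMap<>();

    // Called when a remote node fetches the key from the owner.
    void registerRequestor(String key, String node) {
        requestors.computeIfAbsent(key, k -> new HashSet<>()).add(node);
    }

    // Called on update: returns the nodes to invalidate and forgets them (assumption).
    Set<String> invalidationTargets(String key) {
        Set<String> targets = requestors.remove(key);
        return targets == null ? Collections.<String>emptySet() : targets;
    }

    public static void main(String[] args) {
        L1RequestorSketch nodeB = new L1RequestorSketch();

        nodeB.registerRequestor("K", "A");                  // A reads K remotely
        System.out.println(nodeB.invalidationTargets("K")); // [A]

        // A's invalidation removed nothing, so A now reads the stale K locally
        // and never registers with B again.
        System.out.println(nodeB.invalidationTargets("K")); // []
    }
}
```

In this reading the empty target list on B is a consequence of the failed invalidation on A rather than a separate fault on B.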