I was gunning for beta3 but I don't think I'm going to make it.
On Wed, Oct 26, 2011 at 12:38 PM, Galder Zamarreño <galder(a)redhat.com> wrote:
On Oct 26, 2011, at 9:46 AM, Dan Berindei wrote:
> Hi Galder, sorry it took so long to reply.
>
> On Mon, Oct 24, 2011 at 4:16 PM, Galder Zamarreño <galder(a)redhat.com> wrote:
>> Btw, forgot to attach the log:
>>
>> On Oct 24, 2011, at 3:13 PM, Galder Zamarreño wrote:
>>
>>> Hi Dan,
>>>
>>> Re: http://goo.gl/TGwrP
>>>
>>> There are a few of these in the Hot Rod server+client testsuites. They're
>>> easy to replicate locally. It seems that cache operations right after a
>>> cache has started are rather problematic.
>>>
>>> In a local execution of HotRodReplicationTest, I was able to replicate the
>>> issue when trying to test topology changes. Please find the log file
>>> attached, but here are the interesting bits:
>>>
>>> 1. A new view installation is being prepared with NodeA and NodeB:
>>> 2011-10-24 14:36:09,046 4221 TRACE [org.infinispan.cacheviews.CacheViewsManagerImpl] (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) ___hotRodTopologyCache: Preparing cache view CacheView{viewId=4, members=[NodeA-63227, NodeB-15806]}, committed view is CacheView{viewId=3, members=[NodeA-63227, NodeB-15806, NodeC-17654]}
>>> …
>>> 2011-10-24 14:36:09,047 4222 DEBUG [org.infinispan.statetransfer.StateTransferLockImpl] (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Blocking new transactions
>>> 2011-10-24 14:36:09,047 4222 TRACE [org.infinispan.statetransfer.StateTransferLockImpl] (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Acquiring exclusive state transfer shared lock, shared holders: 0
>>> 2011-10-24 14:36:09,047 4222 TRACE [org.infinispan.statetransfer.StateTransferLockImpl] (OOB-1,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Acquired state transfer lock in exclusive mode
>>>
>>> 2. The cluster coordinator discovers a view change and requests NodeA and
>>> NodeB to remove NodeC from the topology view:
>>> 2011-10-24 14:36:09,048 4223 TRACE [org.infinispan.interceptors.InvocationContextInterceptor] (OOB-3,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) Invoked with command RemoveCommand{key=NodeC-17654, value=null, flags=null} and InvocationContext [NonTxInvocationContext{flags=null}]
>>>
>>> 3. NodeB has not yet finished installing the cache view, so that remove
>>> times out:
>>> 2011-10-24 14:36:09,049 4224 ERROR [org.infinispan.interceptors.InvocationContextInterceptor] (OOB-3,Infinispan-Cluster,NodeB-15806:___hotRodTopologyCache) ISPN000136: Execution error
>>> org.infinispan.distribution.RehashInProgressException: Timed out waiting for the transaction lock
>>>
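The three log excerpts above can be read as a lock-ordering problem. As a minimal sketch, assuming the state transfer lock behaves like a read/write lock (the class and method names below are hypothetical and are not Infinispan's StateTransferLockImpl), this shows how an operation arriving on another thread times out while the view installation still holds the lock exclusively:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the situation in the log; not Infinispan code.
class StateTransferLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Step 1: the view installation takes the lock in exclusive mode.
    void blockNewOperations() {
        lock.writeLock().lock();
    }

    // Called once the cache view has been installed.
    void unblockNewOperations() {
        lock.writeLock().unlock();
    }

    // Steps 2-3: a cache operation needs the shared lock and gives up after a timeout.
    // Returns true if the operation ran, false if it timed out waiting for the lock.
    boolean tryRun(Runnable op, long timeoutMillis) throws InterruptedException {
        if (!lock.readLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            op.run();
            return true;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Runs tryRun on a separate thread, as the remove arrived on a different
    // OOB thread than the one installing the view.
    boolean tryRunOnOtherThread(Runnable op, long timeoutMillis) {
        boolean[] result = {false};
        Thread t = new Thread(() -> {
            try {
                result[0] = tryRun(op, timeoutMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}
```

In this model the remove fails only because it arrives between blockNewOperations() and unblockNewOperations(); the same call succeeds once the view is installed.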
>>> A way to solve this is to avoid relying on cluster view changes and instead
>>> wait for the cache view to be installed before doing the operations. Is
>>> there any way to wait until then?
>>>
>>> One way would be to have some CacheView installed callbacks or similar. This
>>> could be a good option because I could have a CacheView listener for the Hot
>>> Rod topology cache whose callbacks I can check for isPre=false and then do
>>> the cache ops safely.
>>>
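The callback idea in the paragraph above could look roughly like the following. This is only a sketch under stated assumptions: ViewInstalledCallback, ViewNotifier and TopologyUpdater are hypothetical names invented for illustration, not existing Infinispan interfaces, and a plain list stands in for the Hot Rod topology cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback interface; NOT an existing Infinispan API.
interface ViewInstalledCallback {
    void viewChanged(int viewId, List<String> members, boolean isPre);
}

// Hypothetical stand-in for the component that installs cache views.
class ViewNotifier {
    private final List<ViewInstalledCallback> callbacks = new CopyOnWriteArrayList<>();

    void addCallback(ViewInstalledCallback callback) {
        callbacks.add(callback);
    }

    // Notifies with isPre=true, runs the installation, then notifies with isPre=false.
    void installView(int viewId, List<String> members, Runnable install) {
        for (ViewInstalledCallback cb : callbacks) {
            cb.viewChanged(viewId, members, true);
        }
        install.run();
        for (ViewInstalledCallback cb : callbacks) {
            cb.viewChanged(viewId, members, false);
        }
    }
}

// Removes departed nodes from the topology only after the view is installed.
class TopologyUpdater implements ViewInstalledCallback {
    private final List<String> topology = new ArrayList<>();
    private final List<String> removedNodes = new ArrayList<>();

    TopologyUpdater(List<String> initialMembers) {
        topology.addAll(initialMembers);
    }

    @Override
    public void viewChanged(int viewId, List<String> members, boolean isPre) {
        if (isPre) {
            return; // view not installed yet, cache operations would race with it
        }
        List<String> departed = new ArrayList<>(topology);
        departed.removeAll(members);
        topology.removeAll(departed); // stands in for removing keys from the topology cache
        removedNodes.addAll(departed);
    }

    List<String> removedNodes() {
        return removedNodes;
    }
}
```

The point of the design is that the remove is driven by the isPre=false notification rather than by the cluster view change, so it cannot overlap with the view installation.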
>
> Initially I was thinking of allowing multiple cache view listeners for
> each cache and making StateTransferManager one of them but I decided
> against it because I realized it needs a different interface than our
> regular listeners. I knew that it was only a matter of time until
> someone needed it...
>
> An alternative solution would be to retry all operations, like we do
> with commits now, when we receive a RehashInProgressException from the
> remote node. That's what I was planning to do first as it helps in
> other use cases as well.
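For illustration, the retry approach described above might be sketched as follows. The RehashInProgressException class here is a local stand-in so the example compiles on its own, and RetryOnRehash, maxAttempts and backoffMillis are hypothetical names; this is not how Infinispan implements its commit retries.

```java
import java.util.function.Supplier;

// Local stand-in for org.infinispan.distribution.RehashInProgressException,
// declared here only so the sketch is self-contained.
class RehashInProgressException extends RuntimeException {
    RehashInProgressException(String message) {
        super(message);
    }
}

// Hypothetical helper: retries an operation while a rehash is in progress.
class RetryOnRehash {
    static <T> T call(Supplier<T> op, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RehashInProgressException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RehashInProgressException e) {
                last = e; // remember the failure and retry after a short pause
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("Interrupted while waiting to retry", ie);
                    }
                }
            }
        }
        throw last; // every attempt hit a rehash
    }
}
```

Any other exception propagates immediately; only the rehash case is retried, and the last RehashInProgressException is rethrown once the attempts run out.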
Ok, do you have time to include this today ahead of the BETA3 release?
I think this is a very important fix because, as you can see in the testsuite,
it's very easy to get this error with Hot Rod servers.
>
>>> Otherwise, code like the one I used for keeping the Hot Rod topology is
>>> gonna be racing against your cache view installation code.
>>>
>>> You seem to have some pieces in place for this, i.e. CacheViewListener, but
>>> it seems only designed for internal core/ work.
>>>
>>> Any other suggestions?
>>>
>>> Cheers,
>>> --
>>> Galder Zamarreño
>>> Sr. Software Engineer
>>> Infinispan, JBoss Cache
>>>
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev(a)lists.jboss.org
>>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev