[infinispan-dev] Fwd: Stale data read when L1 invalidation happens while UnionConsistentHash is in use

galder at jboss.org
Fri May 7 12:58:40 EDT 2010


After looking at this for a bit longer, I think your suggested step "4.  Modify the value on C1 *before* rehashing completes." does not represent what happens in the log.

Instead, I think it should be like this:

"4. Modify the value on C3 *before* rehashing completes." 

When you do that, C3 uses the newly installed hash without doing a union; it's the existing nodes that use a union. So, if you were to do step 4 on C1, it would have a union hash, the put would replicate to all nodes in the cluster, and the test would pass.

The key thing here is doing step 4 on C3 while C1 and C2 still have union hashing going on. C3 forces the update, but the L1 invalidation does not reach C1 or C2, so one of them is left with stale data, since C3 does not do a union hash.
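To illustrate what I mean (a simplified sketch in the spirit of UnionConsistentHash; names and signatures are made up, not the real class):

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

interface Hash {
    // returns the owners of a key
    List<String> locate(Object key, int replCount);
}

class UnionHash implements Hash {
    private final Hash oldHash; // hash before the joiner arrived
    private final Hash newHash; // hash after the joiner arrived

    UnionHash(Hash oldHash, Hash newHash) {
        this.oldHash = oldHash;
        this.newHash = newHash;
    }

    // A key counts as local if EITHER hash maps it here. That is why
    // eq-13415 skipped the L1 invalidation: the old hash still
    // considered the key local to it.
    public List<String> locate(Object key, int replCount) {
        Set<String> owners = new HashSet<String>(oldHash.locate(key, replCount));
        owners.addAll(newHash.locate(key, replCount));
        return new ArrayList<String>(owners);
    }
}

An existing node asks the union, so it still sees itself as an owner under the old hash; a joiner like C3 only asks the new hash.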

I'll do more research next week.

----- galder at jboss.org wrote:

> See below:
> 
> ----- "Manik Surtani" <manik at jboss.org> wrote:
> 
> > On 3 May 2010, at 08:51, Galder Zamarreno wrote:
> >
> > > Resending without log until the message is approved.
> > >
> > > --
> > > Galder Zamarreño
> > > Sr. Software Engineer
> > > Infinispan, JBoss Cache
> > >
> > > ----- Forwarded Message -----
> > > From: galder at redhat.com
> > > To: "infinispan -Dev List" <infinispan-dev at lists.jboss.org>
> > > Sent: Friday, April 30, 2010 6:30:05 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
> > > Subject: Stale data read when L1 invalidation happens while UnionConsistentHash is in use
> > >
> > > Hi,
> > >
> > > I've spent all day chasing down a random Hot Rod testsuite failure related to distribution. This is the last hurdle to close https://jira.jboss.org/jira/browse/ISPN-411. In HotRodDistributionTest, which is still to be committed, I test adding a new node, doing a put on this node, and then doing a get on a different node, making sure that I get what was put. The test randomly fails: the get returns the old value. The failure has nothing to do with Hot Rod itself, but rather with a race condition when a union consistent hash is in use. Let me explain:
> > >
> > > 1. An earlier operation had set the "k-testDistributedPutWithTopologyChanges" key to "v5-testDistributedPutWithTopologyChanges".
> > > 2. A new Hot Rod server is started on eq-7969.
> > > 3. The eq-7969 node calls a put on that key with "v6-testDistributedPutWithTopologyChanges". Recipients for the put are: eq-7969 and eq-61332.
> > > 4. eq-7969 sends an invalidate L1 to all, including eq-13415.
> > > 5. eq-13415 should invalidate "k-testDistributedPutWithTopologyChanges" but it doesn't, since it considers that "k-testDistributedPutWithTopologyChanges" is local to eq-13415:
> > >
> > > 2010-04-30 18:02:19,907 6046  TRACE
> > [org.infinispan.distribution.DefaultConsistentHash]
> > (OOB-2,Infinispan-Cluster,eq-13415:) Hash code for key
> > CacheKey{data=ByteArray{size=39, hashCode=17b1683, array=[107, 45,
> > 116, 101, 115, 116, 68, 105, 115, 116, ..]}} is 344897059
> > > 2010-04-30 18:02:19,907 6046  TRACE
> > [org.infinispan.distribution.DefaultConsistentHash]
> > (OOB-2,Infinispan-Cluster,eq-13415:) Candidates for key
> > CacheKey{data=ByteArray{size=39, hashCode=17b1683, array=[107, 45,
> > 116, 101, 115, 116, 68, 105, 115, 116, ..]}} are {5458=eq-7969,
> > 6831=eq-61332}
> > > 2010-04-30 18:02:19,907 6046  TRACE
> > [org.infinispan.distribution.DistributionManagerImpl]
> > (OOB-2,Infinispan-Cluster,eq-13415:) Is local
> > CacheKey{data=ByteArray{size=39, hashCode=17b1683, array=[107, 45,
> > 116, 101, 115, 116, 68, 105, 115, 116, ..]}} to eq-13415 query
> returns
> > true and consistentHash is
> > org.infinispan.distribution.UnionConsistentHash at 10747b4
> > >
> > > These are log messages that I added to debug this. The key factor here is that UnionConsistentHash is in use, probably because rehashing has not fully finished.
> > >
> > > 6. The end result is that a read of "k-testDistributedPutWithTopologyChanges" on eq-13415 returns "v5-testDistributedPutWithTopologyChanges".
> > >
> > > I thought that maybe we could be more conservative here and, if rehashing is in progress (or UnionConsistentHash is in use), invalidate regardless. Assuming that in distribution a put always follows an invalidation, and not vice versa, that would be fine. The only downside is that you'd be invalidating too much, but the put would replace the data in the node where the invalidation should not have happened (but did), so it's not a problem.
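> > >
> > > In pseudo-code, the check might look something like this (just a sketch; isRehashInProgress() and friends are illustrative names, not the actual API):
> > >
> > > // Sketch only - names are illustrative, not the real Infinispan API.
> > > boolean shouldInvalidate(Object key, DistributionManager dm) {
> > >     // Conservative rule: while a rehash is in flight (i.e. a union hash
> > >     // is installed), invalidate even keys that look local, because
> > >     // "local" may only be true under the old consistent hash.
> > >     if (dm.isRehashInProgress()
> > >           || dm.getConsistentHash() instanceof UnionConsistentHash) {
> > >        return true;
> > >     }
> > >     return !dm.isLocal(key);
> > > }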
> > >
> > > Thoughts? Alternatively, maybe I need to shape my test so that I wait for rehashing to finish, but the problem would still be there.
> >
> > Yes, this seems to be a bug with concurrent rehashing and invalidation rather than HotRod.
> >
> > Could you modify your test to do the following:
> >
> > 1.  start 2 caches C1 and C2.
> > 2.  put a key K such that K maps on to C1 and C2
> > 3.  add a new node, C3.  K should now map to C1 and C3.
> > 4.  Modify the value on C1 *before* rehashing completes.
> > 5.  See if we see the stale value on C2.
> >
> > To do this you would need a custom object for K that hashes the way you would expect (this could be hardcoded), and a value which blocks when serializing, so we can control how long rehashing takes.
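> >
> > Something along these lines, perhaps (a hypothetical sketch - class and field names are made up):
> >
> > import java.io.Externalizable;
> > import java.io.IOException;
> > import java.io.ObjectInput;
> > import java.io.ObjectOutput;
> > import java.io.Serializable;
> > import java.util.concurrent.CountDownLatch;
> >
> > // Key with a hardcoded hash so the test can pin which nodes own it.
> > class FixedHashKey implements Serializable {
> >     private final String name;
> >     private final int hash; // chosen so K maps to C1 and C2 (then C1 and C3)
> >
> >     FixedHashKey(String name, int hash) { this.name = name; this.hash = hash; }
> >
> >     @Override public int hashCode() { return hash; }
> >
> >     @Override public boolean equals(Object o) {
> >         return o instanceof FixedHashKey && name.equals(((FixedHashKey) o).name);
> >     }
> > }
> >
> > // Value whose serialization blocks until the test releases it, so the
> > // test controls how long rehashing takes.
> > class BlockingValue implements Externalizable {
> >     static final CountDownLatch GATE = new CountDownLatch(1);
> >
> >     public BlockingValue() {} // required by Externalizable
> >
> >     public void writeExternal(ObjectOutput out) throws IOException {
> >         try {
> >             GATE.await(); // held here until the test calls GATE.countDown()
> >         } catch (InterruptedException e) {
> >             throw new IOException(e.getMessage());
> >         }
> >     }
> >
> >     public void readExternal(ObjectInput in) { /* nothing to restore */ }
> > }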
> 
> Since logical addresses are used underneath, and these change from one run to the next, I'm not sure how I can generate such a key programmatically. It's even more complicated to figure out a key that will map to C3 once it starts. Without locking these addresses down somehow, or their hash codes, I can't see how this is doable. IOW, to be able to do this, I need to mock these addresses into giving fixed hash codes. I'll dig further into this.
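>
> Something like this, maybe (hypothetical; "Address" stands in for org.infinispan.remoting.transport.Address):
>
> interface Address { }
>
> // Wraps the real, run-specific address and forces a fixed hash code,
> // so the key-to-node mapping becomes reproducible across runs.
> class FixedHashAddress implements Address {
>     private final Address delegate;
>     private final int hash;
>
>     FixedHashAddress(Address delegate, int hash) {
>         this.delegate = delegate;
>         this.hash = hash;
>     }
>
>     @Override public int hashCode() { return hash; }
>
>     @Override public boolean equals(Object o) {
>         return o instanceof FixedHashAddress
>                 && delegate.equals(((FixedHashAddress) o).delegate);
>     }
> }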
> 
> >
> > I never promised the test would be simple!  :)
> >
> > Cheers
> > Manik
> > --
> > Manik Surtani
> > manik at jboss.org
> > Lead, Infinispan
> > Lead, JBoss Cache
> > http://www.infinispan.org
> > http://www.jbosscache.org
> 


