On 19 Jul 2011, at 09:38, Galder Zamarreño wrote:
Hi all,
I've noticed a problem with the way we test state transfer in our testsuite. For
example, take
https://github.com/infinispan/infinispan/blob/master/core/src/test/java/o...
This test checks that state transfer happens when a new node is started. However, a
merge can happen instead of a join, and when a merge happens no state transfer
occurs.
That was my understanding as well, and I think it still holds for REPL.
I've just had a chat with Dan and also looked at the code [1]: for distributed caches,
if a merge view happens, rehashing is triggered in exactly the same way as when a join
happens. That worries me, as consistency is affected if a key is modified in one (or both)
of the cluster's partitions after the split brain. Or am I missing something?
[1]
http://bit.ly/o9Cx99
Now, the problem is that waiting for the view to be set happens in the main thread, and
the callback to merge view listener happens in a different thread. So, in an unfortunate
situation, the following can happen:
1. [main-thread] waits for view to be set.
2. [main-thread] view is set due to a merge and main thread carries on.
3. [main-thread] checks the merge view listener and sees that merged is still false.
4. [callback-thread] calls MergedViewListener.mergedView and sets merged=true.
I've seen this failure happen on my local machine when trying to replicate other
random failures.
So, I'm solving this issue by having a listener that listens for both merge and view
changes, and a latch that waits for either one of the two callbacks to
happen.
Clearly, the countDown() would happen only after either the merged or the viewChanged
boolean has been set, which guarantees that once the latch is released the test knows
whether a merge happened or not, and I can then check the initial data if necessary.
I'll send a pull req later on today with this.
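A rough sketch of the latch-based listener described above, in plain Java with no Infinispan dependency. The class name ViewChangeLatch and its method names are hypothetical; in the real test the two callback methods would presumably be wired to the merge and view-change notifications. The point it illustrates is the ordering: the flag is set before countDown(), so a thread released by the latch always sees the correct flag.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Infinispan or JGroups: records which kind
// of view change arrived and releases any thread waiting on the latch.
class ViewChangeLatch {
    private final CountDownLatch latch = new CountDownLatch(1);
    private volatile boolean merged;
    private volatile boolean viewChanged;

    // Assumed to be called from the callback thread when a merge view arrives.
    void onMerged() {
        merged = true;      // set the flag first...
        latch.countDown();  // ...then release the waiting thread
    }

    // Assumed to be called from the callback thread on a normal view change.
    void onViewChanged() {
        viewChanged = true;
        latch.countDown();
    }

    // Blocks until either callback has fired; returns false on timeout
    // or if the waiting thread is interrupted.
    boolean await(long timeout, TimeUnit unit) {
        try {
            return latch.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    boolean isMerged() { return merged; }
    boolean isViewChanged() { return viewChanged; }
}
```

The main thread would call await(...) instead of polling the view, and only then decide, based on isMerged(), whether the initial data should be checked.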
Btw, you might be wondering how on earth a merge would happen with our new TEST_PING? I
have that question too, and it seems that sometimes Discovery.sendGetMembersRequest does
not get called, while my TEST_PING implementation assumes that it will definitely get
called. I've sent an email to Bela and I'm gonna try to add some more debugging to find
out how on earth this happens.
I think that even if you fix this in the unit tests
it still might happen in a real-life situation, i.e. start two nodes and instead of
forming a cluster they'd first form two clusters and then merge.
Now, what I'm wondering here is whether this is something the end users would be
interested in as well, cos they might be running their own testing to check whether state
transfer works as expected.
Thinking out loud about this issue: can't JGroups
realise that this node wants to join and not to merge? E.g. if node B starts and wants to
join cluster {A}: if B hasn't received any application level messages, then can't
JGroups just assume that it definitely wants to join, and never wants to merge?
It's at this point that I miss tuples in java cos I think it'd be handy to have a
getMembers() call that returns not only a List<Address> but also an enum value
indicating whether the last view change was a merge or view change, or more simply a
boolean indicating whether the last view change was a merge or not:
(boolean, List<Address>) getMembers();
Unfortunately Java does not make it easy to return things like this. Having a separate
method to find out if the last view change was a merge or not would be clunky, cos
you'd want a single method that can provide the guarantees with regard to the member
list returned.
Any thoughts on this API enhancement? Would it be useful? In the Java world, it would
require creating a new type which is a bit of a deterrent for me.
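For illustration, the new type could be as small as the sketch below. The name MembershipSnapshot is hypothetical and nothing like it exists in the current API as far as this thread goes; it is generic over the address type so the sketch does not depend on the real Address class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical value type: pairs the member list with how the view was
// installed, so both are read together as one consistent snapshot.
final class MembershipSnapshot<A> {
    private final boolean merge;
    private final List<A> members;

    MembershipSnapshot(boolean merge, List<A> members) {
        this.merge = merge;
        // Defensive copy so later view changes cannot alter this snapshot.
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    // True if the view that produced this member list was a merge.
    boolean isMerge() { return merge; }

    // The members of that same view, as an unmodifiable list.
    List<A> getMembers() { return members; }
}
```

A getMembers() variant returning such an object would let the caller read the flag and the list from the same view, which two separate method calls cannot guarantee.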
I'm not sure
how useful this information is *after* the event (be it merge or join) took place. Besides
this very specific use case, I cannot think of another in which a user wants to know the
type of view change *after* it took place.
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache