Hi all,
I've noticed a problem with the way we test state transfer in our testsuite. For
example, take
https://github.com/infinispan/infinispan/blob/master/core/src/test/java/o...
This test checks that state transfer happens when a new node is started. However, the
new node can end up merging with the cluster instead of joining it, and when a merge
happens no state transfer occurs.
Now, the problem is that waiting for the view to be set happens in the main thread, and
the callback to merge view listener happens in a different thread. So, in an unfortunate
situation, the following can happen:
1. [main-thread] waits for view to be set.
2. [main-thread] view is set due to a merge and main thread carries on.
3. [main-thread] checks the merge view listener and sees that merged is still false.
4. [callback-thread] calls MergedViewListener.mergedView and sets merged=true.
I've seen this failure happen on my local machine while trying to replicate other
random failures.
So, I'm solving this issue by having a listener that listens for both merge and view
changes, and a latch that waits for either one of the two callbacks to happen. The
countDown() happens only after either merged (boolean) or viewChanged (boolean) has been
set, which guarantees that by the time the main thread continues it knows whether a merge
happened or not, and I can then check the initial data if necessary. I'll send a pull
req later on today with this.
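To make the idea concrete, here's a rough sketch of the kind of listener I have in mind.
It's not the actual patch: the class and method names are made up for illustration, and
the two callback methods stand in for whatever the real merge and view change
notifications end up calling.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical listener, not the real test code. mergedView() and viewChanged()
// stand in for the callbacks invoked by the notification thread.
final class ViewChangeLatchListener {
    private final CountDownLatch latch = new CountDownLatch(1);
    // volatile is belt and braces here: countDown()/await() already give
    // a happens-before edge between the callback thread and the waiter.
    private volatile boolean merged;
    private volatile boolean viewChanged;

    // Called from the callback thread when the view change is a merge.
    void mergedView() {
        merged = true;      // set the flag first...
        latch.countDown();  // ...and only then release the waiting thread
    }

    // Called from the callback thread on a normal (non-merge) view change.
    void viewChanged() {
        viewChanged = true;
        latch.countDown();
    }

    // Called from the main thread; returns false if no callback arrived in time.
    boolean awaitViewEvent(long timeout, TimeUnit unit) throws InterruptedException {
        return latch.await(timeout, unit);
    }

    boolean isMerged() {
        return merged;
    }

    boolean isViewChanged() {
        return viewChanged;
    }
}
```

The main thread would call awaitViewEvent() instead of just waiting for the view, and
only check for initial state when isMerged() returns false.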
Btw, you might be wondering how on earth a merge would happen with our new TEST_PING? I
have that question too, and it seems like sometimes Discovery.sendGetMembersRequest does
not get called, while my TEST_PING implementation assumes that it will definitely get called.
I've sent an email to Bela and I'm gonna try to add some more debugging to find
out how on earth this happens.
Now, what I'm wondering here is whether this is something the end users would be
interested in as well, cos they might be running their own tests to check whether state
transfer works as expected.
It's at this point that I miss tuples in Java cos I think it'd be handy to have a
getMembers() call that returns not only a List<Address> but also an enum value
indicating whether the last view change was a merge or a plain view change, or more
simply a boolean indicating whether the last view change was a merge or not:
(boolean, List<Address>) getMembers();
Unfortunately Java does not make it easy to return things like this. Having a separate
method to find out if the last view change was a merge or not would be clunky cos
you'd want a single method that can provide the guarantees with regards to the member
list returned.
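For illustration only, the new type could be as small as the sketch below. The name
MembershipSnapshot and its accessors are invented here, nothing like this exists in the
API today, and the type parameter A just stands in for Address so the snippet compiles
on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical return type for a combined getMembers() call.
// A stands in for the Address type.
final class MembershipSnapshot<A> {
    private final boolean merge;
    private final List<A> members;

    MembershipSnapshot(boolean merge, List<A> members) {
        this.merge = merge;
        // Defensive copy so the flag and the member list stay consistent
        // with each other even if the caller's list changes later.
        this.members = Collections.unmodifiableList(new ArrayList<A>(members));
    }

    // True if the view change that produced this member list was a merge.
    boolean isMerge() {
        return merge;
    }

    // The members as of that same view change.
    List<A> getMembers() {
        return members;
    }
}
```

Since both values come out of a single immutable object, a caller never sees a member
list from one view paired with the merge flag from another.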
Any thoughts on this API enhancement? Would it be useful? In the Java world, it would
require creating a new type which is a bit of a deterrent for me.
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache