Hi,
I have gone through some of the failures I'm getting when running the
test suite for 1.4.x:
* Some buddy replication tests show this failure:
junit.framework.AssertionFailedError: buddy's list of groups it participates in should contain data owner's group name
at org.jboss.cache.buddyreplication.BuddyReplicationTestsBase.assertIsBuddy(BuddyReplicationTestsBase.java:280)
Earlier in the test, you can see this Exception:
org.jboss.cache.buddyreplication.BuddyNotInitException: Not yet initialised
at org.jboss.cache.buddyreplication.BuddyManager.handleAssignToBuddyGroup(BuddyManager.java:450)
at org.jboss.cache.TreeCache._remoteAssignToBuddyGroup(TreeCache.java:5372)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.jgroups.blocks.MethodCall.invoke(MethodCall.java:330)
This failure seems to be related to a timing issue: a node asks another
node to join its buddy group while that second node is still in the
initialisation process, BM.init(). More precisely, the code is
broadcasting buddy pool membership, and at that point a request to
join the buddy group arrives.
The channel is connected before the buddy manager is initialised, so
there's always the possibility of receiving messages before buddy
manager has finished initialising.
Two possible solutions come to mind:
1.- staggering cache starts
2.- wait for a little while before throwing BuddyNotInitException, seeing
that the broadcasting task could be lengthy (e.g. one of the nodes fails
to respond, and the call is synchronous); maybe wait for
buddyCommunicationTimeout?
IMO, 2 is preferred.
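
To make option 2 concrete, here is a rough sketch of the idea, not the
actual BuddyManager code. The class InitGateSketch and its fields are
hypothetical, IllegalStateException stands in for BuddyNotInitException,
and it assumes java.util.concurrent is available (the real code base may
need a backport of the latch):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the buddy manager, for illustration only.
class InitGateSketch {
    // Released once init() has finished (e.g. after the pool broadcast).
    private final CountDownLatch initialised = new CountDownLatch(1);
    // Assumed to hold the configured buddyCommunicationTimeout, in millis.
    private final long buddyCommunicationTimeout;

    InitGateSketch(long buddyCommunicationTimeout) {
        this.buddyCommunicationTimeout = buddyCommunicationTimeout;
    }

    void init() {
        // ... broadcast buddy pool membership, other set-up work ...
        initialised.countDown();
    }

    // Waits for init() to complete instead of failing straight away.
    boolean handleAssignToBuddyGroup(String groupName) {
        try {
            boolean ready = initialised.await(buddyCommunicationTimeout,
                                              TimeUnit.MILLISECONDS);
            if (!ready) {
                // Stand-in for BuddyNotInitException.
                throw new IllegalStateException("Not yet initialised");
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status and give up on the request.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for init");
        }
        // ... proceed with the actual buddy group assignment ...
        return true;
    }

    public static void main(String[] args) {
        InitGateSketch manager = new InitGateSketch(5000);
        // Simulate init() completing on another thread while a request waits.
        new Thread(manager::init).start();
        System.out.println(manager.handleAssignToBuddyGroup("group"));
    }
}
```

With this shape, a request that arrives during initialisation only fails
if init() has not completed within the timeout, which is the case where
throwing still seems reasonable.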
--
Galder Zamarreño
Sr. Software Maintenance Engineer
JBoss, a division of Red Hat