[jboss-jira] [JBoss JIRA] Created: (JBCACHE-1349) Buddy replication state transfer fails if a marshalling region is empty

Brian Stansberry (JIRA) jira-events at lists.jboss.org
Sat May 17 10:06:32 EDT 2008


Buddy replication state transfer fails if a marshalling region is empty
-----------------------------------------------------------------------

                 Key: JBCACHE-1349
                 URL: http://jira.jboss.com/jira/browse/JBCACHE-1349
             Project: JBoss Cache
          Issue Type: Bug
      Security Level: Public (Everyone can see)
          Components: Buddy Replication
    Affects Versions: 2.1.1.GA
            Reporter: Brian Stansberry
         Assigned To: Manik Surtani


Scenario: buddy replication is enabled, along with region-based marshalling. On one peer a region has been activated, but no root node for the region has been created. A new peer then joins the cluster, triggering a state transfer push from the existing peer. The push fails with the following exception:

2008-05-16 17:50:52,986 ERROR [org.jboss.cache.buddyreplication.BuddyManager] (AsyncViewChangeHandlerThread,127.0.0.1:35801) Caught exception handling view change
org.jboss.cache.CacheException: Error acquiring state
	at org.jboss.cache.buddyreplication.BuddyManager.acquireState(BuddyManager.java:914)
	at org.jboss.cache.buddyreplication.BuddyManager.addBuddies(BuddyManager.java:802)
	at org.jboss.cache.buddyreplication.BuddyManager.reassignBuddies(BuddyManager.java:409)
	at org.jboss.cache.buddyreplication.BuddyManager.access$800(BuddyManager.java:56)
	at org.jboss.cache.buddyreplication.BuddyManager$AsyncViewChangeHandlerThread.handleEnqueuedViewChange(BuddyManager.java:1162)
	at org.jboss.cache.buddyreplication.BuddyManager$AsyncViewChangeHandlerThread.run(BuddyManager.java:1106)
	at java.lang.Thread.run(Thread.java:595)
Caused by: org.jboss.cache.CacheException: Cache instance at 127.0.0.1:35801 cannot provide state for fqn /sfsb/ear=clusteredsession-local.jar,jar=clusteredsession-local.jar,name=ClusteredStateful,service=EJB3. There is no cache node at fqn /sfsb/ear=clusteredsession-local.jar,jar=clusteredsession-local.jar,name=ClusteredStateful,service=EJB3
	at org.jboss.cache.statetransfer.StateTransferManager.getState(StateTransferManager.java:121)
	at org.jboss.cache.buddyreplication.BuddyManager.generateState(BuddyManager.java:966)
	at org.jboss.cache.buddyreplication.BuddyManager.acquireState(BuddyManager.java:897)
	... 6 more

This is because StateTransferManager.getState() is designed to throw CacheException if the region is inactive (not the case here) or has no data (the case here).  This exception was really designed as a signal to propagate to a *total replication* state transfer *requestor* that there is no state on this peer (so the requestor can ask another peer).  But the buddy replication code doesn't handle it, so a single region like this breaks the whole state transfer.

Some thoughts:

1) A specialized CacheException subclass should be created for this "signal"; plain CacheException is too generic.
2) It seems the case of "region inactive" is different from "no data". I don't really think "no data" is an exception, it's just a specialized type of state.  That is, in the total replication case, the code is designed to catch this special "exception due to an inactive region" and go on to ask another peer (who may be active).  I see no reason to ask another peer for the state if one peer has an active region but no data. The requestor should just initialize an empty region.
3) The BR (buddy replication) code should catch this exception.
4) Possibly, the BR code should send the exception to the new peer as part of the state transfer data. That is, don't just swallow it, as the new peer may have old, stale, persistent data in its buddy backup tree; need to tell the peer to discard that data.
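
A minimal sketch of how these suggestions could fit together, written as a self-contained model rather than against the real JBoss Cache classes. Everything in it is an assumption made for illustration: the local CacheException stand-in (modelled here as a RuntimeException), the RegionInactiveException subclass, the Region holder, and the StateProviderSketch methods are hypothetical names and are not part of the JBoss Cache API. It shows an active region with no data yielding empty state instead of an exception, and the buddy replication push catching the "inactive" signal per region and reporting it, so one region cannot abort the whole transfer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Local stand-in for org.jboss.cache.CacheException so the sketch compiles
// on its own (assumption: modelled as an unchecked exception).
class CacheException extends RuntimeException {
    CacheException(String msg) { super(msg); }
}

// Hypothetical specialized subclass: signals only "region inactive on this peer".
class RegionInactiveException extends CacheException {
    RegionInactiveException(String fqn) {
        super("Region " + fqn + " is inactive on this peer");
    }
}

// Hypothetical model of a marshalling region on the state provider.
class Region {
    final boolean active;
    final byte[] data; // null models "root node for the region never created"

    Region(boolean active, byte[] data) {
        this.active = active;
        this.data = data;
    }
}

class StateProviderSketch {
    static final byte[] EMPTY_STATE = new byte[0];

    // "No data" is treated as a kind of state (an empty one); only an
    // inactive region raises the specialized exception.
    static byte[] getState(String fqn, Region region) {
        if (!region.active) {
            throw new RegionInactiveException(fqn);
        }
        return region.data == null ? EMPTY_STATE : region.data;
    }

    // Buddy replication push: each region is handled independently. Inactive
    // regions are recorded in 'skipped' rather than swallowed, so the list can
    // be sent along and the receiver can discard stale backup data for them.
    static Map<String, byte[]> generateBuddyState(Map<String, Region> regions,
                                                  List<String> skipped) {
        Map<String, byte[]> state = new LinkedHashMap<>();
        for (Map.Entry<String, Region> e : regions.entrySet()) {
            try {
                state.put(e.getKey(), getState(e.getKey(), e.getValue()));
            } catch (RegionInactiveException ex) {
                skipped.add(e.getKey());
            }
        }
        return state;
    }
}
```

In this model a receiver that gets an empty byte array for a region would initialize an empty region, replacing any stale persistent data, while regions named in the skipped list would have their old backup data discarded. How the real state transfer stream would carry that information is left open here.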

I found this while working with the AS 5 EJB3 SFSB code, but it's not a critical issue for me because of a simple workaround: just make sure the root node for the region exists.
