[jboss-jira] [JBoss JIRA] Updated: (JBAS-7186) SuspectExceptions during data gravitation lead to DataGravitationCleanup command not executing
Brian Stansberry (JIRA)
jira-events at lists.jboss.org
Tue Mar 9 16:43:58 EST 2010
[ https://jira.jboss.org/jira/browse/JBAS-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Stansberry updated JBAS-7186:
-----------------------------------
Fix Version/s: JBossAS-6.0.0.M4
(was: JBossAS-6.0.0.M3)
> SuspectExceptions during data gravitation lead to DataGravitationCleanup command not executing
> ----------------------------------------------------------------------------------------------
>
> Key: JBAS-7186
> URL: https://jira.jboss.org/jira/browse/JBAS-7186
> Project: JBoss Application Server
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Clustering
> Affects Versions: JBossAS-5.1.0.GA
> Reporter: Brian Stansberry
> Assignee: Brian Stansberry
> Fix For: JBossAS-6.0.0.M4
>
>
> JBC 3 is different from JBC 1.4 in how it handles SuspectExceptions raised for suspected nodes during data gravitation. JBC 1.4 would ignore them; w/ JBC 3 they propagate.
> These can happen easily with a cluster under load and a node failing: the LB detects the failure before the view changes, the node that has the failing node as its backup starts gravitating, and replication of the gravitated data to the (failed) backup throws a SuspectException.
> The clustering integration needs to handle this better. Right now gravitation attempts are wrapped in txs, so the SuspectException fails the tx commit. That's pretty non-recoverable unless we catch the commit failure and retry. A possibility is to not wrap the gravitation in a tx (not really needed except for FIELD) and use JBossCacheWrapper's get() retry logic to redo the gravitation.
> We already catch the exception and allow the request to continue if the data was actually retrieved. Actually, that only works because we wrap w/ the tx; JBC wouldn't return from the gravitation read without the tx causing the replication write to wait for tx commit. Hmm...
> The problems this causes now are: 1) gravitated data doesn't replicate to the buddy until a request changes it and causes a normal write; 2) DataGravitationCleanupCommand is not issued, so stale data is left in the cache. Some of the changes made for JBCACHE-1530 reduce the likelihood of that stale data being used; it's only used if a request fails over to the node where it's stored, leading to gravitation from the (stale) local backup tree.
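The retry idea raised in the description (redo the gravitation read when a SuspectException surfaces, instead of failing a tx commit) could look roughly like the sketch below. This is only an illustration under stated assumptions: GravitationRetry, withRetry and SuspectedMemberException are hypothetical names invented here, the exception is a local stand-in rather than the real JGroups/JBC class, and JBossCacheWrapper's actual retry logic is not reproduced.

```java
import java.util.concurrent.Callable;

public class GravitationRetry {

    /** Hypothetical stand-in for the exception thrown when a peer is suspected. */
    public static class SuspectedMemberException extends RuntimeException {
        public SuspectedMemberException(String msg) {
            super(msg);
        }
    }

    /**
     * Runs the given read up to maxAttempts times. Only the stand-in
     * suspect exception triggers a retry; any other failure propagates
     * immediately. If every attempt fails, the last suspect exception
     * is rethrown.
     */
    public static <T> T withRetry(Callable<T> read, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SuspectedMemberException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return read.call();
            } catch (SuspectedMemberException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated gravitation read: fails twice, then succeeds.
        String data = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new SuspectedMemberException("backup suspected");
            }
            return "session-data";
        }, 5);
        System.out.println(data + " after " + calls[0] + " attempts");
    }
}
```

A retry like this only covers the read side. Whether the cleanup command and the replication to the new buddy are issued once a later attempt succeeds depends on JBC behaviour that the sketch does not model.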
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira