[
https://jira.jboss.org/jira/browse/JBCACHE-1530?page=com.atlassian.jira.p...
]
Brian Stansberry commented on JBCACHE-1530:
-------------------------------------------
Also discussed on IM:
<brian> same scenario, but the node that gravitated the session has also picked the
first node as its new buddy
<brian> so, now that first node has 2 copies of session
<brian> under /B_B/xxx:DEAD/1/foo and /B_B/yyy/foo
<brian> later it gets a GravitateDataCommand and responds with the wrong one
The workaround I'm going to try implementing is to give preference to non-DEAD trees when
performing the GravitateDataCommand. Basically, sort the set of /BUDDY_BACKUP children names
and scan through the sorted set.
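
A minimal sketch of that ordering, for illustration only. BackupTreeOrder and its methods are hypothetical names, not JBoss Cache API, and the children of /BUDDY_BACKUP are modelled as plain strings. Names without the ":DEAD" suffix sort first, so a scan over the result reaches a live backup tree before any dead one:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JBoss Cache.
final class BackupTreeOrder {
    static final String DEAD_SUFFIX = ":DEAD";

    // A /BUDDY_BACKUP child such as "xxx:DEAD" belongs to a departed data owner.
    static boolean isDead(String childName) {
        return childName.endsWith(DEAD_SUFFIX);
    }

    // Returns the child names in scan order: live trees first, then dead
    // trees, each group in natural string order.
    static List<String> scanOrder(Iterable<String> buddyBackupChildren) {
        List<String> sorted = new ArrayList<>();
        for (String name : buddyBackupChildren) {
            sorted.add(name);
        }
        sorted.sort((a, b) -> {
            int byDead = Boolean.compare(isDead(a), isDead(b)); // false sorts before true
            return byDead != 0 ? byDead : a.compareTo(b);
        });
        return sorted;
    }
}
```

With children "xxx:DEAD" and "yyy", the scan would visit "yyy" first, so /B_B/yyy/foo would be found before /B_B/xxx:DEAD/1/foo.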
Stale copies of gravitated data left in "xxx:DEAD" trees
--------------------------------------------------------
Key: JBCACHE-1530
URL: https://jira.jboss.org/jira/browse/JBCACHE-1530
Project: JBoss Cache
Issue Type: Bug
Security Level: Public(Everyone can see)
Affects Versions: 3.2.0.BETA1
Reporter: Brian Stansberry
Assignee: Manik Surtani
Fix For: 3.2.0.GA
There's a race that can result in a stale copy of gravitated data being left in a
"/BUDDY_BACKUP/xxx:DEAD" tree:
Scenario:
1) Data is stored in /B_B/xxx/foo
2) xxx is leaving group
3) Gravitate data command for /B_B/xxx/foo comes in, result is returned
4) Buddy group re-formation thread moves /B_B/xxx/foo to /B_B/xxx:DEAD/1/foo
5) DataGravitationCleanupCommand comes in for /B_B/xxx/foo, which results in nothing
happening since the node has already been moved
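
The sequence can be replayed in a few lines. This is only a sketch: a plain map keyed by FQN string stands in for the cache, no JBoss Cache classes are used, and StaleCopyRace is a hypothetical name. It shows that the cleanup in step 5 removes nothing, so the copy under xxx:DEAD survives:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the backup subtree; not JBoss Cache code.
final class StaleCopyRace {
    static Map<String, String> replay() {
        Map<String, String> backupTree = new TreeMap<>();

        // 1) Data is stored in /B_B/xxx/foo
        backupTree.put("/B_B/xxx/foo", "session-data");

        // 3) Gravitation request is answered from the still-live location
        String gravitated = backupTree.get("/B_B/xxx/foo");

        // 4) Buddy group re-formation moves the node under xxx:DEAD/1
        String moved = backupTree.remove("/B_B/xxx/foo");
        backupTree.put("/B_B/xxx:DEAD/1/foo", moved);

        // 5) Cleanup targets the old FQN, which no longer exists: a no-op
        backupTree.remove("/B_B/xxx/foo");

        // The gravitated data is now owned elsewhere, yet a copy remains here.
        return backupTree;
    }
}
```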
A fix might involve some analysis of the backup fqn in the DataGravitationCleanupCommand
to try to detect this condition. Or perhaps tracking successful GravitationResult responses
and trying to match them against the cleanup command.
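
One way the fqn analysis could look, as a rough sketch rather than a proposed patch. DeadTreeLookup is a hypothetical name, FQNs are plain strings, and "/B_B/" is the abbreviation for /BUDDY_BACKUP used in this issue. Given the backup fqn named in a cleanup command, it lists the matching nodes under <owner>:DEAD/<n>/ that the cleanup would otherwise miss:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of JBoss Cache.
final class DeadTreeLookup {
    static final String BACKUP_ROOT = "/B_B/";

    // For "/B_B/xxx/foo", returns the existing FQNs of the form
    // "/B_B/xxx:DEAD/<number>/foo".
    static List<String> staleCopies(String backupFqn, Set<String> existingFqns) {
        List<String> matches = new ArrayList<>();
        if (!backupFqn.startsWith(BACKUP_ROOT)) {
            return matches;
        }
        String tail = backupFqn.substring(BACKUP_ROOT.length()); // "xxx/foo"
        int slash = tail.indexOf('/');
        if (slash < 0) {
            return matches;
        }
        String owner = tail.substring(0, slash);                 // "xxx"
        String rest = tail.substring(slash);                     // "/foo"
        String deadPrefix = BACKUP_ROOT + owner + ":DEAD/";

        for (String fqn : existingFqns) {
            // Length check keeps prefix and suffix from overlapping.
            if (fqn.length() < deadPrefix.length() + rest.length()
                    || !fqn.startsWith(deadPrefix) || !fqn.endsWith(rest)) {
                continue;
            }
            String generation =
                    fqn.substring(deadPrefix.length(), fqn.length() - rest.length());
            if (!generation.isEmpty()
                    && generation.chars().allMatch(Character::isDigit)) {
                matches.add(fqn);
            }
        }
        return matches;
    }
}
```

A cleanup that finds nothing at the requested fqn could then remove whatever this lookup returns. Whether that is safe against other races is not something this sketch addresses.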
As a quick workaround I'm going to investigate an algorithm on the
GravitateDataCommand sender side to not just accept the first successful result but rather
to compare all positive results, giving preference to:
1) A result from the main tree
2) A result from a non :DEAD buddy backup tree
3) A result from xxx:DEAD/2
4) A result from xxx:DEAD/1
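
The preference order above can be sketched as a comparison over the FQNs the positive results came from. Everything here is an illustrative assumption: GravitationPreference is a hypothetical name, results are reduced to FQN strings, "/B_B" abbreviates /BUDDY_BACKUP as elsewhere in this issue, and "higher DEAD number first" generalizes the xxx:DEAD/2 before xxx:DEAD/1 ordering:

```java
import java.util.Collection;
import java.util.Optional;

// Hypothetical helper, not part of JBoss Cache.
final class GravitationPreference {
    static final String BACKUP_ROOT_NAME = "B_B";
    static final String DEAD_SUFFIX = ":DEAD";

    // 0 = main tree, 1 = live buddy backup tree, 2 = :DEAD backup tree.
    static int tier(String fqn) {
        String[] parts = fqn.split("/"); // "/B_B/xxx/foo" -> ["", "B_B", "xxx", "foo"]
        if (parts.length < 2 || !parts[1].equals(BACKUP_ROOT_NAME)) {
            return 0;
        }
        String owner = parts.length > 2 ? parts[2] : "";
        return owner.endsWith(DEAD_SUFFIX) ? 2 : 1;
    }

    // The number after "xxx:DEAD/", or 0 if the FQN is not in a DEAD tree.
    static int deadGeneration(String fqn) {
        String[] parts = fqn.split("/");
        if (tier(fqn) != 2 || parts.length < 4) {
            return 0;
        }
        try {
            return Integer.parseInt(parts[3]);
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // Negative if a is preferred over b.
    static int compare(String a, String b) {
        int byTier = Integer.compare(tier(a), tier(b));
        if (byTier != 0) {
            return byTier;
        }
        return Integer.compare(deadGeneration(b), deadGeneration(a)); // higher number first
    }

    // Picks the most preferred FQN among all positive results.
    static Optional<String> best(Collection<String> resultFqns) {
        return resultFqns.stream().min(GravitationPreference::compare);
    }
}
```

For the case from the IM discussion, best() over /B_B/xxx:DEAD/1/foo and /B_B/yyy/foo would select /B_B/yyy/foo.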