At the moment, when buddy replication is in use and a data owner fails, the backup stays
on a buddy instance and is only gravitated into the primary tree of an instance when someone
makes a request for that data.
It was designed this way to minimise network traffic and load when a server crashes.
Reorganising a lot of state across the cluster at that point could otherwise cause a
network storm.
Despite that, people have asked for an option to allow an eager push of backup state to a
new data owner - or even the assigned buddy taking on the data as though it were the owner
- and creating appropriate backups.
Here are some thoughts on how this can be implemented:
* When an instance fails, this option forces the buddy to take ownership of the failed
node's state.
* Should wait a defined amount of time first, to allow for gravitation calls to
organically move data.
* Don't need to block data gravitation calls when taking ownership, since DG will look
in both primary and backup trees.
* Could be in chunks to prevent a network storm (since the new node taking ownership will
be backing stuff up as well). Would need some additional "hints" as to which
subtrees should be considered "related" data though.
* When an instance failure is detected, buddies should rename the region so as to prevent
the original instance re-appearing and overwriting backed up state. E.g., rename
/_B_B_/CacheA/ to /_B_B_/CacheA:dead_n/ where n is a counter, since A could die and rejoin
several times before the state is re-owned.
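To make the renaming and chunking points a bit more concrete, here is a rough Java sketch.
Everything in it is hypothetical: DeadBackupRegions, nextDeadRegion and chunk are names made
up for illustration and are not part of the JBoss Cache API. It also assumes the counter
starts at 1 and that subtrees are chunked by a fixed size rather than by any "related data"
hints, which would still need to be designed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of JBoss Cache. */
final class DeadBackupRegions {
    /** Backup root as used in the example above. */
    static final String BACKUP_ROOT = "/_B_B_/";

    /** How many times each owner has been seen to fail (assumed to start at 1). */
    private final Map<String, Integer> deathCounts = new HashMap<>();

    /** Returns the name a failed owner's backup region would be renamed to. */
    String nextDeadRegion(String owner) {
        int n = deathCounts.merge(owner, 1, Integer::sum);
        return BACKUP_ROOT + owner + ":dead_" + n + "/";
    }

    /** Splits subtree names into chunks of at most chunkSize, preserving order. */
    static List<List<String>> chunk(List<String> subtrees, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < subtrees.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, subtrees.size());
            chunks.add(new ArrayList<>(subtrees.subList(i, end)));
        }
        return chunks;
    }
}
```

With this, a second failure of CacheA before its state is re-owned would get its own region
name instead of colliding with the first, and the new owner could take on one chunk at a
time with a pause in between.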
Thoughts and comments? How important/useful do you think this is in the first place?
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4135090#...