Forum thread to capture some thoughts from a phone discussion today and to continue the
conversation. Topic is how stateful clustered applications should deal with network
partitions. See necessary background reading at
http://www.jgroups.org/javagroupsnew/docs/manual/html/user-advanced.html#... for an
overview of the problem.
For discussion brevity, I'll refer to 3 of the ideas mentioned in the JGroups doc by
shorted:
1) "Merge state". Discussed in 5.6.1 -- JGroups user manages a process to merge
the state from the merging partitions.
2) "Primary partition". Discussed in 5.6.2. Nodes that realize upon merge that
they weren't in the primary partition flush their state, ensuring that state must be
reacquired from a shared persistent store.
3) "Read-only". Nodes realize at the time the partition occurs that they are not
in the primary partition, and shift into read-only mode, throwing exceptions if an attempt
is made to write. When the partition heals, they get the current state from the primary
partition.
Some things mentioned in the discussion today:
* Using timestamps for "merge state" is problematic due to clock synchronization
and clock granularity issues.
* "Merge state" is tricky for PojoCache, because the stored pojo doesn't
have a natural timestamp or counter that can be used to determine which state to retain.
JBC/PojoCache can compare state between nodes, but if there is a conflict there probably
needs to be a callback to the application to allow it to decide what to do.
* "Merge state" is more useful to http sessions that don't use FIELD
granularity (i.e. don't use PojoCache). Web sessions are supposed to be managed by
only one node, so the difficulty of deciding which node's state is correct is reduced.
The stored sessions have a timestamp and counter stored in the root node for the sessions
subtree. If JBC provided a callback to the JBossCacheManager, the JBCM could easily
determine which server's data to retain for each session. Bela pointed out that this
reconciliation process could also be done "off-line" via a special service that
multiplexes on the cache's channel -- i.e. JBC doesn't have to be directly
involved.
* "Read-only" is a possible option for some webapps that don't mind throwing
exceptions. E.g. a blog site wants people to be able to continue reading the blog, and
doesn't mind if update functions are disrupted during the network partition.
* Entity caching is a tough problem. At first, something like the "primary
partition" approach seems good; just dump cached state when the merge occurs and
force Hibernate to repopulate the cache from the db. Problem is that during the partition
itself the caches can become invalid. Two sub-partitions of {A, B, C} and {D, E} will not
know about each others' updates, and thus Hibernate can read stale data from the
cache. Variations on the "read-only" approach where {D, E} no longer provides
cached data don't solve the problem, as Hibernate on D or E can still update the
database, in which case the {A, B, C} caches are invalid.
For this kind of situation, having the application become unavailable on {D, E} is
something we need to consider.
This is just a quick summary; I'm sure I missed stuf.
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4079307#...
Reply to the post :
http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&a...