[jboss-dev-forums] [Design of Clustering on JBoss (Clusters/JBoss)] - Re: Handling cluster state when network partitions occur

Thursday, 13 September 2007

I think the primary partition approach is best.  Having caches outside the primary
partition purge their in-memory state is probably the wrong path though, since as a
generic solution it cannot assume that every installation is backed by a shared database.

Caches shutting down would be my preferred option.  Perhaps block for a short period, in
the hope that the network heals, and then throw an exception after a timeout.  Perhaps a
specific exception - SplitBrainException or something - so that cache users such as HTTP
Replication can react by forcing an HTTP response like 410 (don't know if this is
possible - Brian?) such that the load balancer will treat the node as unavailable.  Once
the partition heals, the cache performs a state transfer to come up to speed with the
primary partition and is then made available to requests again.
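
To make the block-then-fail idea concrete, here is a minimal sketch in Java.  Everything in
it is hypothetical: SplitBrainException is only the name floated above, and PartitionGate
and its methods are invented for illustration, not existing JBoss Cache or JGroups API.  It
assumes some other component detects the partition and signals when state transfer is done.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical exception, named after the suggestion in this post. */
class SplitBrainException extends RuntimeException {
    SplitBrainException(String message) {
        super(message);
    }
}

/** Hypothetical gate that every cache operation passes through first. */
class PartitionGate {
    // A latch with count 0 never blocks, so the gate starts out open.
    private volatile CountDownLatch healed = new CountDownLatch(0);
    private final long timeoutMillis;

    PartitionGate(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called when this node finds itself outside the primary partition. */
    void partitionDetected() {
        healed = new CountDownLatch(1);
    }

    /** Called once the partition has healed and state transfer has finished. */
    void partitionHealed() {
        healed.countDown();
    }

    /** Blocks for up to the timeout, then throws if the node is still cut off. */
    void awaitAvailable() {
        try {
            if (!healed.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new SplitBrainException(
                        "not in primary partition after " + timeoutMillis + " ms");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new SplitBrainException("interrupted while waiting for partition to heal");
        }
    }
}
```

A caller such as HTTP session replication would catch SplitBrainException around its cache
access and translate it into whatever response makes the load balancer skip the node.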

Even the impact of incorrectly identifying a primary partition is low, since in the worst
case the larger partition is unresponsive while the smaller one keeps serving requests.  I
guess the real problem is more than one partition thinking it is primary.  :-)
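
One way to rule out two primaries, sketched below, is a strict-majority rule against a
fixed, configured cluster size.  This is my assumption about how it could be done, not
something settled in this thread; PrimaryPartitionPolicy and its parameters are hypothetical
names.  Two disjoint partitions cannot both hold more than half of the members, so at most
one can consider itself primary.

```java
/** Hypothetical policy: a partition is primary only with a strict majority. */
final class PrimaryPartitionPolicy {
    private final int expectedClusterSize;

    /** expectedClusterSize is the configured full membership, not the current view. */
    PrimaryPartitionPolicy(int expectedClusterSize) {
        if (expectedClusterSize <= 0) {
            throw new IllegalArgumentException("expectedClusterSize must be positive");
        }
        this.expectedClusterSize = expectedClusterSize;
    }

    /** True only if this partition sees more than half of the expected members. */
    boolean isPrimary(int visibleMembers) {
        return visibleMembers * 2 > expectedClusterSize;
    }
}
```

The cost is that an even split (2 and 2 out of 4, say) leaves no primary at all, which is
the safe failure mode but does mean an outage until the network heals.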

View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4084139#...

Reply to the post :
http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&a...
