[infinispan-dev] heuristic transactions & failure recovery

Mon Apr 6 05:18:43 EDT 2009

On 4 Apr 2009, at 16:16, Mircea Markus wrote:

> Hi,
>
> Current implementation of tx in JBC/infinispan might result in
> heuristic transactions: e.g. if the coordinator cannot send a
> commit message (2nd phase of 2PC) within a given timeout to some
> of the participants, this might result in data being committed on
> some nodes and rolled back on others.

?  If the coord (and I assume you mean the transaction coordinator,  
not the JGroups channel coordinator) doesn't broadcast a commit, none  
of the other nodes would have committed this state.  I don't see how  
you have a situation where it is committed on some and rolled back on  
others.

Perhaps you mean the case where the tx coordinator has broadcast a
commit and dies after some nodes have received it but before all
have, and you are not using multicast (if you are, they all receive
the commit message at the same time).  But we recommend you use
multicast anyway, so I'm not sure this is such a problem.
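
To make that scenario concrete, here is a minimal Java sketch of it.
Everything in it is hypothetical (the Participant class, the run()
method, the crashAfter parameter); none of it is JBC, Infinispan or
JGroups API.  It assumes commits are sent one node at a time and that
a prepared node which never hears the commit times out and rolls
back, which is the behaviour Mircea describes.

```java
import java.util.ArrayList;
import java.util.List;

class PartialCommitSketch {

    enum State { PREPARED, COMMITTED, ROLLED_BACK }

    /** Hypothetical participant; not an Infinispan or JGroups class. */
    static final class Participant {
        final String name;
        State state = State.PREPARED;

        Participant(String name) {
            this.name = name;
        }
    }

    /**
     * Simulates the 2nd phase of 2PC over unicast. The coordinator sends
     * commit to one participant at a time and "dies" after crashAfter sends.
     * Assumption: participants that never receive the commit time out and
     * roll back.
     */
    static List<Participant> run(int participantCount, int crashAfter) {
        List<Participant> nodes = new ArrayList<>();
        for (int i = 0; i < participantCount; i++) {
            nodes.add(new Participant("node" + i));
        }
        for (int i = 0; i < nodes.size(); i++) {
            Participant p = nodes.get(i);
            if (i < crashAfter) {
                p.state = State.COMMITTED;   // commit arrived before the crash
            } else {
                p.state = State.ROLLED_BACK; // commit never arrived, timed out
            }
        }
        return nodes;
    }

    /** True if some participants committed while others rolled back. */
    static boolean isMixedOutcome(List<Participant> nodes) {
        boolean sawCommit = false;
        boolean sawRollback = false;
        for (Participant p : nodes) {
            sawCommit |= p.state == State.COMMITTED;
            sawRollback |= p.state == State.ROLLED_BACK;
        }
        return sawCommit && sawRollback;
    }

    public static void main(String[] args) {
        for (Participant p : run(3, 1)) {
            System.out.println(p.name + " -> " + p.state);
        }
    }
}
```

With run(3, 1) one node ends up committed and two rolled back, so
isMixedOutcome() returns true.  With run(3, 3) or run(3, 0) every node
agrees, which is what you would expect if all nodes receive (or miss)
the message together.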

> Even worse, there is no way to take action and recover from the
> failure. Would it make sense to have a tx failure recovery mechanism
> in infinispan?

Well, it depends.  If it is used as a cache for a db, then "recovery"  
is to just empty the cache.  Otherwise, if you want to treat it as a  
distributed in-memory db, "recovery" here would mean emptying the  
cache instance in question, and doing a state transfer from a  
neighbour (REPL) or re-hashing keys (DIST).
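
As a rough illustration of the two simpler cases, the sketch below
models each cache instance as a plain java.util.Map.  The class and
method names are made up for this mail and are not Infinispan API; a
real state transfer would go over the network, and the DIST
re-hashing case is not covered here.

```java
import java.util.HashMap;
import java.util.Map;

class RecoverySketch {

    /** Cache in front of a db: "recovery" is just emptying the cache. */
    static void recoverAsDbCache(Map<String, String> suspect) {
        suspect.clear();
    }

    /**
     * Replicated in-memory store: empty the suspect instance, then copy
     * the full state from a neighbour that is assumed to be healthy.
     */
    static void recoverFromNeighbour(Map<String, String> suspect,
                                     Map<String, String> neighbour) {
        suspect.clear();
        suspect.putAll(neighbour);
    }

    public static void main(String[] args) {
        Map<String, String> suspect = new HashMap<>();
        suspect.put("account-1", "stale value");

        Map<String, String> neighbour = new HashMap<>();
        neighbour.put("account-1", "committed value");

        recoverFromNeighbour(suspect, neighbour);
        System.out.println(suspect);
    }
}
```

Note that this only works if you know which instance is the suspect
one and which neighbour holds the correct state.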

> I'm referring here to something similar to the way DBs work, i.e.
> based on persistent tx logs, external notifications etc. Even
> though I didn't see any such request on the forums, I guess such a
> feature is mandatory for certain systems, e.g. a financial
> application. Wdyt?

Persistent tx logs can be just as error-prone, unless you checkpoint  
open files to disk via OS system calls to ensure all kernel and  
hardware caches are flushed.  But this is *very* slow.
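
For reference, this is roughly what forcing every log record to disk
looks like in Java.  It is a sketch only: the class name, the method
names and the record format are invented for this mail.
FileChannel.force(true) asks the OS to write both content and
metadata to the storage device; whether the device's own cache is
flushed as well depends on the OS and the hardware.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class DurableTxLogSketch {

    /** Appends one record and forces it to disk before returning. */
    static void appendDurably(Path log, String record) {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(
                    (record + "\n").getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // The expensive part: one forced sync per record.
            ch.force(true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes two hypothetical records to a temp file and reads them back. */
    static List<String> demo() {
        try {
            Path log = Files.createTempFile("txlog", ".log");
            appendDurably(log, "tx-1 PREPARED");
            appendDurably(log, "tx-1 COMMITTED");
            List<String> lines = Files.readAllLines(log, StandardCharsets.UTF_8);
            Files.delete(log);
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Every appendDurably() call blocks until the sync completes, so the
cost is paid on every single transaction record.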

AFAIK the way DBs do this - including Oracle - is to checkpoint at  
intervals, but this still allows for windows where your persistent tx  
log could be out of date or corrupt.

Cheers
--
Manik Surtani
Lead, JBoss Cache
http://www.jbosscache.org
manik at jboss.org
