[jboss-dev-forums] [Design of JBossCache] - Re: JBCACHE-816 vs JGRP-105
galder.zamarreno@jboss.com
do-not-reply at jboss.com
Wed Nov 7 11:46:27 EST 2007
"manik.surtani at jboss.com" wrote : "galder.zamarreno at jboss.com" wrote : A month ago, Bela, Vladimir, Jimmy and myself were discussing my suggestions proposed in this forum thread. Apologies if I haven't been able to post the notes earlier.
| |
|
| I was wondering where this went! :-)
|
I was hiding it secretly ;)
My fault :$
"manik.surtani at jboss.com" wrote : "galder.zamarreno at jboss.com" wrote :
  | | Interceptor vs listener vs HA-Singleton:
  | | - based on a configuration option, an interceptor (preferred) or a cache listener could be created that encapsulates the solution; this is preferred to a standalone solution based on HA-Singleton.
  | | - an interceptor allows for greater flexibility and simplicity than a cache listener solution in terms of relaying data between dcs (data centres), filtering, transactional work and state transfer propagation.
  | | - this interceptor is only active if the node is the coordinator of the local data centre; use a technique similar to the one the Singleton Store Cache Loader uses.
| |
|
  | Why is the interceptor approach better than the cache listener one? I ask because, from an integration perspective, a cache listener is far less tightly coupled to JBC internals than an interceptor.
|
| In terms of achieving goals, I don't see why this is hard:
|
| 1) A CL should be registered on every cache instance.
  | 2) Querying viewChange events will tell the CL if it is the intra-ds coord (and hence whether or not to relay stuff to the inter-ds group).
  | 3) The CL registers a channel for the inter-ds group and listens for method invocations, which it applies to the cache (if it is the coord) or temporarily caches in a Collection and removes on commit (if it is 2nd in line).
|
+1 on the basis that transactional callbacks are now available.
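In case it helps, here is a very rough sketch of what such a listener could look like. It assumes the JBC 2.x notification annotations (@CacheListener, @ViewChanged, @NodeModified) and a plain JGroups channel joined for the inter-dc group; the class name, constructor and serialize() are placeholders of mine, not existing code.

// Rough sketch only, not a working implementation. Assumes the JBC 2.x
// notification annotations and a JGroups 2.x channel; serialize() is a placeholder.
import org.jboss.cache.notifications.annotation.CacheListener;
import org.jboss.cache.notifications.annotation.NodeModified;
import org.jboss.cache.notifications.annotation.ViewChanged;
import org.jboss.cache.notifications.event.NodeModifiedEvent;
import org.jboss.cache.notifications.event.ViewChangedEvent;
import org.jgroups.Address;
import org.jgroups.Channel;
import org.jgroups.Message;

@CacheListener
public class InterDcRelayListener {

    private final Channel interDcChannel;   // channel joined for the inter-dc group
    private final Address localAddress;     // this node's address in the intra-dc cluster
    private volatile boolean coordinator;   // are we the intra-dc coordinator?

    public InterDcRelayListener(Channel interDcChannel, Address localAddress) {
        this.interDcChannel = interDcChannel;
        this.localAddress = localAddress;
    }

    @ViewChanged
    public void viewChanged(ViewChangedEvent event) {
        // Coordinator == first member of the new view.
        coordinator = localAddress.equals(event.getNewView().getMembers().get(0));
    }

    @NodeModified
    public void nodeModified(NodeModifiedEvent event) {
        if (!coordinator || event.isPre()) {
            return;                         // only the coordinator relays, and only post-notification
        }
        try {
            // Asynchronously forward the modification to the other data centres.
            interDcChannel.send(new Message(null, null, serialize(event)));
        } catch (Exception e) {
            // Best-effort: relaying is async, so log and carry on.
        }
    }

    private byte[] serialize(NodeModifiedEvent event) {
        // Placeholder: marshal the Fqn and the modified data into a byte[].
        return new byte[0];
    }
}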
"manik.surtani at jboss.com" wrote : "galder.zamarreno at jboss.com" wrote :
| | Relaying:
| | - in the context of this use case, relaying refers to inter dc communication.
  | | - relaying could be done periodically, or per transaction/operation (put, remove, clear, etc.).
  | | - relaying would be asynchronous, as strict data consistency is not paramount and asynchrony gives better performance.
| |
|
| Could use a replication queue.
|
Indeed. I'll look at the replication queue solution currently available within JBC and see whether the listener could use it. Regardless, queue-based replication (flushed by time or size) seems to fit this use case better than per-operation/tx replication.
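For illustration, a minimal sketch of the flush-by-size-or-interval idea, along the lines of JBC's replication queue. None of these class or method names exist in JBC; they are purely illustrative.

// Rough sketch of a size/time-bounded relay queue for inter-dc replication.
import java.util.ArrayList;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;

public class RelayQueue {

    private final List<Object> pending = new ArrayList<Object>();
    private final int maxElements;             // flush when this many modifications queue up
    private final Timer timer = new Timer(true);

    public RelayQueue(int maxElements, long flushIntervalMillis) {
        this.maxElements = maxElements;
        // Time-based flush: push whatever has accumulated every interval.
        timer.schedule(new TimerTask() {
            public void run() { flush(); }
        }, flushIntervalMillis, flushIntervalMillis);
    }

    public synchronized void add(Object modification) {
        pending.add(modification);
        if (pending.size() >= maxElements) {
            flush();                            // size-based flush
        }
    }

    public synchronized void flush() {
        if (pending.isEmpty()) return;
        List<Object> batch = new ArrayList<Object>(pending);
        pending.clear();
        relay(batch);                           // async send over the inter-dc channel
    }

    private void relay(List<Object> batch) {
        // Placeholder: hand the batch to the inter-dc channel / relayer.
    }
}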
"manik.surtani at jboss.com" wrote : "galder.zamarreno at jboss.com" wrote :
  | | - the inter-dc relay ping-pong effect should be avoided, so that inter-dc changes applied locally do not bounce back to other dcs. Requires further thought on how to achieve this.
| |
|
  | The Cache Listener would only relay stuff to other DCs if it is marked as being in the "active" dc, given that the dc switchover would be manual. This would prevent the "dc ping pong" you describe.
|
I've thought about such a solution: by default, the coordinator of the inter-dc cluster is the active node, i.e. the relayer. The user can change this manually at runtime to suit their needs. Any change to which node is the active relayer would require consensus in the cluster.
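A minimal sketch of the guard I have in mind: relay only when this node is the agreed active relayer, and never relay a change that itself arrived from another dc. The active-relayer flag and the thread-local marker are assumptions of mine, not existing JBC code.

// Rough sketch of the "don't bounce it back" guard.
public class RelayGuard {

    private volatile boolean activeRelayer;                  // set via cluster-wide consensus
    private final ThreadLocal<Boolean> applyingRemote =
            new ThreadLocal<Boolean>() {
                protected Boolean initialValue() { return Boolean.FALSE; }
            };

    public void setActiveRelayer(boolean active) {
        this.activeRelayer = active;
    }

    /** Called while applying a modification received from another dc. */
    public void applyRemote(Runnable applyToLocalCache) {
        applyingRemote.set(Boolean.TRUE);
        try {
            applyToLocalCache.run();                          // local update only, never re-relayed
        } finally {
            applyingRemote.set(Boolean.FALSE);
        }
    }

    /** Should this local modification be relayed to the other dcs? */
    public boolean shouldRelay() {
        return activeRelayer && !applyingRemote.get();
    }
}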
"manik.surtani at jboss.com" wrote : "galder.zamarreno at jboss.com" wrote :
| | State Transfer:
  | | - if a new node starts and it's the first in the local data centre, the inter data centre interceptor is active and joins the mux channel, potentially requesting a state transfer from the coordinator. The coordinator of the inter-dc cluster does not necessarily have to be the primary relayer, but inter-dc cluster members should have the same data.
  | | - if a new node starts and it's not the first in the local data centre, standard local state transfer rules apply.
  | | - streaming state transfer at the inter data centre level would require further thought, as potential intermediate firewalls would come into play.
|
| This needs careful thought, especially where blocking, WAN links and large states are concerned. An entire new ds coming up could block the inter-ds group for quite a while, and this could throttle the inter-ds proxy on the active ds. Even if this is async, if we're talking of several GBs of data over a WAN link, this could mean the inter ds group being blocked for hours, which could easily lead to the inter-ds proxy throwing OOMEs on queued async calls.
|
| I think we need something better WRT state transfer before we can think of applying this to a WAN scenario. Perhaps something where state is chunked and delivered in several bursts of, for example, under 50MB at a time, so that the group isn't blocked for too long. I don't have a solution here (yet), just thinking aloud - I know it's not an easy problem to solve.
If we have the concept of a relayer or active node in the inter-dc cluster, could a new node request the state from a non-active node? We can't guarantee that all inter-dc nodes will have the same state, but at least, by requesting it from a non-active inter-dc node, we avoid the active node having to produce the state. This does not resolve the issue that the active node will not be able to forward messages while the state transfer is ongoing in the inter-dc cluster, but it at least reduces the burden on it. If there's no non-active node in the inter-dc cluster, then we'd have no choice but to request state from the active node.
This solution assumes that state transfer can be requested from any node in the cluster, not necessarily the coordinator. I'm not sure whether that's possible at the moment; it would probably need some coding in the listener to negotiate this.
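To illustrate the two ideas together (prefer a non-active provider, and push state in bounded bursts so no single send blocks the group for long), here is a rough sketch. Only the JGroups Address type is real; the class, CHUNK_SIZE and method names are illustrative only, not a proposed implementation.

// Rough sketch: choose a non-active state provider and stream state in bounded chunks.
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import org.jgroups.Address;

public class ChunkedStateSender {

    private static final int CHUNK_SIZE = 50 * 1024 * 1024;   // ~50MB bursts, as suggested above

    /** Pick a state provider that is not the active relayer, if one exists. */
    public static Address chooseProvider(List<Address> members, Address activeRelayer) {
        for (Address member : members) {
            if (!member.equals(activeRelayer)) {
                return member;                                  // any non-active node will do
            }
        }
        return activeRelayer;   // no choice: fall back to the active node
    }

    /** Stream the serialized state in bounded chunks so the group isn't blocked for hours. */
    public static void sendState(InputStream state, OutputStream out) throws IOException {
        byte[] chunk = new byte[CHUNK_SIZE];
        int read;
        while ((read = state.read(chunk)) != -1) {
            out.write(chunk, 0, read);        // one bounded burst per iteration
            out.flush();                      // let other traffic interleave between bursts
        }
    }
}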
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4102618#4102618
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=4102618
More information about the jboss-dev-forums mailing list