[infinispan-dev] Rethinking asynchronism in Infinispan

Wed Jan 13 09:22:01 EST 2010

So I've been spending some time thinking about how we deal with async tasks in Infinispan, both from an API perspective as well as an implementation detail, and wanted to throw a few ideas out there.

First, let's understand the 4 most expensive things that happen in Infinispan, either simply expensive or things that could block under high contention (in descending order): RPC calls, marshalling, CacheStore and locking.

We deal with asynchronism in a somewhat haphazard way at the moment, each of these functions receiving a somewhat different treatment:

1) RPC: using JGroups' ResponseMode of waiting for none.
2) Marshalling: using an async repl executor to take this offline.
3) CacheStore: using an AsyncStore wrapper which places tasks in an executor.
4) Locking: nothing.

and to add to it,

5) READs are never asynchronous.  E.g., no such thing as an async GET - even if it entails RPC or a CacheStore lookup (which may be a remote call like S3!)

The impact of this approach is that end users never really get to benefit from the general asynchronism in place, and still need to configure stuff in several different places.  And internally, it makes dealing with internal APIs hard.

Externally, this is quite well encapsulated in the Cache interface, by offering async methods such as putAsync(), etc. so there would be little to change here.  They return Futures, parameterized to the actual method return type, e.g., putAsync returns Future<V> in a parameterized Cache<K, V>.  (More precisely, they return a NotifyingFuture, a sub-interface of Future that allows attaching listeners, but that's a detail.)

So I think we should start with that.  The user receives a Future.  This Future is an aggregate Future, which should aggregate and block based on several sub-Futures, one for each of the tasks (1 ~ 4) outlined above.  Now what is the impact of this?  Designing such a Future is easy enough, but how would this change internal components?
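To make the aggregate Future idea concrete, here is a rough sketch of what such a class could look like. The names AggregateFuture and AggregateFutureDemo are made up for illustration and are not existing Infinispan classes; the demo uses CompletableFuture (so it assumes Java 8 or later) purely as a convenient way to complete sub-Futures by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: a Future that is done only once every sub-Future
// (e.g. marshalling + RPC, CacheStore write, lock acquisition) is done.
final class AggregateFuture<V> implements Future<V> {
    private final Future<V> result;        // sub-Future carrying the return value
    private final List<Future<?>> others;  // remaining sub-Futures to wait on

    AggregateFuture(Future<V> result, List<Future<?>> others) {
        this.result = result;
        this.others = others;
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
        // Attempt to cancel every sub-Future; report true only if all were cancelled.
        boolean cancelled = result.cancel(mayInterruptIfRunning);
        for (Future<?> f : others) {
            cancelled &= f.cancel(mayInterruptIfRunning);
        }
        return cancelled;
    }

    @Override
    public boolean isCancelled() {
        // Simplification: treat the aggregate as cancelled if any part was cancelled.
        if (result.isCancelled()) return true;
        for (Future<?> f : others) {
            if (f.isCancelled()) return true;
        }
        return false;
    }

    @Override
    public boolean isDone() {
        if (!result.isDone()) return false;
        for (Future<?> f : others) {
            if (!f.isDone()) return false;
        }
        return true;
    }

    @Override
    public V get() throws InterruptedException, ExecutionException {
        for (Future<?> f : others) {
            f.get();  // block until each sub-task finishes, propagating failures
        }
        return result.get();
    }

    @Override
    public V get(long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        // Share one deadline across all sub-Futures.
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        for (Future<?> f : others) {
            f.get(Math.max(0L, deadline - System.nanoTime()), TimeUnit.NANOSECONDS);
        }
        return result.get(Math.max(0L, deadline - System.nanoTime()), TimeUnit.NANOSECONDS);
    }
}

final class AggregateFutureDemo {
    public static void main(String[] args) throws Exception {
        CompletableFuture<String> rpc = new CompletableFuture<>();
        CompletableFuture<Void> storeWrite = new CompletableFuture<>();
        List<Future<?>> others = new ArrayList<>();
        others.add(storeWrite);
        Future<String> put = new AggregateFuture<>(rpc, others);

        rpc.complete("previous value");
        System.out.println("done after RPC only: " + put.isDone());
        storeWrite.complete(null);
        System.out.println("done after store write: " + put.isDone() + ", value: " + put.get());
    }
}
```

The one design choice worth noting is that a single sub-Future supplies the return value while the others only gate completion; whether that is the right split for Infinispan is an open question.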

1)  RPC.  Async RPC is, IMO, broken at the moment.  It is unsafe in that it offers no guarantees that the calls are received by the recipient, and you have no way of knowing.  So RPC should always be synchronous, but wrapped in a Future so that it is taken offline and the Future can be checked for success.

2) Marshalling happens offline at the moment and a Future is returned as it stands, but this could probably be combined with RPC into a single Future, since this step is logically before RPC, and RPC relies on marshalling to complete.

3) The CacheStore interface should only return parameterized Futures.  Implementations that do not offer any async support may simply return a wrapped value.  Calls to this interface that require immediate results would simply poll the Future.  But in all other cases - ranging from the AsyncStore wrapper to more sophisticated async backends such as JClouds, we get to harness the async nature offered by the impl.

4) Locking could be async, but understanding context is important here.  We currently use 2 different types of Locks: JDK reentrant locks for non-transactional calls, or OwnableReentrantLock for transactional calls where the lock owner is not the thread but instead the GlobalTransaction.  For this to work, any invocation context should have a reference to the original invoking thread, OwnableReentrantLocks should always be used, and the owner set accordingly, to ensure an Executor thread doesn't gain ownership of a lock instead.
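As a rough illustration of points 1 to 3, the sketch below shows marshalling and a synchronous RPC running as one task behind a single Future, plus a CacheStore-style interface that only returns Futures. Everything here is hypothetical: FutureCacheStore, InMemoryFutureStore, SyncTransport and FutureRpc are invented names, not the real CacheStore or transport APIs, and the "marshalling" is just a String-to-bytes conversion standing in for the real thing. It assumes Java 8 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical interface: every CacheStore-like call returns a Future.
interface FutureCacheStore<K, V> {
    Future<V> load(K key);
    Future<Void> store(K key, V value);
}

// A store with no async support simply returns an already-completed Future.
final class InMemoryFutureStore<K, V> implements FutureCacheStore<K, V> {
    private final Map<K, V> data = new ConcurrentHashMap<>();

    @Override
    public Future<V> load(K key) {
        return CompletableFuture.completedFuture(data.get(key));
    }

    @Override
    public Future<Void> store(K key, V value) {
        data.put(key, value);
        return CompletableFuture.<Void>completedFuture(null);
    }
}

// Hypothetical transport: a blocking call that returns once the recipient replies.
interface SyncTransport {
    byte[] invokeRemotely(byte[] payload) throws Exception;
}

final class FutureRpc {
    private final ExecutorService executor;
    private final SyncTransport transport;

    FutureRpc(ExecutorService executor, SyncTransport transport) {
        this.executor = executor;
        this.transport = transport;
    }

    // Marshalling and the synchronous RPC run as one task, so one Future covers both.
    Future<String> invoke(String command) {
        Callable<String> task = () -> {
            byte[] payload = command.getBytes(StandardCharsets.UTF_8); // stand-in for marshalling
            byte[] reply = transport.invokeRemotely(payload);          // blocks, but off the caller's thread
            return new String(reply, StandardCharsets.UTF_8);
        };
        return executor.submit(task);
    }
}

final class FutureComponentsDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            FutureRpc rpc = new FutureRpc(executor, payload -> payload); // echo "recipient"
            Future<String> reply = rpc.invoke("PUT k v");

            FutureCacheStore<String, String> store = new InMemoryFutureStore<>();
            store.store("k", "v").get();  // callers needing an immediate result just block
            System.out.println(reply.get() + " / " + store.load("k").get());
        } finally {
            executor.shutdown();
        }
    }
}
```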

The end result of all this is that async API calls will be truly non-blocking, and will be able to harness as much parallelism as possible in the system.  Another effect is that we'd have a cleaner internal API which is specifically designed with async operations in mind, and converting any of the async calls to sync calls is trivial (Future.get()) anywhere along the invocation chain.  Also, I think once we have this in place, any async transport modes (REPL_ASYNC, DIST_ASYNC) should be deprecated in favour of an async API, since you get the ability to check that a call completes even though it happens offline.
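A small sketch of that last point, i.e. turning an async call into a sync one with Future.get() and finding out that an offline call failed. AsyncToSyncDemo and failingPutAsync are hypothetical names; the failing task merely simulates a recipient that never acknowledges, and is not how Infinispan reports such errors.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class AsyncToSyncDemo {
    // Hypothetical stand-in for an async cache operation whose remote side fails.
    static Future<String> failingPutAsync(ExecutorService executor) {
        Callable<String> task = () -> {
            throw new IllegalStateException("recipient did not acknowledge");
        };
        return executor.submit(task);
    }

    // Converting an async call to a sync one: block on get() and surface the failure.
    static String sync(Future<String> future) throws InterruptedException {
        try {
            return future.get();
        } catch (ExecutionException e) {
            throw new IllegalStateException("async call failed", e.getCause());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> put = failingPutAsync(executor);
            // ... the caller is free to do other work here ...
            try {
                sync(put);
            } catch (IllegalStateException e) {
                // Unlike a fire-and-forget transport mode, the failure is visible.
                System.out.println(e.getMessage() + ": " + e.getCause().getMessage());
            }
        } finally {
            executor.shutdown();
        }
    }
}
```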

And a disclaimer: I have no plans to implement anything like this soon; I was just throwing this out for comment and distillation.  :)

Cheers
--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org