[infinispan-dev] Rethinking asynchronism in Infinispan

Wed Jan 13 12:13:21 EST 2010

Manik Surtani wrote:
> So I've been spending some time thinking about how we deal with async 
> tasks in Infinispan, both from an API perspective as well as an 
> implementation detail, and wanted to throw a few ideas out there.
>
> First, let's understand the 4 most expensive things that happen in 
> Infinispan, either simply expensive or things that could block under 
> high contention (in descending order): RPC calls, marshalling, 
> CacheStore and locking

In my experience, RPC calls (at least async ones) should be much less 
costly than CacheStore and *un*-marshalling (marshalling *is* usually 
fast). Re CacheStore: writing / reading to a disk is slower than even a 
network round trip.

> We deal with asynchronism in a somewhat haphazard way at the moment, 
> each of these functions receiving a somewhat different treatment:
>
> 1) RPC: Using JGroups' ResponseMode of waiting for none.
> 2) Marshalling: using an async repl executor to take this offline

The problem here is that you're pushing the problem of marshalling 
further down the line. *Eventually* data has to be marshalled, and 
somebody *has* to block ! IIRC, you used a bounded queue to place 
marshalling tasks onto, so for load peaks this was fine, but for 
constant high load, someone will always block on the (full) queue.
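
To illustrate the bounded-queue point, here is a minimal sketch in plain
java.util.concurrent. It is not Infinispan's async repl executor; the
class and method names are made up for illustration. It assumes the
worst case of a consumer that never catches up, so the queue is never
drained:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for a queue of pending marshalling tasks.
class BoundedMarshallingQueue {

    // Submits 'submitted' tasks to a queue of the given capacity that
    // nobody drains, and returns how many were accepted.
    static int acceptedBeforeFull(int capacity, int submitted) {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < submitted; i++) {
            Runnable marshallingTask = () -> { /* marshal + send */ };
            // offer() returns false immediately when the queue is full;
            // put() would block the calling thread at this point instead.
            if (queue.offer(marshallingTask)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(acceptedBeforeFull(4, 10));
    }
}
```

Once the queue is full, every further producer either blocks (put) or
has to deal with a refused task (offer), which is the constant-load
case described above.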

> 3) Use an AsyncStore wrapper which places tasks in an executor

Similar issue to above: at some point the thread pool might be full. 
Then you need to start discarding tasks, or block.

Both (2) and (3) handle temporary spikes well though...
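
A minimal sketch of the "discard or block" choice, using a plain
ThreadPoolExecutor rather than the actual AsyncStore code (the class
name and the sizes are assumptions for illustration). One worker
thread plus a queue of one is saturated on purpose, and the rejected
tasks are counted:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an executor that performs store writes.
class SaturatedStorePool {

    // Submits 'submitted' slow tasks to a pool with one thread and a
    // queue of capacity one; returns how many tasks were rejected.
    static int countRejected(int submitted) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                // AbortPolicy throws on overflow (the "discard" case).
                // CallerRunsPolicy would make the caller do the work
                // itself, which amounts to blocking the caller.
                new ThreadPoolExecutor.AbortPolicy());
        int rejected = 0;
        try {
            for (int i = 0; i < submitted; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulate a slow store write
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the pending writes finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(countRejected(4));
    }
}
```

The first task occupies the only thread, the second sits in the queue,
and everything after that is rejected until a write completes.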

> 4) Nothing
>
> and to add to it,
>
> 5) READs are never asynchronous. E.g., no such thing as an async GET - 
> even if it entails RPC or a CacheStore lookup (which may be a remote 
> call like S3!)
>
> The impact of this approach is that end users never really get to 
> benefit from the general asynchronism in place

What is this general asynchronism ? From my view of Infinispan, I don't 
see a bias towards asynchronous execution, but I see an API which 
supports both async and sync execution.

> and still need to configure stuff in several different places. And 
> internally, it makes dealing with internal APIs hard.
>
> Externally, this is quite well encapsulated in the Cache interface, by 
> offering async methods such as putAsync(), etc. so there would be 
> little to change here. They return Futures, parameterized to the 
> actual method return type, e.g., putAsync returns Future<V> in a 
> parameterized Cache<K, V>. (More precisely, they return a 
> NotifyingFuture, a sub-interface of Future that allows attaching 
> listeners, but that's a detail.)
>
> So I think we should start with that. The user receives a Future. This 
> Future is an aggregate Future, which should aggregate and block based 
> on several sub-Futures, one for each of the tasks (1 ~ 4) outlined 
> above. Now what is the impact of this? Designing such a Future is easy 
> enough, but how would this change internal components?

Are you suggesting the external API remains the same, but internally 
futures are used ? Or do you suggest to make use of futures mandatory ?

Can you show a pseudocode sample ?

> 1) RPC. Async RPC is, IMO, broken at the moment. It is unsafe in that 
> it offers no guarantees that the calls are received by the recipient

No, JGroups guarantees message delivery ! Besides that, async APIs *are* 
by definition fire-and-forget (JMS topics), so IMO this is not broken !

Or do you have something akin to persistent JMS messages in mind ?

> and you have no way of knowing.

Yes, but that's the name of the game with *async* RPCs ! If you want to 
know, use sync RPCs...

> So RPC should always be synchronous, but wrapped in a Future so that 
> it is taken offline and the Future can be checked for success

Who would check on the future ? Pseudocode like this:

Future<String> future = cache.putWithFuture("name", "value");
String prev_val = future.get();

doesn't help, as the caller does no other work between issuing the call 
and checking the future, so it simply blocks as a sync call would.

Sync RPCs are an order of magnitude slower than async ones, so unless 
you call 1000 sync RPCs, get 1000 futures and then check on the futures, 
I don't see the benefits of this. And code like this will be harder to 
write.
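
For reference, the "issue many calls, then check the futures" pattern
would look roughly like the sketch below. This is not the Infinispan
API: putAsync here is a hypothetical stand-in backed by a local map
and a thread pool, used only to show the shape of the calling code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BatchedFutures {

    // Hypothetical stand-in for an async put: runs the (simulated)
    // sync call on a pool and hands back a Future for its result.
    static Future<String> putAsync(ExecutorService pool,
                                   Map<String, String> store,
                                   String key, String value) {
        Callable<String> call = () -> store.put(key, value);
        return pool.submit(call);
    }

    // Issues n puts first, then checks every future.
    // Returns the number of calls that completed successfully.
    static int putAllThenWait(int n) {
        Map<String, String> store = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                futures.add(putAsync(pool, store, "key" + i, "value" + i));
            }
            int completed = 0;
            for (Future<String> f : futures) {
                try {
                    f.get(); // blocks until this call is done
                    completed++;
                } catch (ExecutionException e) {
                    // the call failed; the caller has to handle it here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return completed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(putAllThenWait(1000));
    }
}
```

The overlap only pays off because all calls are issued before the
first get(); the error handling per future is the extra burden on the
caller mentioned above.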

> 2) Marshalling happens offline at the moment and a Future is returned 
> as it stands, but this could probably be combined with RPC to a single 
> Future since this step is logically before RPC, and RPC relies on 
> marshalling to complete

Interesting, can you elaborate more ?

-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss