[infinispan-dev] asynchronous change manager for high load Lucene

Bela Ban bban at redhat.com
Mon Mar 29 01:33:20 EDT 2010


I'm commenting on this without being a Lucene / Hibernate Search expert, 
so bear with me...

Sanne Grinovero wrote:
> The Lucene Directory does well in read-mostly situations, and is a
> nice-to-have for huge indexes, making them easy to sync in
> dynamic-topology clusters.
> But when many participant nodes can all apply changes, a global index
> lock needs to be acquired; this is safely achievable using the
> LockFactory implementations provided in the lucene-directory package,
> but of course it doesn't scale.
>
> In practice, it's worse than a performance problem: Lucene expects
> full ownership of all resources, so the IndexWriter implementation
> doesn't expect the Lock to time out and has no notion of fairness in
> the wait process. If your architecture does something like "each
> node can ask for the lock and apply changes" without some external
> coordination, this only works while contention on the lock stays low
> enough; if it piles up, it will blow up the application with
> exceptions during indexing.
>
> The best solution I've seen so far is what Emmanuel implemented years
> ago for Hibernate Search: using JMS to send changes to a single master
> node. Currently I think the state-of-the-art installation should
> combine such a queue-like solution, delegating all changes to a single
> node, with that node applying the changes to an Infinispan Directory -
> thus making the changes visible to all other nodes through efficient
> Infinispan distribution/replication.

By sending the changes to a single master, who then applies them, you 
essentially pick a total order: all changes are applied in the same 
order across the cluster. Total order eliminates the need to acquire a 
lock, because every node sees the changes from the different nodes in 
the same order, so there is nothing left for a lock to serialize.

This could also be done by using the SEQUENCER protocol in JGroups: say 
A, B and C need to update an index. If B wanted to make an update, it 
would have to acquire the cluster-wide lock, update the index, replicate 
across the cluster, and then release the lock. The lock is needed to make 
sure that B's changes don't overlap with A's or C's changes.

However, if we guaranteed that everybody received B's modification, 
followed by A's modification, followed by C's modification, we would not 
need a lock at all!

In other words, if we have to apply a change across a cluster, total 
order lets a single message replace a two-phase lock-unlock protocol.

> Now I would like to use Infinispan to replace the JMS approach,
> especially as in cloud environments it's useful for the different
> participants that make up the service to be equally configured:
> having a Master Writer node is fine as long as it's auto-elected.
> (not mandatory, just very nice to have)
>
> The problems to solve:
>
> A) Locking
> The current locking solution is implemented by atomically adding a
> marker value in the cache. I can't use transactions as the lock could
> span several transactions and must be visible.

The issue with this is that you (ab)use the cache as a messaging / 
signalling system! This is not recommended. A better approach would be 
to use a separate channel for communication, or even to reuse the 
channel Infinispan uses. The latter is something I've been discussing 
with Brian.
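
For concreteness, a minimal sketch of the marker-based lock pattern being 
described (the pattern discouraged above). It relies only on the fact 
that Infinispan's Cache implements ConcurrentMap; the class and key names 
are illustrative:

import java.util.concurrent.ConcurrentMap;

public class CacheMarkerLock {

    private static final String LOCK_KEY = "lucene-write.lock";

    private final ConcurrentMap<String, Object> cache; // e.g. an Infinispan Cache
    private final Object self;                         // this node's address

    public CacheMarkerLock(ConcurrentMap<String, Object> cache, Object self) {
        this.cache = cache;
        this.self = self;
    }

    /** Atomically installs the marker; only the first caller wins. */
    public boolean tryLock() {
        return cache.putIfAbsent(LOCK_KEY, self) == null;
    }

    /** Removes the marker only if we own it. If the owner crashes, the
     *  marker stays behind; exactly the failure mode discussed next. */
    public void unlock() {
        cache.remove(LOCK_KEY, self);
    }
}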

> It's Lucene's responsibility to clear this lock properly, and I can
> trust it to do so, as far as the code allows. But what happens if the
> node dies? Other nodes taking over the writer role should be able to
> detect the situation and remove the lock.

You get a new view and release all locks held by nodes which are not 
part of the new view. This is how it's implemented in JBossCache and 
Infinispan today.
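
A sketch of that cleanup with the JGroups API, assuming locks are 
recorded as in the hypothetical CacheMarkerLock above, with the owner's 
Address as the marker value:

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;

import org.jgroups.Address;
import org.jgroups.ReceiverAdapter;
import org.jgroups.View;

public class DeadOwnerLockReaper extends ReceiverAdapter {

    private final ConcurrentMap<String, Address> locks; // lock name -> owner

    public DeadOwnerLockReaper(ConcurrentMap<String, Address> locks) {
        this.locks = locks;
    }

    /** Called on every membership change: release locks whose owners
     *  are no longer part of the new view. */
    @Override
    public void viewAccepted(View newView) {
        for (Iterator<Map.Entry<String, Address>> it = locks.entrySet().iterator(); it.hasNext(); ) {
            Map.Entry<String, Address> entry = it.next();
            if (!newView.getMembers().contains(entry.getValue()))
                it.remove();
        }
    }
}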

> Proposals:
>
> - A1) We don't lock, but find a way to elect one and only one writer
> node. This would be very cool, but I have no idea how to implement it.

You could use the cluster view, which is the same across all nodes, and 
pick the first element in the list. Or you could run an agreement 
protocol, which deterministically elects a master.
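
A sketch of the first variant: the view's member list is identical on 
all nodes, so everyone derives the same master with no extra messages 
(re-evaluate on every view change):

import org.jgroups.Address;
import org.jgroups.JChannel;
import org.jgroups.View;

public final class ViewBasedElection {

    /** True if the local node is the writer: every node picks the same
     *  first element of the shared membership list. */
    public static boolean isMaster(JChannel channel) {
        View view = channel.getView();
        Address coordinator = view.getMembers().get(0);
        return coordinator.equals(channel.getAddress());
    }

    private ViewBasedElection() {}
}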

> - A2) We could store the node's address as the value in the marker
> object; if the address isn't part of the membership, the lock is
> cleared. (Can I trust the members view? The index will corrupt if
> changed by two nodes.)

Not so good; don't abuse the cache as a communication bus.

> Proposal:
>
> - B1) Use JMS or JGroups directly (Hibernate Search is currently
> capable of using both); again I face the problem of electing the
> IndexWriter node and getting the messages sent to the correct node.
> In both cases I would like to receive enough information from
> Infinispan to know where to send messages from the queue.

I'd use JGroups (of course I'm biased :-)), but JMS really has no notion 
of cluster views with which to elect a master. Plus, you could use 
SEQUENCER as a means to establish a global order between messages, so 
maybe the need to acquire a lock could be dropped altogether.



-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss


