I'm commenting on this without being a Lucene / Hibernate Search expert,
so bear with me...
Sanne Grinovero wrote:
The Lucene Directory does well in read-mostly situations, and is a
nice-to-have for huge indexes, making them easy to sync in
dynamic-topology clusters.
But when many participating nodes can all apply changes, a global
index lock needs to be acquired; this is safely achievable by using
the LockFactory implementations provided in the lucene-directory
package, but of course it doesn't scale.
In practice, it's worse than a performance problem: Lucene expects
full ownership of all resources, so the IndexWriter implementation
doesn't expect the Lock to time out and has no notion of fairness in
the wait process. If your architecture does something like "each
node can ask for the lock and apply changes" without some external
coordination, this only works when contention on this lock is low
enough; if it piles up, the application blows up with indexing
exceptions.
The best solution I've seen so far is what Emmanuel implemented years
ago for Hibernate Search: using JMS to send changes to a single master
node. Currently I think a state-of-the-art installation should combine
such a queue-like solution, delegating all changes to a single node,
with that node applying the changes to an Infinispan Directory - thus
making the changes visible to all other nodes through efficient
Infinispan distribution/replication.
By sending the changes to a single master, which then applies them, you
essentially pick a total order: all changes are applied in the same
order across the cluster. Total order eliminates the need to acquire a
lock, because the changes from different nodes will always be seen in
the same order.
This could also be done by using the SEQUENCER protocol in JGroups: say
A, B and C need to update an index. If B wanted to make an update, it
would have to acquire the cluster-wide lock, update the index, replicate
across the cluster, and then release the lock. The lock is needed to
make sure that B's changes don't overlap with A's or C's changes.
However, if we guaranteed that everybody would receive B's modification,
followed by A's modification, followed by C's modification, we would not
need a lock at all!
In other words, if we have to apply a change across a cluster with one
message, total order can replace a two-phase lock-unlock protocol.
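To make the idea concrete, here is a minimal sketch (hypothetical code, not the JGroups SEQUENCER API itself): a single sequencer node assigns one global order to the updates submitted by A, B and C, and every replica applies that same sequence, so all copies of the index converge without any cluster-wide lock.

```java
import java.util.ArrayList;
import java.util.List;

public class SequencerSketch {

    // The "coordinator": appends each incoming update to one global sequence.
    // In a real deployment this role would be played by the SEQUENCER
    // coordinator, which re-broadcasts messages in the order it received them.
    static class Sequencer {
        private final List<String> order = new ArrayList<>();

        synchronized List<String> submit(String update) {
            order.add(update);
            return List.copyOf(order); // the order every node will see
        }
    }

    // A replica simply applies updates in the order the sequencer decided;
    // here "applying" is modeled as joining the updates into a string.
    static String apply(List<String> orderedUpdates) {
        return String.join(",", orderedUpdates);
    }

    public static void main(String[] args) {
        Sequencer seq = new Sequencer();
        // B, A and C all submit concurrently; the sequencer serializes them.
        seq.submit("B:add-doc");
        seq.submit("A:del-doc");
        List<String> finalOrder = seq.submit("C:add-doc");
        // Every replica sees the same sequence, so their indexes converge.
        String replica1 = apply(finalOrder);
        String replica2 = apply(finalOrder);
        System.out.println(replica1.equals(replica2)); // prints "true"
    }
}
```

The point is that agreement on *order* substitutes for mutual exclusion: no node ever needs to lock out the others, because no two nodes ever disagree on which change comes first.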
Now I would like to use Infinispan to replace the JMS approach,
especially since in cloud environments it's useful that the different
participants which make up the service are all configured identically:
having a Master Writer node is fine as long as it's auto-elected.
(Not mandatory, just very nice to have.)
The problems to solve:
A) Locking
The current locking solution is implemented by atomically adding a
marker value to the cache. I can't use transactions, as the lock could
span several transactions and must be visible.
The issue with this is that you (ab)use the cache as a messaging /
signalling system! This is not recommended. A better approach would be
to use a separate channel for communication, or even to reuse the
channel Infinispan itself uses. The latter is something I've been
discussing with Brian.
It's Lucene's responsibility to clear this lock properly, and I can
trust it to do so as far as the code allows. But what happens if the
node dies? Other nodes taking over the writer role should be able to
detect the situation and remove the lock.
You get a new view and release all locks held by nodes which are not
part of the new view. This is how it's implemented in JBossCache and
Infinispan today.
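A rough sketch of that cleanup (hypothetical code; the real JBossCache/Infinispan implementations differ): when a new view is installed, every lock whose owner is no longer a member is purged, so a crashed writer can't leave a stale index lock behind.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ViewChangeLockCleanup {

    // lock name -> owning node address. Simulated with a plain map here;
    // in practice this state would live in a replicated cache.
    static final Map<String, String> locks = new ConcurrentHashMap<>();

    // Called when a new cluster view is installed: drop every lock whose
    // owner is not part of the new membership.
    static void viewAccepted(List<String> newMembers) {
        locks.values().removeIf(owner -> !newMembers.contains(owner));
    }

    public static void main(String[] args) {
        locks.put("index.write.lock", "nodeB");
        locks.put("other.lock", "nodeA");
        // nodeB crashes: the new view no longer contains it.
        viewAccepted(List.of("nodeA", "nodeC"));
        System.out.println(locks.containsKey("index.write.lock")); // prints "false"
        System.out.println(locks.containsKey("other.lock"));       // prints "true"
    }
}
```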
Proposals:
- A1) We don't lock, but find a way to elect one and only one writer
node. This would be very cool, but I have no idea how to implement it.
You could use the cluster view, which is the same across all nodes, and
pick the first element in the list. Or you could run an agreement
protocol, which deterministically elects a master.
- A2) We could store the node address as the value in the marker object;
if the address isn't part of the membership, the lock is cleared. (Can I
trust the members view? The index will corrupt if changed by two
nodes.)
Not so good - don't abuse the cache as a communication bus.
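The view-based election mentioned above is trivial precisely because the view is identical on every node. A minimal sketch (assumed node addresses; real code would use the Address list from the installed view):

```java
import java.util.List;

public class MasterElection {

    // Elect the master writer deterministically: every node holds the same
    // view list, so picking the first member needs no extra coordination.
    static String electMaster(List<String> view) {
        if (view.isEmpty()) {
            throw new IllegalStateException("empty view");
        }
        return view.get(0);
    }

    // Each node can check locally whether it is the master.
    static boolean isMaster(String self, List<String> view) {
        return self.equals(electMaster(view));
    }

    public static void main(String[] args) {
        List<String> view = List.of("A", "B", "C"); // same list on every node
        System.out.println(electMaster(view));      // prints "A"
        System.out.println(isMaster("B", view));    // prints "false"
        // If A leaves, the next view is [B, C] and B takes over as master.
        System.out.println(electMaster(List.of("B", "C"))); // prints "B"
    }
}
```

Because every node evaluates the same function on the same input, they all agree on the master without exchanging a single extra message; failover falls out for free when the view changes.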
Proposal:
- B1) Use JMS or JGroups directly (as Hibernate Search is currently
capable of using both); again I face the problem of electing the
IndexWriter node and having the messages sent to the correct node.
In both cases I would like to receive enough information from
Infinispan to know where to send messages from the queue.
I'd use JGroups (of course I'm biased :-)), but JMS really has no notion
of cluster views with which to elect a master. Plus, you could use
SEQUENCER as a means to establish a global order between messages, so
maybe the need to acquire a lock could be dropped altogether.
--
Bela Ban
Lead JGroups / Clustering Team
JBoss