[infinispan-dev] Remote command smarter dispatcher (merge ISPN-2808 and ISPN-2849)
Pedro Ruivo
pedro at infinispan.org
Wed Mar 20 06:59:33 EDT 2013
On 03/20/2013 07:53 AM, Dan Berindei wrote:
>
> On Tue, Mar 19, 2013 at 11:55 PM, Pedro Ruivo <pedro at infinispan.org> wrote:
>
> On 03/19/2013 08:41 PM, Dan Berindei wrote:
> >
> > On Mon, Mar 18, 2013 at 6:09 PM, Pedro Ruivo <pedro at infinispan.org> wrote:
> >
> > Hi all,
> >
> > To solve ISPN-2808 (avoid blocking JGroups threads so that they can
> > deliver the request responses), I've created another thread pool to
> > move the possibly blocking commands out of the JGroups threads (i.e.
> > the commands that may block until some state is achieved).
> >
> > Problem description:
> >
> > With this solution, the new thread pool should be large in order to
> > handle the remote commands without deadlocks. The problem is that all
> > the threads can end up blocked while the command that would unblock
> > them is still sitting in the thread pool queue.
> >
> > Example: a bunch of commands are blocked waiting for a new topology
> > ID, and the command that will increment the topology ID is in the
> > thread pool queue.
> >
> > Solution:
> >
> > Use a smart command dispatcher, i.e. keep the command in the queue
> > until we are sure that it will not wait for other commands. I've
> > already implemented some kind of executor service
> > (ConditionalExecutorService, in the ISPN-2635 and ISPN-2636 branches,
> > Total Order stuff) that only puts the Runnable (more precisely, a new
> > interface called ConditionalRunnable) in the thread pool when it is
> > ready to be processed. Creative guys, it may need a better name :)
> >
> > The ConditionalRunnable has a new method (boolean isReady()) that
> > should return true when the runnable is not going to block.
> >
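> > As a minimal sketch (the real code is in the ISPN-2635/ISPN-2636
> > branches, so the details here may differ):
> >
> >     public interface ConditionalRunnable extends Runnable {
> >         // Checked by the ConditionalExecutorService before the task
> >         // is handed to the thread pool: return true only when run()
> >         // is not expected to block.
> >         boolean isReady();
> >     }
> >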
> > Example of how to apply this to ISPN-2808:
> >
> > Most of the commands wait for a particular topology ID and/or for
> > lock acquisition.
> >
> > Well, the original problem description was about the
> > DistributionInterceptor forwarding the command from the primary owner
> > to the backup owners and waiting for a response from them :)
> > The forwarding done by StateTransferInterceptor is also synchronous
> > and can block.
> >
> > It's true that you can't check how long a remote call will take
> > beforehand...
> >
> > In this way, the isReady() implementation can be something like:
> >
> >     isReady():
> >         return commandTopologyId <= currentTopologyId &&
> >                (for all keys; do if !lock(key).tryLock(); return false; done)
> >
> >
> > Shouldn't you release the locks you managed to lock already if one of
> > the lock acquisitions failed?
>
> You are right, I have to release the locks in pessimistic mode. In
> optimistic mode, the locks are only released by the rollback command.
> >
> > Actually you may want to release the locks even if lock acquisition
> > did succeed... In non-tx mode, the locks are owned by the current
> > thread, so you can't lock a key on one thread and unlock it on
> > another (though you could skip the check completely in non-tx mode).
> > And in transactional mode, you could have a deadlock because 2 txs
> > lock the same keys in a different order.
> Non-tx caches cannot use this optimization. I saw that problem earlier
> today when I started debugging it.
> >
> > Which leads me to a different point: how would you handle deadlocks?
> > With pessimistic mode, if tx1 holds lock k1 and wants to acquire k2,
> > but tx2 holds k2 and wants to acquire k1, will the LockCommands
> > tx1:k2 and tx2:k1 ever be scheduled? In general, can we make the time
> > that a Lock/PrepareCommand spends in the ConditionalExecutorService
> > queue count against lockAcquisitionTimeout?
>
> If I get a DeadLockException, I will send it back immediately and
> release all the locks.
>
>
> Hmmm, the default LockManager doesn't throw DeadlockDetectedExceptions,
> you have to enable deadlockDetection explicitly in the configuration.
> I'm guessing that's for performance reasons...
>
> It's true that you would need a new LockManager.tryLock method anyway,
> so you could implement deadlock detection in LockManagerImpl.tryLock.
> But probably the same performance considerations would apply.
Oh... I thought you were talking about the DeadlockDetectingLockManager.
With the normal LockManager, I am not able to detect whether we have a
deadlock or not.
Actually, I think I can avoid the deadlocks if I use what you suggest
(and here I'm talking about tx caches): when isReady() fails to acquire
a lock, I release the locks already acquired. This way, the only reason
isReady() can fail to acquire a lock is that the lock is held by a
running command (i.e. a command that is in the thread pool), so the
commands waiting in the queue never hold any locks.
Do you think this may work?
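
Something like this, as a minimal sketch (tryLock/unlock/keysToLock and
currentTopologyId() are simplified stand-ins, not the real LockManager
API):

    public boolean isReady() {
        if (commandTopologyId > currentTopologyId()) {
            return false; // wait for the topology update first
        }
        List<Object> acquired = new ArrayList<Object>();
        for (Object key : keysToLock) {
            if (!tryLock(key)) {
                // Back out completely: a command waiting in the queue
                // must never hold locks, otherwise it could deadlock a
                // running command.
                for (Object locked : acquired) {
                    unlock(locked);
                }
                return false;
            }
            acquired.add(key);
        }
        return true; // holding all the locks, safe to submit to the pool
    }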
>
>
> The lockAcquisitionTimeout is a problem that I haven't solved yet.
>
> >
> > With this, I believe we can keep the number of threads low and avoid
> > the thread deadlocks.
> >
> > Now, I have two possible implementations:
> >
> > 1) put a reference to the StateTransferManager and/or LockManager in
> > the commands, and invoke the methods directly (a little dirty)
> >
> > 2) add a new method in the CommandInterceptor like: boolean
> > preProcess<command>(Command, InvocationContext). Each interceptor
> > checks whether the command would block on it (returning false) or not
> > (invoking the next interceptor); see the sketch below. For example,
> > the StateTransferInterceptor immediately returns false if the
> > commandTopologyId is higher than the currentTopologyId, and the
> > *LockingInterceptor returns false if it cannot acquire some lock.
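> >
> > Roughly, for option 2 (just a sketch; the method name and the helper
> > calls are placeholders, not existing API):
> >
> >     // In StateTransferInterceptor: report whether the command would
> >     // block in this interceptor, otherwise ask the next one in the
> >     // chain (the last interceptor simply returns true).
> >     public boolean preProcessPrepare(PrepareCommand cmd,
> >                                      InvocationContext ctx) {
> >         if (getCommandTopologyId(cmd) > getCurrentTopologyId()) {
> >             return false; // would block waiting for a topology update
> >         }
> >         return getNext().preProcessPrepare(cmd, ctx);
> >     }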
> >
> > Any other suggestions? If I was not clear, let me know.
> >
> >
> > TBH I would only check for the topology id before scheduling the
> > commands to the Infinispan thread pool, because checking the locks is
> > very complicated with all the configuration options that we have.
> >
> > This is how I was imagining solving the lock problem: the OOB threads
> > would execute the commands that do not acquire locks (CommitCommand,
> > TxCompletionNotificationCommand) directly and submit the others to
> > the ISPN thread pool; if the command's topology id was higher than
> > the current topology id, the OOB thread would just stick it in a
> > queue. A separate thread would read only the commands with the
> > current topology id from the queue, and process them just like an
> > OOB thread.
> >
> > However... the CommitCommands may very well have a *lower* topology
> > id by the time they are executed, so the OOB/executor thread may well
> > block while waiting for the results of the forwarding RPC. Maybe we
> > could work around it by having StateTransferInterceptor submit just
> > the command forwarding as a task to the thread pool, but again it
> > gets quite complicated. So for the first phase I recommend handling
> > only the topology id problem.
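> >
> > Schematically, the drain thread would be something like this (all the
> > names here are made up for the sketch):
> >
> >     // Take only commands whose topology id matches the currently
> >     // installed one, then process them just like an OOB thread would.
> >     while (running) {
> >         Runnable task = queue.takeWithTopologyId(currentTopologyId());
> >         task.run();
> >     }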
> I have a similar implementation, but in your example the CommitCommand
> will be scheduled to the thread pool only if the command topology ID
> is lower than or equal to the current topology ID (like all
> topology-affected commands).
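>
> Roughly, my scheduling rule looks like this (simplified; the names are
> illustrative, not the actual code in my branch):
>
>     // Submit a topology-affected command only once its topology id is
>     // no newer than the installed one; otherwise keep it in the queue.
>     if (cmd.getTopologyId() <= currentTopologyId()) {
>         remoteExecutor.submit(newTaskFor(cmd));
>     } else {
>         pending.add(cmd); // re-checked when a new topology is installed
>     }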
>
> Currently, I only have 3 tests failing in
> org.infinispan.replication.FlagsReplicationTest (2 assertions and 1
> timeout) that I'm checking now...
>
> I guess it's time to look at your code :)