<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Mar 18, 2013 at 6:09 PM, Pedro Ruivo <span dir="ltr">&lt;<a href="mailto:pedro@infinispan.org" target="_blank">pedro@infinispan.org</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi all,<br>

<br>

To solve ISPN-2808 (avoid blocking JGroups threads in order to allow to<br>

deliver the request responses), I&#39;ve created another thread pool to move<br>

the possible blocking commands (i.e. the commands that may block until<br>

some state is achieved).<br>

<br>

Problem description:<br>

<br>

With this solution, the new thread pool should be large in order to be<br>

able to handle the remote commands without deadlocks. The problem is<br>

that all the threads can be block to process the command that may<br>

unblock other commands.<br>

<br>

Example: a bunch of commands are blocked waiting for a new topology ID<br>

and the command that will increment the topology ID is in the thread<br>

pool queue.<br>

<br>

Solution:<br>

<br>

Use a smart command dispatcher, i.e., keep the command in the queue<br>

until we are sure that it will not wait for other commands. I&#39;ve already<br>

implemented some kind of executor service (ConditionalExecutorService,<br>

in ISPN-2635 and ISPN-2636 branches, Total Order stuff) that only put<br>

the Runnable (more precisely a new interface called ConditionalRunnable)<br>

in the thread pool when it is ready to be processed. Creative guys, it<br>

may need a better name :)<br>

<br>

The ConditionalRunnable has a new method (boolean isReady()) that should<br>

return true when the runnable should not block.<br>

<br>

Example how to apply this to ISPN-2808:<br>

<br>

Most of the commands awaits for a particular topology ID and/or for lock<br>

acquisition. </blockquote><div><br></div><div>Well, the original problem description was about the DistributionInterceptor forwarding the command from the primary owner to the backup owners and waiting for a response from them :)<br>


The forwarding done by StateTransferInterceptor is also synchronous and can block. <br><br>It&#39;s true that you can&#39;t check how long a remote call will take beforehand...<br> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">


In this way, the isReady() implementation can be something<br>

like:<br>

<br>

isReady()<br>

  return commandTopologyId &lt;= currentTopologyId &amp;&amp; (for all keys; do if<br>

!lock(key).tryLock(); return false; done)<br>

<br></blockquote><div><br></div><div>Shouldn&#39;t you release the locks you managed to lock already if one of the lock acquisitions failed?<br><br></div><div>Actually you may want to release the locks even if lock acquisition did succeed... In non-tx mode, the locks are owned by the current thread, so you can&#39;t lock a key on one thread and unlock it on another (though you could skip the check completely in non-tx mode). And in transactional mode, you could have a deadlock because 2 txs lock the same keys in a different order.<br>


<br>Which leads me to a different point, how would you 

handle deadlocks? With pessimistic mode, if tx1 holds lock k1 and wants to 

acquire k2, but tx2 holds k2 and wants to acquire k1, will the LockCommands tx1:k2 and tx2:k1 ever be scheduled? In general, can we make the time that a Lock/PrepareCommand spends in the ConditionalExecutorService queue count against lockAcquisitionTimeout?<br>


<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

With this, I believe we can keep the number of thread low and avoid the<br>

thread deadlocks.<br>

<br>

Now, I have two possible implementations:<br>

<br>

1) put a reference for StateTransferManager and/or LockManager in the<br>

commands, and invoke the methods directly (a little dirty)<br>

<br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

2) added new method in the CommandInterceptor like: boolean<br>

preProcess&lt;command&gt;(Command, InvocationContext). each interceptor will<br>

check if the command will block on it (returning false) or not (invoke<br>

the next interceptor). For example, the StateTransferInterceptor returns<br>

immediately false if the commandToplogyId is higher than the<br>

currentTopologyId and the *LockingIntercerptor will return false if it<br>

cannot acquire some lock.<br>

<br>

Any other suggestions? If I was not clear let me know.<br></blockquote><div><br></div><div>TBH I would only check for the topology id before scheduling the commands to the Infinispan thread pool, because checking the locks is very complicated with all the configuration options that we have.<br>


</div><br></div><div class="gmail_quote">This is how I was imagining solving the lock problem: The OOB threads would execute directly the commands that do not acquire locks (CommitCommand, 

TxCompletionNotificationCommand) directly and submit the others to the ISPN thread pool; if the command&#39;s topology id was higher than the current topology id, the OOB thread would just stick it in a queue. A separate thread would read only the commands with the current topology id from the queue, and process them just like an OOB thread.<br>


<br></div><div class="gmail_quote">However... the CommitCommands may very well have a *lower* topology id by the time they are executed, so the OOB/executor thread may well block while waiting for the results of the forwarding RPC. Maybe we could work around it by having StateTransferInterceptor submit a task with the command forwarding only to the thread pool, but again it gets quite complicated. So for the first phase I recommend handling only the topology id problem.<br>


<br><br></div></div></div>