Galder Zamarreño updated ISPN-4585:
-----------------------------------
Fix Version/s: 7.1.0.Alpha1
(was: 7.0.0.Final)
Prioritize commands in the remote executor
------------------------------------------
Key: ISPN-4585
URL: https://issues.jboss.org/browse/ISPN-4585
Project: Infinispan
Issue Type: Enhancement
Components: Core
Affects Versions: 7.0.0.Alpha5
Reporter: Dan Berindei
Fix For: 7.1.0.Alpha1
The remote executor currently has an unlimited queue of blocked tasks, but the underlying
executor cannot use a queue. With a queue, we wouldn't need to overflow remote
commands to the OOB threads, and the OOB threads would be free to process response
messages.
The problem is that {{ThreadPoolExecutor}} executes tasks in the order they are in the
queue. If a node has a remote executor thread pool of 100 threads and receives a
prepare(tx1, put(k, v1)) command, then 1000 prepare(tx_i, put(k, v_i)) commands, and finally
a commit(tx1) command, the commit(tx1) command will block until all but 99 of the
prepare(tx_i, put(k, v_i)) commands have timed out.
I think we could mitigate this by using a {{PriorityBlockingQueue}} for the underlying
executor, with commands ordered so that state transfer commands < commit/tx completion
notification < prepare/lock. The commit command would still have to wait for one of the
prepare commands currently running to time out, but it wouldn't have to wait for all
of them.
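A minimal sketch of the ordering idea, not the Infinispan implementation. The {{Priority}} enum, {{PrioritizedTask}} and {{runDemo()}} are hypothetical names invented for illustration, and a single worker thread is used only to make the order deterministic. Two details are assumptions added here: tasks go through {{execute()}} because {{submit()}} would wrap them in a {{FutureTask}}, which is not {{Comparable}}, and a sequence number keeps FIFO order within one priority class because {{PriorityBlockingQueue}} does not order equal elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class PrioritizedExecutorSketch {

    // Hypothetical priority classes; a lower ordinal is dequeued first.
    enum Priority { STATE_TRANSFER, TX_COMPLETION, PREPARE_OR_LOCK }

    static final class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        private static final AtomicLong SEQUENCE = new AtomicLong();
        private final Priority priority;
        private final long sequence = SEQUENCE.getAndIncrement();
        private final Runnable body;

        PrioritizedTask(Priority priority, Runnable body) {
            this.priority = priority;
            this.body = body;
        }

        @Override
        public void run() {
            body.run();
        }

        @Override
        public int compareTo(PrioritizedTask other) {
            int byPriority = priority.compareTo(other.priority);
            // Tie-break on arrival order so equal priorities stay FIFO.
            return byPriority != 0 ? byPriority : Long.compare(sequence, other.sequence);
        }
    }

    // Returns the order in which the commands actually ran.
    static List<String> runDemo() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<Runnable>());
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch workerBusy = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // prepare(tx1) occupies the only worker until released.
            executor.execute(new PrioritizedTask(Priority.PREPARE_OR_LOCK, () -> {
                workerBusy.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                order.add("prepare(tx1)");
            }));
            workerBusy.await();
            // These prepares queue up behind the running one.
            for (int i = 2; i <= 4; i++) {
                String name = "prepare(tx" + i + ")";
                executor.execute(new PrioritizedTask(Priority.PREPARE_OR_LOCK, () -> order.add(name)));
            }
            // The commit arrives last but sorts ahead of the queued prepares.
            executor.execute(new PrioritizedTask(Priority.TX_COMPLETION, () -> order.add("commit(tx1)")));
            release.countDown();
            executor.shutdown();
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            release.countDown();
            executor.shutdown();
        }
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

As in the description above, the commit still waits for the prepare that is already running, but it runs before the prepares that were only queued.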
The current code, without a queue, would fill the remote executor and OOB thread pools,
and it would discard the commit message (along with most of the prepare commands). The
time it would take to process the commit successfully would depend on the timing of the
retransmitted messages.
Another possible improvement would be to keep track of the commands currently being
executed, and always keep some threads free for commands with higher priority. But I'm
not sure how easy it would be to do that on top of an injected {{ExecutorService}}.
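One way this might look on top of an injected {{ExecutorService}} is sketched below. Everything here is hypothetical ({{ReservedThreadsSketch}}, {{executeLowPriority()}}, {{executeHighPriority()}}), and it assumes the wrapper is told the pool size, which an injected executor does not necessarily expose. A {{Semaphore}} caps how many low-priority commands may occupy threads at once; the rest are parked in a local queue until a permit is released.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class ReservedThreadsSketch {
    private final ExecutorService delegate;
    private final Semaphore lowPriorityPermits;
    private final Queue<Runnable> parkedLowPriority = new ConcurrentLinkedQueue<>();

    // poolSize is an assumption: the wrapper must be told how big the pool is.
    ReservedThreadsSketch(ExecutorService delegate, int poolSize, int reservedForHighPriority) {
        this.delegate = delegate;
        this.lowPriorityPermits = new Semaphore(poolSize - reservedForHighPriority);
    }

    void executeHighPriority(Runnable task) {
        delegate.execute(task);
    }

    void executeLowPriority(Runnable task) {
        parkedLowPriority.add(task);
        drainParked();
    }

    // Hands parked tasks to the delegate while low-priority permits are available.
    private void drainParked() {
        while (!parkedLowPriority.isEmpty() && lowPriorityPermits.tryAcquire()) {
            Runnable next = parkedLowPriority.poll();
            if (next == null) {
                // Another thread took the task first; give the permit back.
                lowPriorityPermits.release();
                continue;
            }
            delegate.execute(() -> {
                try {
                    next.run();
                } finally {
                    lowPriorityPermits.release();
                    drainParked();
                }
            });
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Two blocking "prepare" commands on a 2-thread pool with 1 thread reserved.
    static boolean commitRunsWhilePreparesBlock() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ReservedThreadsSketch executor = new ReservedThreadsSketch(pool, 2, 1);
        CountDownLatch releasePrepares = new CountDownLatch(1);
        CountDownLatch preparesDone = new CountDownLatch(2);
        CountDownLatch commitDone = new CountDownLatch(1);
        boolean commitRan = false;
        try {
            for (int i = 0; i < 2; i++) {
                executor.executeLowPriority(() -> {
                    awaitQuietly(releasePrepares);
                    preparesDone.countDown();
                });
            }
            executor.executeHighPriority(commitDone::countDown);
            // The commit should finish while both prepares are still pending.
            commitRan = commitDone.await(5, TimeUnit.SECONDS);
            releasePrepares.countDown();
            preparesDone.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            releasePrepares.countDown();
            pool.shutdown();
        }
        return commitRan;
    }

    public static void main(String[] args) {
        System.out.println(commitRunsWhilePreparesBlock());
    }
}
```

Without the reservation, both blocking prepares would occupy both threads of the pool and the commit would have to wait for one of them to finish.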
I believe there is also a problem with
{{BlockingTaskAwareExecutorServiceImpl.checkForReadyTasks()}} after a topology change.
Commands with the new topology id are all unblocked by submitting them to the underlying
executor in FIFO order, on a single thread, so {{CallerRunsPolicy}} is not a valid
rejection policy here.
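The concern with {{CallerRunsPolicy}} can be reproduced with plain JDK classes. The sketch below is not the Infinispan code; {{CallerRunsSketch}} and {{overflowRanOnCaller()}} are hypothetical names. It saturates a one-thread pool that has no queue and shows that the next task runs on the submitting thread. If that thread were the one unblocking commands after a topology change, a command that blocks would stall the whole unblocking loop.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsSketch {

    // Returns true if the overflow task was executed by the submitting thread.
    static boolean overflowRanOnCaller() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        try {
            // Occupy the only pool thread.
            executor.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // The pool is saturated and has no queue, so this task is rejected
            // and CallerRunsPolicy runs it synchronously on the current thread.
            executor.execute(() -> ranOn.set(Thread.currentThread()));
            return ranOn.get() == Thread.currentThread();
        } finally {
            release.countDown();
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(overflowRanOnCaller());
    }
}
```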