[JBoss JIRA] (ISPN-4585) Prioritize commands in the remote executor
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-4585?page=com.atlassian.jira.plugin.... ]
Dan Berindei resolved ISPN-4585.
--------------------------------
Resolution: Won't Do
No longer necessary since lock acquisition doesn't block a remote-executor thread.
> Prioritize commands in the remote executor
> ------------------------------------------
>
> Key: ISPN-4585
> URL: https://issues.jboss.org/browse/ISPN-4585
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 7.0.0.Alpha5
> Reporter: Dan Berindei
> Priority: Major
>
> The remote executor currently has an unlimited queue of blocked tasks, but the underlying executor cannot use a queue. With a queue, we wouldn't need to overflow remote commands to the OOB threads, and the OOB threads would be free to process response messages.
> The problem is that {{ThreadPoolExecutor}} executes tasks in the order they appear in the queue. If a node has a remote executor thread pool of 100 threads and receives a prepare(tx1, put(k, v1)) command, then 1000 prepare(tx_i, put(k, v_i)) commands, and finally a commit(tx1) command, the commit(tx1) command will block until all but 99 of the prepare(tx_i, put(k, v_i)) commands have timed out.
> I think we could mitigate this by using a {{PriorityBlockingQueue}} for the underlying executor, with commands ordered so that state transfer commands < commit/tx completion notification < prepare/lock. The commit command would still have to wait for one of the currently running prepare commands to time out, but it wouldn't have to wait for all of them.
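The proposed ordering could be sketched as follows. Note this is illustrative only: {{Rank}}, {{PrioritizedTask}} and the factory method are made-up names, not Infinispan classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RemoteExecutorSketch {
    // Lower ordinal drains first: state transfer < commit/tx completion < prepare/lock.
    enum Rank { STATE_TRANSFER, COMMIT, PREPARE }

    static class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        final Rank rank;
        final Runnable body;

        PrioritizedTask(Rank rank, Runnable body) {
            this.rank = rank;
            this.body = body;
        }

        @Override public void run() { body.run(); }

        @Override public int compareTo(PrioritizedTask other) {
            return rank.compareTo(other.rank);
        }
    }

    // Tasks must be handed to execute(), not submit(): submit() wraps them in a
    // FutureTask, which is not Comparable, so PriorityBlockingQueue would throw
    // a ClassCastException. Also note that with an unbounded queue the pool
    // never grows past its core size, hence core and max are the same.
    static ThreadPoolExecutor newPrioritizedExecutor(int threads) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new PriorityBlockingQueue<>());
    }

    // Enqueue in arrival order, drain in priority order.
    static List<Rank> demoOrder() throws InterruptedException {
        PriorityBlockingQueue<PrioritizedTask> queue = new PriorityBlockingQueue<>();
        queue.put(new PrioritizedTask(Rank.PREPARE, () -> { }));
        queue.put(new PrioritizedTask(Rank.COMMIT, () -> { }));
        queue.put(new PrioritizedTask(Rank.STATE_TRANSFER, () -> { }));
        List<Rank> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.take().rank);
        }
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        // Prints [STATE_TRANSFER, COMMIT, PREPARE]: the commit jumps ahead of
        // the prepare even though it arrived last.
        System.out.println(demoOrder());
    }
}
```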
> The current code, without a queue, would fill the remote executor and OOB thread pools, and it would discard the commit message (along with most of the prepare commands). The time it would take to process the commit successfully would depend on the timing of the retransmitted messages.
> Another possible improvement would be to keep track of the commands currently being executed, and always keep some threads free for commands with higher priority. But I'm not sure how easy it would be to do that on top of an injected {{ExecutorService}}.
> I believe there is also a problem with {{BlockingTaskAwareExecutorServiceImpl.checkForReadyTasks()}} after a topology change. Commands with the new topology id are all unblocked by submitting them to the underlying executor in FIFO order, on a single thread, so {{CallerRunsPolicy}} is not a valid rejection policy here.
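The {{CallerRunsPolicy}} problem above can be demonstrated in miniature. In this illustrative sketch, a 1-thread pool with a 1-slot queue stands in for a saturated remote executor, and the submitting thread stands in for the single thread draining the ready-task list: a rejected task runs on the submitter and stalls everything queued behind it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsStall {
    static long stallMillis() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1), new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        // Occupy the only worker thread...
        pool.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException ignored) {
            }
        });
        // ...and fill the single queue slot.
        pool.execute(() -> { });
        long start = System.nanoTime();
        // Rejected: CallerRunsPolicy runs this task on the submitting thread,
        // so every later "ready" task has to wait for it as well.
        pool.execute(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException ignored) {
            }
        });
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        release.countDown();
        pool.shutdown();
        return elapsedMs;
    }

    public static void main(String[] args) throws InterruptedException {
        // The submitter is stalled for at least the 200 ms the task takes.
        System.out.println("submitter was stalled for ~" + stallMillis() + " ms");
    }
}
```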
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
[JBoss JIRA] (ISPN-1796) Out-of-memory adding a lot of elements in cache with AsyncStore
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-1796?page=com.atlassian.jira.plugin.... ]
Dan Berindei resolved ISPN-1796.
--------------------------------
Fix Version/s: 9.2.0.Final
Resolution: Done
Fixed with ISPN-2293
> Out-of-memory adding a lot of elements in cache with AsyncStore
> ---------------------------------------------------------------
>
> Key: ISPN-1796
> URL: https://issues.jboss.org/browse/ISPN-1796
> Project: Infinispan
> Issue Type: Task
> Components: Loaders and Stores, Transactions
> Affects Versions: 5.1.0.CR3
> Environment: We plan to use Infinispan as a large distributed write-behind cache of terabytes of data, with a little fraction cached in RAM, so OOM is a real threat for us.
> Reporter: Andrew Pushkin
> Priority: Major
> Labels: executor, threads
> Fix For: 9.2.0.Final
>
> Original Estimate: 3 days
> Remaining Estimate: 3 days
>
> OOM occurs during peaks of putting objects into a cache configured to use AsyncStore.
> See Steps to Reproduce.
>
> Profiling shows that the GC root path goes through the AsyncStore.state field.
> AsyncStore.executor is initialized to a ThreadPoolExecutor with a DiscardPolicy that silently discards tasks when the queue is full; this delays the asynchronous processing of entries in the *state* map, which continues to grow.
> Suggested solution:
> Instead of DiscardPolicy, use customized behavior that estimates the accumulated state size and (probably by comparing it with modificationQueueSize) decides whether to discard the task or block until the state is processed.
> The downside of the suggested solution is the need to take a lock to estimate the state size every time a task is rejected. Possibly this can be alleviated by increasing the working queue size, so that it survives peaks without rejection.
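The suggested behavior could be sketched as a custom {{RejectedExecutionHandler}}. This is a hypothetical sketch, not the fix that was merged: the threshold name maxPendingState and the Map standing in for the AsyncStore.state field are illustrative, not Infinispan API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BlockOrDiscardPolicy implements RejectedExecutionHandler {
    private final int maxPendingState; // illustrative threshold, not an Infinispan setting
    private final Map<?, ?> state;     // stands in for the AsyncStore.state map

    public BlockOrDiscardPolicy(int maxPendingState, Map<?, ?> state) {
        this.maxPendingState = maxPendingState;
        this.state = state;
    }

    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        if (executor.isShutdown()) {
            return;
        }
        if (state.size() < maxPendingState) {
            // Accumulated state is still small: dropping the task is safe,
            // a later flush will pick the entries up (the DiscardPolicy case).
            return;
        }
        try {
            // Accumulated state is large: apply back-pressure by blocking the
            // producer until the executor's queue has room for the task.
            executor.getQueue().put(r);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor ex = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2));
        Map<String, String> state = new HashMap<>();
        // Empty state map: the rejected task is discarded, nothing is queued.
        new BlockOrDiscardPolicy(10, state).rejectedExecution(() -> { }, ex);
        System.out.println("tasks queued after discard path: " + ex.getQueue().size());
        ex.shutdownNow();
    }
}
```

Reading the state size on every rejection is still racy without the lock mentioned above; the sketch only shows the discard-or-block decision itself.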
--