On 07/30/2014 01:59 PM, Dan Berindei wrote:



On Wed, Jul 30, 2014 at 12:22 PM, Radim Vansa <rvansa@redhat.com> wrote:

Investigation:
--------------
When I looked at UNICAST3, I saw a lot of missing messages on the
receive side and unacked messages on the send side. This caused me to
look into the (mainly OOB) thread pools and - voila - they were maxed out!

I learned from Pedro that the Infinispan internal thread pool (with a
default of 32 threads) can be configured, so I increased it to 300 and
increased the OOB pools as well.

This mitigated the problem somewhat, but when I increased the requester
threads to 100, I had the same problem again. Apparently, the Infinispan
internal thread pool uses a rejection policy of "run" (caller runs) and
thus executes the command on the submitting JGroups (OOB) thread when it
is exhausted.
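
For illustration, here is a minimal JDK-only sketch (not Infinispan code)
of how a "caller runs" rejection policy pushes work back onto the
submitting thread once the pool and its queue are full - in our case the
submitter is a JGroups OOB thread:

    import java.util.concurrent.*;

    public class CallerRunsDemo {
        public static void main(String[] args) throws Exception {
            // Tiny pool and queue so saturation happens immediately.
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    2, 2, 0L, TimeUnit.MILLISECONDS,
                    new ArrayBlockingQueue<>(1),
                    new ThreadPoolExecutor.CallerRunsPolicy());

            for (int i = 0; i < 5; i++) {
                final int task = i;
                pool.execute(() -> {
                    // Once the pool is saturated, this prints the submitter's
                    // thread name ("main" here; an OOB thread in JGroups).
                    System.out.println("task " + task + " on "
                            + Thread.currentThread().getName());
                    try { Thread.sleep(100); } catch (InterruptedException e) { }
                });
            }
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }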

We can't use another rejection policy in the remote executor because the message won't be re-delivered by JGroups, and we can't use a queue either.

Can't we just send a "node is busy" response and cancel the operation? (at least in the cases where this is possible - we can't do that safely for a CommitCommand, but usually it should be doable, right?) And what's the problem with queues, besides the risk that they grow until we run out of memory?

No commit commands here, the cache is not transactional :)

Sure, but any change to the OOB -> remote thread pool hand-off would likely affect both non-tx and tx caches.


If the remote thread pool gets full on a backup node, there is no way to safely cancel the operation - other backup owners may have already applied the write. And even with numOwners=2, there are multiple backup owners during state transfer.

I was thinking about delaying the write until the backup responds, but you're right: with 2 or more backups the situation is not that easy.


We do throw an OutdatedTopologyException on the backups and retry the operation when the topology changes, we could do something similar when the remote executor thread pool is full. But 1) we have trouble preserving consistency when we retry, so we'd rather do it only when we really have to, and 2) repeated retries can be costly, as the primary needs to re-acquire the lock.
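
Conceptually, the originator-side handling might look like the sketch
below. This is only an illustration of the idea: OutdatedTopologyException
is a real Infinispan class, but the busy-pool exception and all the helper
methods here are hypothetical.

    // Sketch of retry-on-full-pool, not actual Infinispan code.
    public class RetrySketch {
        static class OutdatedTopologyException extends RuntimeException {}
        static class RemoteExecutorBusyException extends RuntimeException {} // hypothetical

        static final int MAX_BUSY_RETRIES = 3;

        Object invokeWithRetry(Object command) {
            int busyRetriesLeft = MAX_BUSY_RETRIES;
            while (true) {
                try {
                    return invokeRemotely(command);      // hypothetical helper
                } catch (OutdatedTopologyException e) {
                    waitForNewTopology();                // retry on topology change
                } catch (RemoteExecutorBusyException e) {
                    // Hypothetical: the target rejected us because its pool was full.
                    if (busyRetriesLeft-- == 0) throw e;
                    // Each retry forces the primary to re-acquire the key lock,
                    // which is why repeated retries are costly.
                }
            }
        }

        Object invokeRemotely(Object command) { return null; } // stub
        void waitForNewTopology() {}                           // stub
    }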

The problem with queues is that commands are executed in the order they appear in the queue. If a node has a remote executor thread pool of 100 threads and receives a prepare(tx1, put(k, v1)) command, then 1000 prepare(tx_i, put(k, v_i)) commands, and finally a commit(tx1) command, the commit(tx1) command will block until all but 99 of the prepare(tx_i, put(k, v_i)) commands have timed out.
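
Here is a toy JDK-only illustration of that head-of-line blocking - the
prepare/commit labels are just stand-ins, not Infinispan commands. A
2-thread pool stands in for the 100-thread remote executor, and each
sleeping task stands in for a prepare that blocks on the contended key
until it times out:

    import java.util.concurrent.*;

    public class HeadOfLineDemo {
        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(2);
            CountDownLatch commitTx1 = new CountDownLatch(1);

            // prepare(tx1): holds its locks until commit(tx1) arrives.
            pool.execute(() -> await(commitTx1));
            // The other prepares, each stuck on the same contended key.
            for (int i = 2; i <= 20; i++) {
                pool.execute(() -> sleep(100)); // stand-in for "block, then time out"
            }
            // commit(tx1) is queued *behind* all the prepares, so tx1 cannot
            // finish until the earlier prepares have drained from the queue.
            pool.execute(commitTx1::countDown);

            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
            System.out.println("commit(tx1) finally ran");
        }

        static void await(CountDownLatch latch) {
            try { latch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }

        static void sleep(long ms) {
            try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
    }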

Makes sense


I have some thoughts on improving that independently of Pedro's work on locking [1], and I've just written them up as ISPN-4585 [2].

[1] https://issues.jboss.org/browse/ISPN-2849
[2] https://issues.jboss.org/browse/ISPN-4585


ISPN-2849 sounds a lot like the state machine-based interceptor stack - I'm looking forward to that! (although that's music of the far future - ISPN 9? 10?)

Thanks for those answers, Dan. I should have realized most of that myself, but I don't have the capacity to keep all the wisdom about NBST algorithms online in my brain :) I hope that some day I can find a student looking for a diploma thesis who is willing to model at least the basic Infinispan algorithms and formally verify that they are (in)correct ;-)

Radim




-- 
Radim Vansa <rvansa@redhat.com>
JBoss DataGrid QA

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


