<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 07/30/2014 01:59 PM, Dan Berindei
wrote:<br>
</div>
<blockquote
cite="mid:CA+nfvwT0aafotWWDmhv0wRJMM=DB8TtiYtZWfa63VBdv3we3Lg@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Wed, Jul 30, 2014 at 12:22 PM,
Radim Vansa <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:rvansa@redhat.com" target="_blank">rvansa@redhat.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div class=""> <br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Investigation:<br>
------------<br>
When I looked at UNICAST3, I saw a lot of
missing messages on the<br>
receive side and unacked messages on the
send side. This caused me to<br>
look into the (mainly OOB) thread pools and
- voila - maxed out !<br>
<br>
I learned from Pedro that the Infinispan
internal thread pool (with a<br>
default of 32 threads) can be configured, so
I increased it to 300 and<br>
increased the OOB pools as well.<br>
<br>
This mitigated the problem somewhat, but
when I increased the requester<br>
threads to 100, I had the same problem
again. Apparently, the Infinispan<br>
internal thread pool uses a rejection policy
of "run" and thus uses the<br>
JGroups (OOB) thread when exhausted.<br>
</blockquote>
<div><br>
</div>
<div>We can't use another rejection policy in
the remote executor because the message
won't be re-delivered by JGroups, and we
can't use a queue either.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
Can't we just send a "Node is busy" response and cancel
the operation? (At least in cases where this is possible:
we can't do that safely for CommitCommand, but usually
it should be doable, right?) And what's the problem with
queues, besides that they can grow until we run out of memory?</div>
</blockquote>
<div><br>
</div>
<div>No commit commands here, the cache is not transactional
:)</div>
</div>
</div>
</div>
</blockquote>
<br>
Sure, but any change to OOB -> remote thread pool would likely
affect both non-tx and tx.<br>
<br>
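For reference, a minimal sketch of what a "run" rejection policy does. I'm
assuming it behaves like ThreadPoolExecutor.CallerRunsPolicy from
java.util.concurrent; the class and method names below are made up for the
illustration and are not Infinispan or JGroups code. The single-worker pool
stands in for an exhausted remote executor, and the submitting thread stands
in for the OOB thread:<br>
<br>
<pre>
```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsDemo {

    // Returns the name of the thread that ends up executing the overflow task.
    static String threadRunningOverflowTask() throws InterruptedException {
        // One worker and no queue: a stand-in for an exhausted remote executor.
        var pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS,
                new SynchronousQueue<>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        var workerBusy = new CountDownLatch(1);
        var release = new CountDownLatch(1);
        var overflowThread = new AtomicReference<>("");
        try {
            // Occupy the only worker until we release it.
            pool.execute(() -> {
                workerBusy.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workerBusy.await();
            // The pool is saturated, so this task is rejected and the
            // caller-runs policy executes it on the submitting thread.
            pool.execute(() -> overflowThread.set(Thread.currentThread().getName()));
            return overflowThread.get();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Prints the name of the submitting thread, not of a pool worker.
        System.out.println(threadRunningOverflowTask());
    }
}
```
</pre>
If the submitter is an OOB thread, that thread is tied up for the whole
command, which would match the maxed-out OOB pool described above.<br>
<br>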
<blockquote
cite="mid:CA+nfvwT0aafotWWDmhv0wRJMM=DB8TtiYtZWfa63VBdv3we3Lg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div>If the remote thread pool gets full on a backup node,
there is no way to safely cancel the operation - other
backup owners may have already applied the write. And even
with numOwners=2, there are multiple backup owners during
state transfer.</div>
</div>
</div>
</div>
</blockquote>
<br>
I was thinking about delaying the write until the backup responds, but
you're right, with two or more backups the situation is not that
easy.<br>
<br>
<blockquote
cite="mid:CA+nfvwT0aafotWWDmhv0wRJMM=DB8TtiYtZWfa63VBdv3we3Lg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div>We do throw an OutdatedTopologyException on the backups
and retry the operation when the topology changes, so we
could do something similar when the remote executor thread
pool is full. But 1) we have trouble preserving
consistency when we retry, so we'd rather do it only when
we really have to, and 2) repeated retries can be costly,
as the primary needs to re-acquire the lock.</div>
<div><br>
</div>
<div>The problem with queues is that commands are executed
in the order they are in the queue. If a node has a remote
executor thread pool of 100 threads and receives a
prepare(tx1, put(k, v1)) command, then 1000 prepare(tx_i,
put(k, v_i)) commands, and finally a commit(tx1) command,
the commit(tx1) command will block until all but 99 of the
prepare(tx_i, put(k, v_i)) commands have timed out.</div>
</div>
</div>
</div>
</blockquote>
<br>
Makes sense<br>
<br>
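To convince myself, here is a scaled-down sketch of the queueing scenario.
Everything in it is an assumption made for illustration: a plain
fixed-size java.util.concurrent pool with an unbounded FIFO queue plays the
remote executor, a Semaphore plays the lock on key k, and the class and
method names are invented (none of this is Infinispan code). It reports how
many prepare commands had already timed out when commit(tx1) finally got a
thread:<br>
<br>
<pre>
```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class QueuedCommitDemo {

    // Returns how many prepare commands had timed out when commit(tx1) started.
    static int timeoutsBeforeCommit(int threads, int blockedPrepares, long lockTimeoutMillis)
            throws Exception {
        var pool = Executors.newFixedThreadPool(threads); // unbounded FIFO queue
        var keyLock = new Semaphore(1);                    // stand-in for the lock on key k
        var timedOut = new AtomicInteger();
        try {
            // prepare(tx1, put(k, v1)): tx1 takes the lock and keeps it until commit.
            pool.submit(() -> { keyLock.acquireUninterruptibly(); }).get();

            // prepare(tx_i, put(k, v_i)): each waits for the lock, then gives up.
            for (int i = 0; i < blockedPrepares; i++) {
                pool.execute(() -> {
                    try {
                        if (keyLock.tryAcquire(lockTimeoutMillis, TimeUnit.MILLISECONDS)) {
                            keyLock.release();
                        } else {
                            timedOut.incrementAndGet();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }

            // commit(tx1): queued last, so it only runs once every prepare
            // ahead of it has been taken off the queue.
            return pool.submit(() -> {
                int seen = timedOut.get();
                keyLock.release();
                return seen;
            }).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // 2 threads, 5 blocked prepares, 50 ms lock timeout.
        System.out.println(timeoutsBeforeCommit(2, 5, 50));
    }
}
```
</pre>
With 2 threads and 5 blocked prepares, at least 4 of them (all but
threads - 1) must have timed out before the commit starts, the same shape as
the "all but 99" figure for 100 threads.<br>
<br>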
<blockquote
cite="mid:CA+nfvwT0aafotWWDmhv0wRJMM=DB8TtiYtZWfa63VBdv3we3Lg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div>I have some thoughts on improving that independently of
Pedro's work on locking [1], and I've just written that up
as ISPN-4585 [2]</div>
<div><br>
</div>
<div>[1] <a moz-do-not-send="true"
href="https://issues.jboss.org/browse/ISPN-2849">https://issues.jboss.org/browse/ISPN-2849</a></div>
<div>[2] <a moz-do-not-send="true"
href="https://issues.jboss.org/browse/ISPN-4585">https://issues.jboss.org/browse/ISPN-4585</a></div>
<div><br>
</div>
<div> </div>
</div>
</div>
</div>
</blockquote>
<br>
ISPN-2849 sounds a lot like the state machine-based interceptor
stack; I am looking forward to that! (Although that's still far in the
future: ISPN 9, 10?)<br>
<br>
Thanks for those answers, Dan. I should have realized most of that myself,
but I don't have the capacity to hold all the wisdom about NBST
algorithms online in my brain :) I hope that some day I can catch a
student looking for a diploma thesis who is willing to model at least the
basic Infinispan algorithms and formally verify that they're
(in)correct ;-).<br>
<br>
Radim<br>
<br>
<blockquote
cite="mid:CA+nfvwT0aafotWWDmhv0wRJMM=DB8TtiYtZWfa63VBdv3we3Lg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span class=""><font
color="#888888"><br>
<br>
Radim<br>
<br>
<pre cols="72">--
Radim Vansa <a moz-do-not-send="true" href="mailto:rvansa@redhat.com" target="_blank">&lt;rvansa@redhat.com&gt;</a>
JBoss DataGrid QA
</pre>
</font></span></div>
<br>
_______________________________________________<br>
infinispan-dev mailing list<br>
<a moz-do-not-send="true"
href="mailto:infinispan-dev@lists.jboss.org">infinispan-dev@lists.jboss.org</a><br>
<a moz-do-not-send="true"
href="https://lists.jboss.org/mailman/listinfo/infinispan-dev"
target="_blank">https://lists.jboss.org/mailman/listinfo/infinispan-dev</a><br>
</blockquote>
</div>
<br>
</div>
</div>
<br>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
Radim Vansa <a class="moz-txt-link-rfc2396E" href="mailto:rvansa@redhat.com">&lt;rvansa@redhat.com&gt;</a>
JBoss DataGrid QA
</pre>
</body>
</html>