[JBoss JIRA] (ISPN-11373) XSite backup commands should be sent from a blocking thread
by Pedro Ruivo (Jira)
[ https://issues.redhat.com/browse/ISPN-11373?page=com.atlassian.jira.plugi... ]
Pedro Ruivo updated ISPN-11373:
-------------------------------
Fix Version/s: (was: 11.0.0.Alpha2)
(was: 10.1.4.Final)
> XSite backup commands should be sent from a blocking thread
> -----------------------------------------------------------
>
> Key: ISPN-11373
> URL: https://issues.redhat.com/browse/ISPN-11373
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 9.4.18.Final, 10.1.2.Final, 11.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 9.4.19.Final
>
>
> XSite backup commands usually need more processing on the receiving site than local cluster commands do on the receiving node, which means there's a much higher chance that {{channel.send(message)}} will block.
> {{UFC}}, {{UFC_NB}}, {{MFC}} and {{MFC_NB}} all block when there are not enough credits.
> The _NB variants have an additional queue as a safety net, but that only delays the blocking: it's the same as increasing {{max_credits}} by {{max_queue_size}}, except with less work for {{UNICAST3}}/{{NAKACK2}}.
> {{TCP}} and {{UDP}} also block if their send buffer is full. Using a bundler like {{transfer-queue}} instead of the default {{no-bundler}} will only delay the blocking until the bundler's queue is also full.
> The biggest problem is when xsite backup commands are sent from a JGroups thread and {{channel.send(message)}} blocks that thread. If the JGroups thread pool becomes full, it cannot process more messages, not even responses from the remote site.
> JGroups creates temporary threads to process internal messages when its thread pool is full, but not even that can help when the other nodes' thread pools are also full:
> {noformat}
> "jgroups-temp-thread-5728,_ma267mlvjdg015:dal_mcom_perf" #11443 prio=5 os_prio=0 tid=0x000000000906f800 nid=0x26cb waiting on condition [0x00007fb0b7b0a000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000005f3bce048> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
> at org.jgroups.protocols.TransferQueueBundler.send(TransferQueueBundler.java:97)
> at org.jgroups.protocols.TP.send(TP.java:1441)
> at org.jgroups.protocols.TP._send(TP.java:1195)
> at org.jgroups.protocols.TP.down(TP.java:1111)
> ...
> at org.jgroups.protocols.FlowControl.sendCredit(FlowControl.java:480)
> at org.jgroups.protocols.FlowControl.handleCreditRequest(FlowControl.java:469)
> at org.jgroups.protocols.FlowControl.handleUpEvent(FlowControl.java:379)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:350)
> {noformat}
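A minimal sketch of the idea in the summary, assuming a dedicated executor for blocking work. The names BlockingSendOffload, BlockingSender and sendBackup are hypothetical and are not Infinispan or JGroups APIs; BlockingSender only stands in for a call such as channel.send(message) that may block.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BlockingSendOffload {
    /** Hypothetical stand-in for a transport whose send may block (not the JGroups API). */
    public interface BlockingSender {
        void send(String message) throws Exception;
    }

    private final ExecutorService blockingExecutor;
    private final BlockingSender sender;

    public BlockingSendOffload(ExecutorService blockingExecutor, BlockingSender sender) {
        this.blockingExecutor = blockingExecutor;
        this.sender = sender;
    }

    /**
     * Returns immediately. The potentially blocking send runs on the blocking
     * executor, so the calling (message-processing) thread stays free.
     */
    public CompletableFuture<Void> sendBackup(String message) {
        return CompletableFuture.runAsync(() -> {
            try {
                sender.send(message);
            } catch (Exception e) {
                throw new CompletionException(e);
            }
        }, blockingExecutor);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService blocking = Executors.newFixedThreadPool(2);
        try {
            // The sleep simulates a send that blocks on credits or a full buffer.
            BlockingSendOffload offload =
                new BlockingSendOffload(blocking, msg -> Thread.sleep(200));
            CompletableFuture<Void> future = offload.sendBackup("backup-1");
            System.out.println("send finished when caller returned: " + future.isDone());
            future.get(5, TimeUnit.SECONDS);
            System.out.println("send finished after waiting: " + future.isDone());
        } finally {
            blocking.shutdown();
        }
    }
}
```

The caller gets a future back instead of waiting, so a blocked send ties up a thread from the blocking pool rather than one of the threads that must keep processing incoming messages and responses.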
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
4 years, 10 months
[JBoss JIRA] (ISPN-11367) getCache(name) should never use the default cache's configuration
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11367?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-11367:
--------------------------------
Status: Open (was: New)
> getCache(name) should never use the default cache's configuration
> -----------------------------------------------------------------
>
> Key: ISPN-11367
> URL: https://issues.redhat.com/browse/ISPN-11367
> Project: Infinispan
> Issue Type: Enhancement
> Components: Configuration, Core
> Affects Versions: 10.1.2.Final, 11.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 11.0.0.Alpha2
>
>
> {{DefaultCacheManager.getCache(name)}} will try to create the cache if it doesn't exist, and if a matching configuration doesn't exist either, it uses the default cache's configuration.
> Since 9.0, named cache configurations no longer inherit from the default cache configuration, so falling back to the default cache's configuration is surprising.
> {{DefaultCacheManager.getCache(name)}} should only create the cache if a matching configuration exists, otherwise it should throw an exception.
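A rough sketch of the proposed behaviour, using a hypothetical StrictCacheRegistry rather than the real DefaultCacheManager. Configurations are plain strings and caches are plain maps purely for illustration, and IllegalArgumentException is an assumed placeholder for whatever exception the real implementation would throw.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StrictCacheRegistry {
    // Hypothetical model: a configuration is just a string, a cache is just a map.
    private final Map<String, String> configurations = new ConcurrentHashMap<>();
    private final Map<String, Map<String, Object>> caches = new ConcurrentHashMap<>();

    public void defineConfiguration(String name, String configuration) {
        configurations.put(name, configuration);
    }

    /**
     * Creates the cache only if a configuration with the same name exists.
     * There is no fallback to a default configuration.
     */
    public Map<String, Object> getCache(String name) {
        return caches.computeIfAbsent(name, n -> {
            if (!configurations.containsKey(n)) {
                throw new IllegalArgumentException(
                    "No configuration defined for cache '" + n + "'");
            }
            return new ConcurrentHashMap<>();
        });
    }

    public static void main(String[] args) {
        StrictCacheRegistry registry = new StrictCacheRegistry();
        registry.defineConfiguration("users", "dist-sync");
        System.out.println("users cache created: " + (registry.getCache("users") != null));
        try {
            registry.getCache("orders"); // no matching configuration was defined
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```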
[JBoss JIRA] (ISPN-11407) XSite backup commands should be sent from a blocking thread
by Pedro Ruivo (Jira)
Pedro Ruivo created ISPN-11407:
----------------------------------
Summary: XSite backup commands should be sent from a blocking thread
Key: ISPN-11407
URL: https://issues.redhat.com/browse/ISPN-11407
Project: Infinispan
Issue Type: Enhancement
Components: Core
Affects Versions: 9.4.18.Final, 10.1.2.Final, 11.0.0.Alpha1
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 9.4.19.Final
XSite backup commands usually need more processing on the receiving site than local cluster commands do on the receiving node, which means there's a much higher chance that {{channel.send(message)}} will block.
{{UFC}}, {{UFC_NB}}, {{MFC}} and {{MFC_NB}} all block when there are not enough credits.
The _NB variants have an additional queue as a safety net, but that only delays the blocking: it's the same as increasing {{max_credits}} by {{max_queue_size}}, except with less work for {{UNICAST3}}/{{NAKACK2}}.
{{TCP}} and {{UDP}} also block if their send buffer is full. Using a bundler like {{transfer-queue}} instead of the default {{no-bundler}} will only delay the blocking until the bundler's queue is also full.
The biggest problem is when xsite backup commands are sent from a JGroups thread and {{channel.send(message)}} blocks that thread. If the JGroups thread pool becomes full, it cannot process more messages, not even responses from the remote site.
JGroups creates temporary threads to process internal messages when its thread pool is full, but not even that can help when the other nodes' thread pools are also full:
{noformat}
"jgroups-temp-thread-5728,_ma267mlvjdg015:dal_mcom_perf" #11443 prio=5 os_prio=0 tid=0x000000000906f800 nid=0x26cb waiting on condition [0x00007fb0b7b0a000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000005f3bce048> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
at org.jgroups.protocols.TransferQueueBundler.send(TransferQueueBundler.java:97)
at org.jgroups.protocols.TP.send(TP.java:1441)
at org.jgroups.protocols.TP._send(TP.java:1195)
at org.jgroups.protocols.TP.down(TP.java:1111)
...
at org.jgroups.protocols.FlowControl.sendCredit(FlowControl.java:480)
at org.jgroups.protocols.FlowControl.handleCreditRequest(FlowControl.java:469)
at org.jgroups.protocols.FlowControl.handleUpEvent(FlowControl.java:379)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:350)
{noformat}
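The stack trace above is parked in ArrayBlockingQueue.put, which is the "queue only delays the blocking" point in practice. A small illustration, assuming no consumer is draining the queue: BoundedQueueDemo and acceptedBeforeFull are hypothetical names, and a timed offer stands in for put so that the demo terminates instead of parking forever.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo {
    /**
     * Counts how many messages a bounded queue accepts before the producer
     * would have to wait. Nothing drains the queue in this demo.
     */
    public static int acceptedBeforeFull(int capacity, int attempts)
            throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // put() would park here indefinitely once the queue is full;
            // the timed offer gives up after 10 ms instead.
            if (!queue.offer("msg-" + i, 10, TimeUnit.MILLISECONDS)) {
                break;
            }
            accepted++;
        }
        return accepted;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(acceptedBeforeFull(4, 10)); // prints 4
        System.out.println(acceptedBeforeFull(8, 10)); // prints 8
    }
}
```

Doubling the capacity doubles the number of sends that succeed without waiting, but the producer still stalls once the queue fills, which matches the description of the bundler queue and the _NB variants.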