[JBoss JIRA] (ISPN-11428) Docs: JGroups Config Updates
by Pedro Ruivo (Jira)
[ https://issues.redhat.com/browse/ISPN-11428?page=com.atlassian.jira.plugi... ]
Pedro Ruivo updated ISPN-11428:
-------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 11.0.0.Dev04
Resolution: Done
> Docs: JGroups Config Updates
> ----------------------------
>
> Key: ISPN-11428
> URL: https://issues.redhat.com/browse/ISPN-11428
> Project: Infinispan
> Issue Type: Enhancement
> Components: Documentation
> Reporter: Donald Naro
> Assignee: Donald Naro
> Priority: Major
> Fix For: 11.0.0.Dev04
>
>
> From Wolf:
> ISPN docs
> https://infinispan.org/docs/dev/titles/configuring/configuring.html#clust...
> shows information about improving performance and mentions jgroups-default.xml ....
> but there is no hint where to find it.
> The name is confusing and should be clarified:
> there is no jgroups-defaults for the server; it is default-configs/default-jgroups-<stack>.xml.
> There is also no infinispan-jgroups.xml in core.jar,
> so the (i) box is just confusing.
> 5. Setting up Cluster Transport
> 5.1.2 Default stack
> Confusing, as the default-jgroups-[tcp|udp].xml files show something more specific with more attributes.
> What is the purpose of this simplified stack? If the default XMLs are used, I would not include the example here; just point to the ispn-core.jar/default-configs files, which are listed in 5.1.1.
> The (i) box is confusing as there is no hint where to find it.
> Also, ispn-core.jar does not include a jgroups-default.xml or infinispan-jgroups.xml. The embedded configuration seems to point to the same defaults as 5.1.1.
> If there are no real properties that are the defaults, I would remove the box.
> 5.2 inline stack
> The hint box “Use inheritance …” should point to 5.3, and it should be mentioned here that this is a fully created stack with the complete configuration.
> 5.3 Adjusting
> For step 2, you should mention that this can be the default “udp” or “tcp” stack, not only a self-created one.
> Note that <VERIFY_SUSPECT> is replaced, which means all of its attributes are reset to the defaults.
> The example might include RELAY, as it can be appended at the end without any position attribute. This is most useful for cross-site (x-site) replication (an example is infinispan-xsite.xml).
> Explain that stack.position is the name of any protocol.
> Not sure what happens if it is not found!
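The inheritance points above can be illustrated with a short stack definition. This is a hedged sketch based on the Infinispan 10/11 configuration schema; the stack name "xsite-udp", the cluster name "demo", and all attribute values are made up for illustration, and the stack.combine/stack.position semantics should be verified against the current docs.

```xml
<infinispan>
   <jgroups>
      <!-- Extends the shipped "udp" stack; "tcp" works the same way. -->
      <stack name="xsite-udp" extends="udp">
         <!-- COMBINE (the default) only overrides the listed attributes. -->
         <MERGE3 max_interval="30000" stack.combine="COMBINE"/>
         <!-- REPLACE swaps the whole protocol: every attribute not listed
              here falls back to the JGroups default, not the parent value. -->
         <VERIFY_SUSPECT timeout="2000" stack.combine="REPLACE"/>
         <!-- stack.position names an existing protocol in the parent stack;
              INSERT_AFTER places the new protocol right after it. -->
         <FD_SOCK stack.combine="INSERT_AFTER" stack.position="FD_ALL"/>
         <!-- A protocol absent from the parent stack, such as RELAY2, is
              appended at the end without needing a position (cross-site use). -->
         <relay.RELAY2 site="site-a" max_site_masters="1"/>
      </stack>
   </jgroups>
   <cache-container>
      <transport cluster="demo" stack="xsite-udp"/>
   </cache-container>
</infinispan>
```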
> 5.5.1 System properties for the default JGroups stacks
> UDP
> jgroups.udp.* → there is no udp!
> Missing:
> Address → jgroups.bind.address, jgroups.udp.address; defaults to SITE_LOCAL
> Port → jgroups.bind.port, jgroups.udp.port; defaults to “0” (assigned automatically)
> TCP
> Address can use jgroups.bind.address, and the default is SITE_LOCAL, not 127.0.0.1.
> Port can use jgroups.bind.port.
> jgroups.udp.* → there is no udp!
> EC2, Kubernetes
> Same here for address and port.
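As a quick sketch of how the bind properties named in these notes would be used (the property names jgroups.bind.address and jgroups.bind.port are taken from the reviewer's comments and should be verified against the shipped default-jgroups-<stack>.xml files), they can be passed as -D flags or set programmatically before the transport starts:

```java
public class JGroupsBindProperties {
   public static void main(String[] args) {
      // Equivalent to:
      //   java -Djgroups.bind.address=192.0.2.10 -Djgroups.bind.port=7800 ...
      // Must be set before the JGroups channel is created.
      System.setProperty("jgroups.bind.address", "192.0.2.10");
      System.setProperty("jgroups.bind.port", "7800");

      System.out.println(System.getProperty("jgroups.bind.address")); // 192.0.2.10
   }
}
```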
> 6 Discovery
> Should we list FILE_PING as well, since it is the simple basis for discovery and has already been shared with customers?
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
[JBoss JIRA] (ISPN-11519) Cache should not start if its cluster listener replication fails
by Dan Berindei (Jira)
Dan Berindei created ISPN-11519:
-----------------------------------
Summary: Cache should not start if its cluster listener replication fails
Key: ISPN-11519
URL: https://issues.redhat.com/browse/ISPN-11519
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 11.0.0.Dev03, 10.1.5.Final
Reporter: Dan Berindei
Assignee: Will Burns
Fix For: 11.0.0.Dev04
{{StateConsumerImpl.fetchClusterListeners}} catches any exceptions during the fetch and local installation of cluster listeners from other nodes, and only logs a warning message:
{noformat}
18:04:14,069 WARN (jgroups-5,Test-NodeD:[]) [StateConsumerImpl] ISPN000284: Problem encountered while installing cluster listener
{noformat}
If a cache starts without installing all the cluster listeners locally, the listeners will miss events for keys that end up with the joiner as the primary owner, which would be pretty hard to debug. We should instead fail fast and prevent the cache from starting if the cluster listeners cannot be fetched and installed locally.
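A minimal, self-contained sketch of the proposed change in plain Java (toy names, not the actual StateConsumerImpl code): instead of logging ISPN000284 and continuing, the installation failure propagates and aborts the start.

```java
import java.util.List;

// Toy model of installing cluster listeners during cache start.
// Names are illustrative; the real logic lives in StateConsumerImpl.fetchClusterListeners.
public class ListenerInstallDemo {

   /** Current behaviour: log a warning and keep starting (events can be lost). */
   static boolean startLenient(List<Runnable> installers) {
      for (Runnable installer : installers) {
         try {
            installer.run();
         } catch (Exception e) {
            System.err.println("ISPN000284: Problem encountered while installing cluster listener: " + e);
         }
      }
      return true; // cache starts regardless
   }

   /** Proposed behaviour: fail fast so the joiner never serves requests
       with a missing cluster listener. */
   static boolean startFailFast(List<Runnable> installers) {
      for (Runnable installer : installers) {
         installer.run(); // any failure propagates and aborts the start
      }
      return true;
   }

   public static void main(String[] args) {
      List<Runnable> installers = List.of(
            () -> {},
            () -> { throw new RuntimeException("remote node unreachable"); });
      System.out.println(startLenient(installers));   // true: starts with only a warning
      try {
         startFailFast(installers);
      } catch (RuntimeException e) {
         System.out.println("start aborted: " + e.getMessage());
      }
   }
}
```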
[JBoss JIRA] (ISPN-11519) Cache should not start if its cluster listener replication fails
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11519?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-11519:
--------------------------------
Status: Open (was: New)
> Cache should not start if its cluster listener replication fails
> ----------------------------------------------------------------
>
> Key: ISPN-11519
> URL: https://issues.redhat.com/browse/ISPN-11519
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 10.1.5.Final, 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Will Burns
> Priority: Major
> Fix For: 11.0.0.Dev04
>
>
> {{StateConsumerImpl.fetchClusterListeners}} catches any exceptions during the fetch and local installation of cluster listeners from other nodes, and only logs a warning message:
> {noformat}
> 18:04:14,069 WARN (jgroups-5,Test-NodeD:[]) [StateConsumerImpl] ISPN000284: Problem encountered while installing cluster listener
> {noformat}
> If a cache starts without installing all the cluster listeners locally, the listeners will miss events for keys that end up with the joiner as the primary owner, which would be pretty hard to debug. We should instead fail fast and prevent the cache from starting if the cluster listeners cannot be fetched and installed locally.
[JBoss JIRA] (ISPN-11518) IndexingDuringStateTransferTest random failures
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11518?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-11518:
--------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/8093
> IndexingDuringStateTransferTest random failures
> -----------------------------------------------
>
> Key: ISPN-11518
> URL: https://issues.redhat.com/browse/ISPN-11518
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying, Test Suite
> Affects Versions: 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 11.0.0.Dev04
>
>
> {{IndexingDuringStateTransferTest.test}} blocks {{StateResponseCommand}} so that state transfer doesn't finish during the test. Since state transfer is now non-blocking, blocking the thread that's trying to send the {{StateResponseCommand}} also prevents it from sending the response to the {{StateTransferStartCommand}}.
> Because the {{StateTransferStartCommand}} is blocked, the requestor can't complete the transaction data future in {{StateTransferLockImpl}}, and the test can't proceed to unblock the {{StateResumeCommand}}.
> The {{RpcManager}} wrapper is supposed to send the command anyway, but it doesn't catch {{AssertionError}}, so the command isn't sent and the rebalance hangs.
> Eventually the put times out, and the next test also fails because it can't add a new manager:
> {noformat}
> java.lang.RuntimeException: java.util.concurrent.TimeoutException
> at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.test(IndexingDuringStateTransferTest.java:189)
> at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.testPut(IndexingDuringStateTransferTest.java:78)
> at org.infinispan.commons.test.TestNGLongTestsHook.run(TestNGLongTestsHook.java:24)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.util.concurrent.TimeoutException
> {noformat}
> {noformat}
> org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.commons.CacheException: Initial state transfer timed out for cache defaultcache on IndexingDuringStateTransferTest-NodeF
> at org.infinispan.manager.DefaultCacheManager.internalStart(DefaultCacheManager.java:746)
> at org.infinispan.manager.DefaultCacheManager.start(DefaultCacheManager.java:712)
> at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:268)
> at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:232)
> at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:225)
> at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.test(IndexingDuringStateTransferTest.java:144)
> at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.testPutIgnoreReturnValue(IndexingDuringStateTransferTest.java:82)
> {noformat}
[JBoss JIRA] (ISPN-11518) IndexingDuringStateTransferTest random failures
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11518?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-11518:
--------------------------------
Status: Open (was: New)
> IndexingDuringStateTransferTest random failures
> -----------------------------------------------
>
> Key: ISPN-11518
> URL: https://issues.redhat.com/browse/ISPN-11518
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying, Test Suite
> Affects Versions: 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Labels: testsuite_stability
> Fix For: 11.0.0.Dev04
>
>
> {{IndexingDuringStateTransferTest.test}} blocks {{StateResponseCommand}} so that state transfer doesn't finish during the test. Since state transfer is now non-blocking, blocking the thread that's trying to send the {{StateResponseCommand}} also prevents it from sending the response to the {{StateTransferStartCommand}}.
> Because the {{StateTransferStartCommand}} is blocked, the requestor can't complete the transaction data future in {{StateTransferLockImpl}}, and the test can't proceed to unblock the {{StateResumeCommand}}.
> The {{RpcManager}} wrapper is supposed to send the command anyway, but it doesn't catch {{AssertionError}}, so the command isn't sent and the rebalance hangs.
> Eventually the put times out, and the next test also fails because it can't add a new manager:
> {noformat}
> java.lang.RuntimeException: java.util.concurrent.TimeoutException
> at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.test(IndexingDuringStateTransferTest.java:189)
> at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.testPut(IndexingDuringStateTransferTest.java:78)
> at org.infinispan.commons.test.TestNGLongTestsHook.run(TestNGLongTestsHook.java:24)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.util.concurrent.TimeoutException
> {noformat}
> {noformat}
> org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.commons.CacheException: Initial state transfer timed out for cache defaultcache on IndexingDuringStateTransferTest-NodeF
> at org.infinispan.manager.DefaultCacheManager.internalStart(DefaultCacheManager.java:746)
> at org.infinispan.manager.DefaultCacheManager.start(DefaultCacheManager.java:712)
> at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:268)
> at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:232)
> at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:225)
> at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.test(IndexingDuringStateTransferTest.java:144)
> at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.testPutIgnoreReturnValue(IndexingDuringStateTransferTest.java:82)
> {noformat}
[JBoss JIRA] (ISPN-11518) IndexingDuringStateTransferTest random failures
by Dan Berindei (Jira)
Dan Berindei created ISPN-11518:
-----------------------------------
Summary: IndexingDuringStateTransferTest random failures
Key: ISPN-11518
URL: https://issues.redhat.com/browse/ISPN-11518
Project: Infinispan
Issue Type: Bug
Components: Embedded Querying, Test Suite
Affects Versions: 11.0.0.Dev03
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 11.0.0.Dev04
{{IndexingDuringStateTransferTest.test}} blocks {{StateResponseCommand}} so that state transfer doesn't finish during the test. Since state transfer is now non-blocking, blocking the thread that's trying to send the {{StateResponseCommand}} also prevents it from sending the response to the {{StateTransferStartCommand}}.
Because the {{StateTransferStartCommand}} is blocked, the requestor can't complete the transaction data future in {{StateTransferLockImpl}}, and the test can't proceed to unblock the {{StateResumeCommand}}.
The {{RpcManager}} wrapper is supposed to send the command anyway, but it doesn't catch {{AssertionError}}, so the command isn't sent and the rebalance hangs.
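The AssertionError pitfall can be shown with a few lines of plain Java (a toy stand-in for the test's RpcManager wrapper, not Infinispan code): catch (Exception e) does not catch AssertionError, because AssertionError extends Error rather than Exception.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a wrapper that runs a test check and is supposed to
// forward the command even when the check fails.
public class WrapperDemo {

   /** Buggy version: AssertionError escapes, so the command is never sent. */
   static void forwardCatchingException(List<String> sent, String command, Runnable check) {
      try {
         check.run();
      } catch (Exception e) {
         // intended to swallow check failures... but AssertionError is an Error
      }
      sent.add(command);
   }

   /** Fixed version: catching Throwable also swallows AssertionError. */
   static void forwardCatchingThrowable(List<String> sent, String command, Runnable check) {
      try {
         check.run();
      } catch (Throwable t) {
         // check failure ignored; the command is still forwarded
      }
      sent.add(command);
   }

   public static void main(String[] args) {
      Runnable failingCheck = () -> { throw new AssertionError("blocked command"); };
      List<String> sent = new ArrayList<>();
      try {
         forwardCatchingException(sent, "StateResponseCommand", failingCheck);
      } catch (AssertionError e) {
         // here the rebalance would hang: the command was never added to 'sent'
      }
      forwardCatchingThrowable(sent, "StateResponseCommand", failingCheck);
      System.out.println(sent); // [StateResponseCommand]
   }
}
```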
Eventually the put times out, and the next test also fails because it can't add a new manager:
{noformat}
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.test(IndexingDuringStateTransferTest.java:189)
at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.testPut(IndexingDuringStateTransferTest.java:78)
at org.infinispan.commons.test.TestNGLongTestsHook.run(TestNGLongTestsHook.java:24)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.util.concurrent.TimeoutException
{noformat}
{noformat}
org.infinispan.manager.EmbeddedCacheManagerStartupException: org.infinispan.commons.CacheException: Initial state transfer timed out for cache defaultcache on IndexingDuringStateTransferTest-NodeF
at org.infinispan.manager.DefaultCacheManager.internalStart(DefaultCacheManager.java:746)
at org.infinispan.manager.DefaultCacheManager.start(DefaultCacheManager.java:712)
at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:268)
at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:232)
at org.infinispan.test.MultipleCacheManagersTest.addClusterEnabledCacheManager(MultipleCacheManagersTest.java:225)
at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.test(IndexingDuringStateTransferTest.java:144)
at org.infinispan.query.blackbox.IndexingDuringStateTransferTest.testPutIgnoreReturnValue(IndexingDuringStateTransferTest.java:82)
{noformat}
[JBoss JIRA] (ISPN-11513) ComposedSegmentedLoadWriteStore should not iterate over segments in parallel
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11513?page=com.atlassian.jira.plugi... ]
Dan Berindei updated ISPN-11513:
--------------------------------
Fix Version/s: 10.1.6.Final
> ComposedSegmentedLoadWriteStore should not iterate over segments in parallel
> ----------------------------------------------------------------------------
>
> Key: ISPN-11513
> URL: https://issues.redhat.com/browse/ISPN-11513
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 10.1.5.Final, 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Will Burns
> Priority: Major
> Fix For: 11.0.0.Dev04, 11.0.0.Final, 10.1.6.Final
>
>
> {{ComposedSegmentedLoadWriteStore}} always parallelizes iterations, even when the consumer is a single thread doing a blocking iteration like {{cache.keySet().forEach(...)}}. This will use lots of blocking threads, and will crowd out single-key cache operations, which are more sensitive to latency (because there are usually multiple single-key operations for each user request).
> It would be interesting to extend the persistence SPI so that the store knows when the user requested parallel iteration, but in the meantime it's safer to iterate over the segments in sequence.
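A hedged sketch of the proposed change in plain Java (toy types, not the actual persistence SPI): iterating segments one after another keeps a blocking consumer such as cache.keySet().forEach(...) on a single thread, instead of fanning out across the blocking thread pool.

```java
import java.util.List;
import java.util.function.Consumer;

// Toy model of a segmented store: each inner list holds one segment's keys.
public class SegmentIterationDemo {

   /** Proposed: sequential iteration, one segment at a time. */
   static <K> void forEachSequential(List<List<K>> segments, Consumer<K> consumer) {
      for (List<K> segment : segments) {
         segment.forEach(consumer); // runs entirely on the caller's thread
      }
   }

   /** Reported behaviour: parallel iteration claims a thread per segment,
       crowding out latency-sensitive single-key operations. */
   static <K> void forEachParallel(List<List<K>> segments, Consumer<K> consumer) {
      segments.parallelStream().forEach(segment -> segment.forEach(consumer));
   }

   public static void main(String[] args) {
      List<List<String>> segments = List.of(List.of("a", "b"), List.of("c"));
      forEachSequential(segments, System.out::println);
   }
}
```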
[JBoss JIRA] (ISPN-11514) CacheMgmtInterceptor getNumberOfEntries should include expired entries
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/ISPN-11514?page=com.atlassian.jira.plugi... ]
Dan Berindei commented on ISPN-11514:
-------------------------------------
Need to check the other statistics as well. E.g. {{ClusterCacheStatsImpl}} doesn't use {{CacheMgmtInterceptor.getNumberOfEntries()}}; it calls {{cache.size()}} directly.
> CacheMgmtInterceptor getNumberOfEntries should include expired entries
> ----------------------------------------------------------------------
>
> Key: ISPN-11514
> URL: https://issues.redhat.com/browse/ISPN-11514
> Project: Infinispan
> Issue Type: Feature Request
> Components: API, Core
> Affects Versions: 11.0.0.Dev03
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 11.0.0.Final
>
>
> {{CacheMgmtInterceptor.getNumberOfEntries}} and {{CacheMgmtInterceptor.getNumberOfEntriesInMemory}} both require an iteration over the data container and/or stores in order to exclude expired entries.
> Since these are statistics, they don't have to be exact, so we can include expired entries until a read or the expiration reaper removes them from the cache. In fact, I would argue that it's more correct for {{CacheMgmtInterceptor.getNumberOfEntriesInMemory}} to include expired entries, because expired entries still use memory until they are removed.
> The most likely reason that the strategy for counting entries was changed with ISPN-7686 is that the {{AdvancedCache}} doesn't have a {{sizeIncludingExpired}} method. Since it's not possible to obtain the exact size of the cache with passivation enabled without iterating over the store to eliminate duplicates, I suggest instead estimating {{CacheMgmtInterceptor.getNumberOfEntries}} as the number of entries in the data container plus the number of entries in the store.