[JBoss JIRA] (ISPN-4070) Server plugin for RHQ needs extensions in management/monitoring
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-4070?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes updated ISPN-4070:
------------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Server plugin for RHQ needs extensions in management/monitoring
> ---------------------------------------------------------------
>
> Key: ISPN-4070
> URL: https://issues.jboss.org/browse/ISPN-4070
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JMX, reporting and management
> Affects Versions: 6.0.1.Final
> Reporter: Tomas Sykora
> Assignee: William Burns
> Labels: 630, 63gablocker
>
> InVM plugin for RHQ is generated automatically. However, the plugin for server is not and those are results of deeper analysis of RHQ ISPN server plugin (missing) possibilities.
> JCONSOLE is showing number of statistics, traits and operations which are not incorporated in RHQ-plugin (*server* plugin). Some are quite important (start and stop ops for cache for example) and some might be only additional, not necessary - but still pleasant, information about a cluster/cache.
> *Missing Cache traits:*
> cacheName
> version
> configurationAsProperties
> statisticsEnabled
> *Missing from Cache statistics:*
> averageRemoveTime
> *Missing cache operations:*
> START and STOP are missing!!
> synchronizeData, disconnectSource, recordKnownGlobalKeyset - rolling upgrades related ops
> *Missing cache manager traits:*
> definedCacheCount
> globalConfigurationAsProperties
> clusterMembers
> createdCacheCounts
> definedCacheNames
> clusterSize
> version
> runningCacheCount
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
10 years, 6 months
[JBoss JIRA] (ISPN-3594) Implementations of persistence.ParallelIterationTest fail randomly on all environmets
by Pedro Ruivo (JIRA)
[ https://issues.jboss.org/browse/ISPN-3594?page=com.atlassian.jira.plugin.... ]
Pedro Ruivo updated ISPN-3594:
------------------------------
Status: Pull Request Sent (was: Coding In Progress)
Git Pull Request: https://github.com/infinispan/infinispan/pull/2719
> Implementations of persistence.ParallelIterationTest fail randomly on all environmets
> -------------------------------------------------------------------------------------
>
> Key: ISPN-3594
> URL: https://issues.jboss.org/browse/ISPN-3594
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Core, Loaders and Stores
> Affects Versions: 6.0.0.Beta1, 7.0.0.Alpha1, 7.0.0.Alpha2
> Environment: RHEL{5, 6} && {x86_64, x64} && JDK6 && {Windows 2012 64bit && JDK6}
> Reporter: Anna Manukyan
> Assignee: Pedro Ruivo
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 7.0.0.Alpha5
>
>
> The tests extending from org.infinispan.persistence.ParallelIterationTest fail randomly on RHEL machines for JDK6.
> The error messages are:
> JdbcBinaryStoreParallelIterationTest.testParallelIteration
> {code}
> java.lang.AssertionError: expected [5] but found [2]
> at org.testng.Assert.fail(Assert.java:89)
> at org.testng.Assert.failNotEquals(Assert.java:489)
> at org.testng.Assert.assertEquals(Assert.java:118)
> at org.testng.Assert.assertEquals(Assert.java:365)
> at org.testng.Assert.assertEquals(Assert.java:375)
> at org.infinispan.persistence.ParallelIterationTest.runIterationTest(ParallelIterationTest.java:117)
> at org.infinispan.persistence.ParallelIterationTest.testParallelIteration(ParallelIterationTest.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> SingleFileStoreParallelIterationTest.testParallelIteration
> {code}
> java.lang.AssertionError: expected [5] but found [4]
> at org.testng.Assert.fail(Assert.java:89)
> at org.testng.Assert.failNotEquals(Assert.java:489)
> at org.testng.Assert.assertEquals(Assert.java:118)
> at org.testng.Assert.assertEquals(Assert.java:365)
> at org.testng.Assert.assertEquals(Assert.java:375)
> at org.infinispan.persistence.ParallelIterationTest.runIterationTest(ParallelIterationTest.java:117)
> at org.infinispan.persistence.ParallelIterationTest.testParallelIteration(ParallelIterationTest.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> LevelDBParallelIterationTest.testParallelIteration (from TestSuite)
> {code}
> expected [5] but found [4]
> Stacktrace
> java.lang.AssertionError: expected [5] but found [4]
> at org.testng.Assert.fail(Assert.java:94)
> at org.testng.Assert.failNotEquals(Assert.java:494)
> at org.testng.Assert.assertEquals(Assert.java:123)
> at org.testng.Assert.assertEquals(Assert.java:370)
> at org.testng.Assert.assertEquals(Assert.java:380)
> at org.infinispan.persistence.ParallelIterationTest.runIterationTest(ParallelIterationTest.java:117)
> at org.infinispan.persistence.ParallelIterationTest.testParallelIteration(ParallelIterationTest.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> at java.lang.reflect.Method.invoke(Method.java:619)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:273)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1170)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:640)
> at java.lang.Thread.run(Thread.java:853)
> {code}
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
10 years, 6 months
[JBoss JIRA] (ISPN-4525) Avoid contention on statistics collection in JPACacheStore
by Sanne Grinovero (JIRA)
Sanne Grinovero created ISPN-4525:
-------------------------------------
Summary: Avoid contention on statistics collection in JPACacheStore
Key: ISPN-4525
URL: https://issues.jboss.org/browse/ISPN-4525
Project: Infinispan
Issue Type: Enhancement
Security Level: Public (Everyone can see)
Components: Loaders and Stores
Reporter: Sanne Grinovero
Assignee: Sanne Grinovero
Fix For: 7.0.0.Alpha5
When testing on many cores the current approach in Stats becomes a bottleneck. This is easily improved by using the already backported LongAdder utility from Java8, even better we should provide an option to disable these stats.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
10 years, 6 months
[JBoss JIRA] (ISPN-4484) Outbound transfers can be cancelled by old CANCEL_STATE_TRANSFER command
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4484?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4484:
-----------------------------------------------
Alan Field <afield(a)redhat.com> changed the Status of [bug 1116969|https://bugzilla.redhat.com/show_bug.cgi?id=1116969] from ON_QA to VERIFIED
> Outbound transfers can be cancelled by old CANCEL_STATE_TRANSFER command
> ------------------------------------------------------------------------
>
> Key: ISPN-4484
> URL: https://issues.jboss.org/browse/ISPN-4484
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Core, State Transfer
> Affects Versions: 6.0.2.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.0.0.Alpha5
>
>
> This appeared during the 32-nodes elasticity test in the Hyperion environment.
> Just as apex947 left, it started a rebalance, which apex948 dutifully cancelled as it became the new coordinator. apex949 had already requested segments from apex959, so it sent a StateRequestCommand(CANCEL_STATE_TRANSFER) asynchronously to apex959. Then apex948 started a new rebalance, and apex949 asked apex959 for the same segments. When apex959 finally received the cancel request, it didn't check the topology id and it incorrectly cancelled the outbound transfer to apex949.
> The solution would be to verify the topology id in the CANCEL_STATE_TRANSFER command before cancelling the transfer. I also think we can avoid sending the cancel command completely in this case, and only send it as we are about to stop.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
10 years, 6 months
[JBoss JIRA] (ISPN-4426) Transaction replayed but not committed
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4426?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4426:
-----------------------------------------------
Alan Field <afield(a)redhat.com> changed the Status of [bug 1111644|https://bugzilla.redhat.com/show_bug.cgi?id=1111644] from ON_QA to VERIFIED
> Transaction replayed but not committed
> --------------------------------------
>
> Key: ISPN-4426
> URL: https://issues.jboss.org/browse/ISPN-4426
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: State Transfer
> Affects Versions: 7.0.0.Alpha4
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
> Labels: 63gablocker
> Fix For: 7.0.0.Alpha5
>
>
> Dist TX cache, node C is joining. In previous topology, entry is owned by A (primary) and B (backup). In new topology, primary ownership is transferred to C, B stays backup.
> 1. TX is prepared in old topology and is being committed, when topology changes
> 2. on C (the new owner), the TX info is received and later even the old entry
> 3. C receives the CommitCommand, therefore, it correctly replays the PrepareCommand.
> 4. When the entries are about to be committed, in TxInterceptor the transaction is found to be already completed as it has lower TxID.
> Result: the transaction is not being executed and stale data stay on the node (with my algortihm it eventually led to complete entry loss).
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
10 years, 6 months
[JBoss JIRA] (ISPN-4490) Members can miss the rebalance cancellation on coordinator change
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4490?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4490:
-----------------------------------------------
Alan Field <afield(a)redhat.com> changed the Status of [bug 1117948|https://bugzilla.redhat.com/show_bug.cgi?id=1117948] from ON_QA to VERIFIED
> Members can miss the rebalance cancellation on coordinator change
> -----------------------------------------------------------------
>
> Key: ISPN-4490
> URL: https://issues.jboss.org/browse/ISPN-4490
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Core, State Transfer
> Affects Versions: 7.0.0.Alpha4
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 7.0.0.Alpha5
>
>
> The new coordinator sends first a CH_UPDATE command to cancel the existing rebalance, and then a REBALANCE_START command to start a new rebalance. But the CH_UPDATE command is sent asynchronously, so it's possible for some members to receive it after the REBALANCE_START command.
> If that happens, that node will assume that it will receive the segments it requested for the previous rebalance. But with the ISPN-4484 fix, the provider node cancels the outbound transfer tasks when receiving a CH_UPDATE without a pendingCH, so the state requestor will never receive its segments.
> Even without the ISPN-4484 fix this is a problem, although less obvious. Between the provider node receiving the CH_UPDATE and the REBALANCE_START commands, it won't have the requestor in its write CH, so the requestor can miss transactions.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
10 years, 6 months
[JBoss JIRA] (ISPN-4154) Cancelled segment transfer causes future entry transfer to be ignored
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4154?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4154:
-----------------------------------------------
Alan Field <afield(a)redhat.com> changed the Status of [bug 1116963|https://bugzilla.redhat.com/show_bug.cgi?id=1116963] from ON_QA to VERIFIED
> Cancelled segment transfer causes future entry transfer to be ignored
> ---------------------------------------------------------------------
>
> Key: ISPN-4154
> URL: https://issues.jboss.org/browse/ISPN-4154
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: State Transfer
> Affects Versions: 7.0.0.Alpha1
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 7.0.0.Alpha5
>
>
> Distributed transactional cache.
> 1) Coordinator is gracefully leaving the cluster, sends a REBALANCE_START with topologyId 14, ST begins.
> 2) Node receives chunk from segment X, writes entry K=V to the container.
> 3) New coordinator jumps in with CH_UPDATE topology 16
> 4) Node receives CANCEL_STATE_TRANSFER and cancels transfer of segment X, invalidating K. In CommitManager, this operation is tracked and DiscardPolicy is set to DISCARD_STATE_TRANSFER for key K.
> 5) New coordinator starts rebalance with topology 17
> 6) Node starts new ST for segment X
> 7) Node receives the X: K=V, but in CommitManager it finds out that the policy is set to DISCARD_STATE_TRANSFER and ignores this update.
> Result: entry value is lost on some node.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
10 years, 6 months
[JBoss JIRA] (ISPN-4480) Messages sent to leavers can clog the JGroups bundler thread
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4480?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4480:
-----------------------------------------------
Alan Field <afield(a)redhat.com> changed the Status of [bug 1116965|https://bugzilla.redhat.com/show_bug.cgi?id=1116965] from ON_QA to VERIFIED
> Messages sent to leavers can clog the JGroups bundler thread
> ------------------------------------------------------------
>
> Key: ISPN-4480
> URL: https://issues.jboss.org/browse/ISPN-4480
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Core
> Affects Versions: 6.0.2.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
>
> In a stress test that repeatedly kills nodes while performing read/write operations, the TransferQueueBundler thread seems to spend a lot of time waiting for physical addresses:
> {noformat}
> 06:40:10,316 WARN [org.radargun.utils.Utils] (pool-5-thread-1) Stack for thread TransferQueueBundler,default,apex953-14666:
> java.lang.Thread.sleep(Native Method)
> org.jgroups.util.Util.sleep(Util.java:1504)
> org.jgroups.util.Util.sleepRandom(Util.java:1574)
> org.jgroups.protocols.TP.sendToSingleMember(TP.java:1685)
> org.jgroups.protocols.TP.doSend(TP.java:1670)
> org.jgroups.protocols.TP$TransferQueueBundler.sendBundledMessages(TP.java:2476)
> org.jgroups.protocols.TP$TransferQueueBundler.sendMessages(TP.java:2392)
> org.jgroups.protocols.TP$TransferQueueBundler.run(TP.java:2383)
> java.lang.Thread.run(Thread.java:744)
> {noformat}
> There are 2 bugs related to this already fixed in JGroups 3.5.0.Beta2+: JGRP-1814, JGRP-1815
> There is also a special case where the physical address could be removed from the cache too soon, exacerbating the effect of JGRP-1815: JGRP-1858
> We can work around the problem by changing the JGroups configuration:
> * TP.logical_addr_cache_expiration=86400000
> ** Only expire addresses after 1 day
> * TP.physical_addr_max_fetch_attempts=1
> ** Sleep for only 20ms waiting for the physical address (default 3 - 1500ms)
> * UNICAST3_conn_close_timeout=10000
> ** Drop the pending messages to leavers sooner
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
10 years, 6 months