[JBoss JIRA] (ISPN-3367) org.infinispan.statetransfer.ClusterTopologyManagerTest.testClusterRecoveryWithRebalance fails randomly
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-3367?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-3367:
-----------------------------------------------
Vitalii Chepeliuk <vchepeli(a)redhat.com> made a comment on [bug 988333|https://bugzilla.redhat.com/show_bug.cgi?id=988333]
Description of problem:
See Description of ISPN-3367
> org.infinispan.statetransfer.ClusterTopologyManagerTest.testClusterRecoveryWithRebalance fails randomly
> -------------------------------------------------------------------------------------------------------
>
> Key: ISPN-3367
> URL: https://issues.jboss.org/browse/ISPN-3367
> Project: Infinispan
> Issue Type: Bug
> Components: Core API
> Affects Versions: 6.0.0.Alpha1
> Environment: {w2k8r2 OracleJDK1.7}, { RHEL6_x86_64, OracleJDK1.7 and OpenJDK1.7}
> Reporter: Vitalii Chepeliuk
> Assignee: Mircea Markus
> Labels: testsuite_stability
>
> Error Message
> Thread already timed out waiting for event merge
> Stacktrace
> java.lang.IllegalStateException: Thread already timed out waiting for event merge
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:131)
> at org.infinispan.test.fwk.CheckPoint.triggerForever(CheckPoint.java:120)
> at org.infinispan.statetransfer.ClusterTopologyManagerTest.testClusterRecoveryWithRebalance(ClusterTopologyManagerTest.java:280)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> Links to Jenkins jobs:
> Windows, OracleJDK1.7>>>
> https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
> https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
> Linux, OracleJDK1.7>>>
> https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
> Linux, OpenJDK1.7>>>
> https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months
[JBoss JIRA] (ISPN-3367) org.infinispan.statetransfer.ClusterTopologyManagerTest.testClusterRecoveryWithRebalance fails randomly
by Vitalii Chepeliuk (JIRA)
Vitalii Chepeliuk created ISPN-3367:
---------------------------------------
Summary: org.infinispan.statetransfer.ClusterTopologyManagerTest.testClusterRecoveryWithRebalance fails randomly
Key: ISPN-3367
URL: https://issues.jboss.org/browse/ISPN-3367
Project: Infinispan
Issue Type: Bug
Components: Core API
Affects Versions: 6.0.0.Alpha1
Environment: {w2k8r2 OracleJDK1.7}, { RHEL6_x86_64, OracleJDK1.7 and OpenJDK1.7}
Reporter: Vitalii Chepeliuk
Assignee: Mircea Markus
Error Message
Thread already timed out waiting for event merge
Stacktrace
java.lang.IllegalStateException: Thread already timed out waiting for event merge
at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:131)
at org.infinispan.test.fwk.CheckPoint.triggerForever(CheckPoint.java:120)
at org.infinispan.statetransfer.ClusterTopologyManagerTest.testClusterRecoveryWithRebalance(ClusterTopologyManagerTest.java:280)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.TestRunner.privateRun(TestRunner.java:767)
at org.testng.TestRunner.run(TestRunner.java:617)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Links to Jenkins jobs:
Windows, OracleJDK1.7>>>
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
Linux, OracleJDK1.7>>>
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
Linux, OpenJDK1.7>>>
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
[JBoss JIRA] (ISPN-3318) Migrate data from one cache store to another
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-3318?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño commented on ISPN-3318:
----------------------------------------
{quote}We need to keep the old file cache store around for the backward compatibility (JDG requirement).{quote}
Mircea, Randall's suggestion is compatible with what you're saying. The idea is that, for anyone using the FCS with Infinispan 6.0+, if the FCS can detect the old FCS data structure, it could migrate the data on the fly to the new single-file structure. That is backwards compatible and keeps the old file cache store code around, but transforms its inner details.
{quote}We'd also need a migration tool for migrating other cache stores (e.g. LevelDB cache store from certain users) so +1 for the migration tool.{quote}
Indeed, that might be needed, but for the specific case of the new FCS upgrade, you might not even need that, which is even better. If people want to migrate data from cache store X to cache store Y on a whim, then yes, you need a migration tool.
Disclaimer: I have not yet investigated the feasibility of Randall's approach.
> Migrate data from one cache store to another
> --------------------------------------------
>
> Key: ISPN-3318
> URL: https://issues.jboss.org/browse/ISPN-3318
> Project: Infinispan
> Issue Type: Task
> Components: Loaders and Stores
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Fix For: 6.0.0.Final
>
>
> Find a generic way to transfer data from one cache store to another, which could involve different Infinispan versions. This is handy to migrate file cache store based users to single file cache store (ISPN-2806).
> Ideally, this should be added as a recipe for rolling upgrades.
[JBoss JIRA] (ISPN-3357) Insufficient owners with putIfAbsent during node join rebalance
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-3357?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura updated ISPN-3357:
-----------------------------------
Priority: Critical (was: Major)
> Insufficient owners with putIfAbsent during node join rebalance
> ---------------------------------------------------------------
>
> Key: ISPN-3357
> URL: https://issues.jboss.org/browse/ISPN-3357
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Attachments: 7c29bccb.log
>
>
> Here is the test scenario:
> * DIST numOwners=2, start with a 3-node cluster, then join 1 node during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries values on each node are:
> * node1: 20074
> * node2: 19888
> * node3: 20114
> * node4: 18885
> The total is 78961, so 1039 entry copies are missing. There were no errors on the HotRod client side, so 80000 copies (40000 entries x numOwners=2) should be there.
> Let's take a look at an example missing entry, hash(thread01key151) = 7c29bccb.
> Current CH: owners(7c29bccb) are [node1, node2]
> Pending CH: owners(7c29bccb) are [node1, node2, node4]
> Balanced CH: owners(7c29bccb) are [node1, node4]
> The event sequence is:
> * hotrod -> node1
> * node1 -> node2, node4
> * node2 committed the entry
> * node4 performed a clustered get before the write, got a value from node2, and will not commit the entry because it thinks the entry was not changed/created
> * node1 committed the entry
> * node2 invalidates the entry because it is no longer an owner
> As a result, owners(7c29bccb) contains only node1; node4 is missing. The entry may be lost completely in further rebalances if node4 is the donor for this segment.
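The event sequence above can be replayed with a toy model. This is only a sketch under stated assumptions, not Infinispan code: the class name `PutIfAbsentRebalanceSketch` and the `replay` method are hypothetical, and the joiner's "clustered get before write" is reduced to a check for whether another owner has already committed the value.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy replay of the ISPN-3357 event sequence; hypothetical names, not Infinispan code. */
final class PutIfAbsentRebalanceSketch {

    /** Returns the nodes that still hold the entry once the rebalance has ended. */
    static Set<String> replay(List<String> pendingOwners, List<String> balancedOwners, String joiner) {
        Set<String> holders = new LinkedHashSet<>();
        String primary = pendingOwners.get(0);

        // The primary forwards the write to the other owners in the pending CH.
        for (String backup : pendingOwners.subList(1, pendingOwners.size())) {
            if (backup.equals(joiner) && !holders.isEmpty()) {
                // Assumption taken from the report: the joiner does a clustered get,
                // sees the value already committed elsewhere, and skips its own commit.
                continue;
            }
            holders.add(backup);
        }
        holders.add(primary); // the primary commits last

        // Rebalance ends: nodes that are not owners in the balanced CH invalidate the entry.
        holders.retainAll(balancedOwners);
        return holders;
    }

    public static void main(String[] args) {
        Set<String> holders = replay(
                Arrays.asList("node1", "node2", "node4"), // pending CH owners
                Arrays.asList("node1", "node4"),          // balanced CH owners
                "node4");
        System.out.println("holders after rebalance: " + holders);
    }
}
```

With the owner lists from the report, only node1 is left holding the entry, although numOwners=2 calls for two copies.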
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura updated ISPN-3366:
-----------------------------------
Priority: Critical (was: Major)
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Attachments: ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is the test scenario:
> * DIST numOwners=2, start with a 4-node cluster, then shut down 1 node normally during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries values on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> The total is 79976 and the HotRod client received 11 errors, so 79976 + (11 * 2) = 79998 of the expected 80000 copies are accounted for. That means 1 entry (2 copies) is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The event sequence is:
> * hotrod -> node1
> * node1 forwards it to the primary owner, node4
> * node4 shuts down without processing the forwarded entry
> As a result, owners(574ff563) is [] (empty). This entry is completely lost without any errors.
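The copy accounting in this report can be checked with a few lines of arithmetic. The sketch below is not Infinispan code; the class name `EntryAccountingSketch` and the method `unaccountedCopies` are hypothetical, and it assumes, as the report does, that each failed client operation left none of its numOwners copies behind.

```java
/** Back-of-the-envelope copy accounting for the ISPN-3366 test run; hypothetical names. */
final class EntryAccountingSketch {

    /**
     * Returns the number of entry copies that cannot be accounted for.
     * Assumes every failed client operation stored none of its numOwners copies.
     */
    static long unaccountedCopies(long entries, int numOwners, long failedOps, long... perNodeCounts) {
        long observed = 0;
        for (long count : perNodeCounts) {
            observed += count;
        }
        long expected = (entries - failedOps) * numOwners;
        return expected - observed;
    }

    public static void main(String[] args) {
        // Figures from the report: 40000 entries, numOwners=2, 11 client errors.
        long copies = unaccountedCopies(40000, 2, 11, 26608, 26622, 26746, 0);
        System.out.println(copies + " copies unaccounted for, i.e. " + (copies / 2) + " entry");
    }
}
```

With the reported figures this gives 2 unaccounted copies, which matches the single entry the report describes as lost without any client-side error.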
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura updated ISPN-3366:
-----------------------------------
Attachment: ISPN-3366-logs.zip
Attached logs for entry 574ff563, node1-HotRodServerWorker-15 thread log (forwarder), and full node4 (forwardee) logs.
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Attachments: ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is the test scenario:
> * DIST numOwners=2, start with a 4-node cluster, then shut down 1 node normally during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries values on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> The total is 79976 and the HotRod client received 11 errors, so 79976 + (11 * 2) = 79998 of the expected 80000 copies are accounted for. That means 1 entry (2 copies) is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The event sequence is:
> * hotrod -> node1
> * node1 forwards it to the primary owner, node4
> * node4 shuts down without processing the forwarded entry
> As a result, owners(574ff563) is [] (empty). This entry is completely lost without any errors.
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Takayoshi Kimura (JIRA)
Takayoshi Kimura created ISPN-3366:
--------------------------------------
Summary: Data loss when entry forwarding to primary owner and primary owner shutdown
Key: ISPN-3366
URL: https://issues.jboss.org/browse/ISPN-3366
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache
Affects Versions: 6.0.0.Alpha1, 5.2.4.Final
Reporter: Takayoshi Kimura
Assignee: Dan Berindei
Looks like a problem in entry forwarding.
Here is the test scenario:
* DIST numOwners=2, start with a 4-node cluster, then shut down 1 node normally during load
* HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
After the test run, the numberOfEntries values on each node are:
* node1: 26608
* node2: 26622
* node3: 26746
* node4: 0
The total is 79976 and the HotRod client received 11 errors, so 79976 + (11 * 2) = 79998 of the expected 80000 copies are accounted for. That means 1 entry (2 copies) is completely missing.
Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
Current CH: owners(574ff563) are [node4, node1]
The event sequence is:
* hotrod -> node1
* node1 forwards it to the primary owner, node4
* node4 shuts down without processing the forwarded entry
As a result, owners(574ff563) is [] (empty). This entry is completely lost without any errors.
[JBoss JIRA] (ISPN-2896) org.infinispan.lucene.DirectoryOnMultipleCachesTest.verifyIntendedLockCachesUsage fails randomly
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-2896?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-2896:
-----------------------------------------------
Vitalii Chepeliuk <vchepeli(a)redhat.com> changed the Status of [bug 918450|https://bugzilla.redhat.com/show_bug.cgi?id=918450] from ON_QA to VERIFIED
> org.infinispan.lucene.DirectoryOnMultipleCachesTest.verifyIntendedLockCachesUsage fails randomly
> ------------------------------------------------------------------------------------------------
>
> Key: ISPN-2896
> URL: https://issues.jboss.org/browse/ISPN-2896
> Project: Infinispan
> Issue Type: Bug
> Components: Lucene Directory
> Affects Versions: 5.2.2.Final
> Reporter: Anna Manukyan
> Assignee: Sanne Grinovero
> Labels: stable_embedded_query
> Fix For: 5.3.0.CR1
>
>
> org.infinispan.lucene.DirectoryOnMultipleCachesTest.verifyIntendedLockCachesUsage fails randomly. The error message is:
> {code}
> java.lang.AssertionError
> at org.infinispan.lucene.DirectoryOnMultipleCachesTest.verifyIntendedLockCachesUsage(DirectoryOnMultipleCachesTest.java:96)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:715)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:907)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1237)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
> at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
> at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
> at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> {code}