[JBoss JIRA] (ISPN-3360) Tests from org.infinispan.query.blackbox.LocalCacheFSDirectoryTest fail on Windows environment
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-3360?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-3360:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
Integrated in master. Thanks!
> Tests from org.infinispan.query.blackbox.LocalCacheFSDirectoryTest fail on Windows environment
> ----------------------------------------------------------------------------------------------
>
> Key: ISPN-3360
> URL: https://issues.jboss.org/browse/ISPN-3360
> Project: Infinispan
> Issue Type: Bug
> Components: Querying
> Affects Versions: 6.0.0.Alpha1
> Environment: w2k8r2_x86_64 OracleJDK1.7
> Reporter: Vitalii Chepeliuk
> Assignee: Sanne Grinovero
> Labels: testsuite_stability
> Fix For: 6.0.0.Alpha3
>
> Attachments: LocalCacheFSDirectoryTest.log.zip
>
>
> When building Infinispan on Windows machines, the following tests in org.infinispan.query.blackbox.LocalCacheFSDirectoryTest fail with assertion errors:
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testLazyIteratorWithOffset
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testMaxResults
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testModified
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testMultipleResults
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testRemoved
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testSearchKeyTransformer
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testSearchManagerWithInstantiation
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testSetFilter
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testSetSort
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testSimple
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testTypeFiltering
> org.infinispan.query.blackbox.LocalCacheFSDirectoryTest.testUpdated
> Link to the Jenkins job: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/FUNC/job/e...
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-3417) Typo in EquivalentConcurrentHashMapV8
by Galder Zamarreño (JIRA)
Galder Zamarreño created ISPN-3417:
--------------------------------------
Summary: Typo in EquivalentConcurrentHashMapV8
Key: ISPN-3417
URL: https://issues.jboss.org/browse/ISPN-3417
Project: Infinispan
Issue Type: Bug
Reporter: Galder Zamarreño
Assignee: Mircea Markus
Priority: Minor
Fix For: 6.0.0.Beta1, 6.0.0.Final
Instead of:
{code}else if ((s | WAITER) == 0){code}
Should be:
{code}else if ((s & WAITER) == 0){code}
See discussion thread.
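To see why this matters, here is a minimal, self-contained illustration; the WAITER constant and the class around it are hypothetical stand-ins, not the actual EquivalentConcurrentHashMapV8 source. Since WAITER is a non-zero bit flag, (s | WAITER) can never be 0, so the condition written with | is always false, whereas the & form correctly tests whether the WAITER bit is clear.
{code}
// Hypothetical stand-ins for illustration only; not the Infinispan source.
public class WaiterBitDemo {
    // In ConcurrentHashMapV8-style code, WAITER is a single-bit status flag.
    static final int WAITER = 1 << 1;

    // Buggy form: (s | WAITER) is never 0, so this condition can never be true.
    static boolean noWaiterBuggy(int s) {
        return (s | WAITER) == 0;
    }

    // Intended form: true exactly when the WAITER bit is not set in s.
    static boolean noWaiterFixed(int s) {
        return (s & WAITER) == 0;
    }

    public static void main(String[] args) {
        System.out.println(noWaiterBuggy(0));      // false, although no waiter bit is set
        System.out.println(noWaiterFixed(0));      // true
        System.out.println(noWaiterFixed(WAITER)); // false, the waiter bit is set
    }
}
{code}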
[JBoss JIRA] (ISPN-3360) Tests from org.infinispan.query.blackbox.LocalCacheFSDirectoryTest fail on Windows environment
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-3360?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero commented on ISPN-3360:
---------------------------------------
[~vchepeli] thanks for the logs! With those I might have identified the problem, but I have no Windows machine to double-check it. Could you verify the pull request?
[JBoss JIRA] (ISPN-3357) Insufficient owners with putIfAbsent during rebalance
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-3357?page=com.atlassian.jira.plugin.... ]
Dan Berindei edited comment on ISPN-3357 at 8/15/13 2:57 PM:
-------------------------------------------------------------
It makes sense that this would happen during leave as well, as a leave is handled by first updating the CH to remove the leaver, and then adding a new owner just like during joins. The only difference is that the owners in the pending CH are almost always a superset of the owners in the current CH, so the write CH is the same as the pending CH and the final CH update doesn't actually change anything.
The write commands already have an {{ignorePreviousValue}} flag that is used in tx caches to skip the remote get and the value check (ISPN-3235). We can use the same flag in non-tx caches to make the command non-conditional on backup nodes.
There is a complication, though: say there is a state transfer in progress, the owners of k in the current CH are [A, B], and the owners in the pending CH are [C, B] (so the write CH owners are [A, B, C]). If A initiates a putIfAbsent(k, v) operation and B rejects the command because the topology changed (so the owners are now [C, B]), but C writes the value, then A will retry the command, C will do the previous value check, and the operation will fail - leaving B without an update.
So we have to ignore the previous value on C, but only if it's the same as the one we're trying to set. Otherwise we could overwrite a value set by another operation, and we wouldn't know whether to return {{null}} or the value existing on C. That means that the values need to implement {{equals()}} for this to work - but this shouldn't be a problem with HotRod clients.
One more possible source of inconsistencies is that if the originator notices a topology change at the end of the operation (in StateTransferInterceptor), it forwards the command again to all the owners. This happens without a lock, so it could overlap with another operation and leave a different value on each node. I plan to replace this with another topology check in EntryWrappingInterceptor, just before applying the update, to make sure that if state transfer missed the update, the command will be retried.
Finally, there is another case where we break consistency that I wanted to mention: if the primary owner dies, and there are multiple backup owners (either because numOwners >= 3 or because there is a join in progress), only one of the backups may execute the command. The originator would report a failure, but the cache would still be inconsistent, and retrying the putIfAbsent call wouldn't fix it. The inconsistency can also appear if a command is rejected only on some of the backup owners, and the originator dies before retrying. [~tkimura], I know you mentioned that you ignored the keys for which ISPN reported an error in your tests; I just wanted to make sure that this is acceptable.
was (Author: dan.berindei):
It makes sense that this would happen during leave as well, as a leave is handled by first updating the CH to remove the leaver, and then adding a new owner just like during joins. The only difference is that the owners in the pending CH are almost always a superset of the owners in the current CH, so the write CH is the same as the pending CH and the final CH update doesn't actually change anything.
The write commands already have an {{ignorePreviousValue}} flag that is used in tx caches to skip the remote get and the value check (ISPN-3235). We can use the same flag in non-tx caches to make the command non-conditional on backup nodes.
There is a complication, though: say there is a state transfer in progress, the owners of k in the current CH are [A, B], and the owners in the pending CH are [C, B] (so the write CH owners are [A, B, C]). If A initiates a putIfAbsent(k, v) operation and C rejects the command because the topology changed (so the owners are now [C, B]), but B writes the value, then A will retry the command, C will do the previous value check, and the operation will fail - leaving B without an update.
So we have to ignore the previous value on C, but only if it's the same as the one we're trying to set. Otherwise we could overwrite a value set by another operation, and we wouldn't know whether to return {{null}} or the value existing on C. That means that the values need to implement {{equals()}} for this to work - but this shouldn't be a problem with HotRod clients.
One more possible source of inconsistencies is that if the originator notices a topology change at the end of the operation (in StateTransferInterceptor), it forwards the command again to all the owners. This happens without a lock, so it could overlap with another operation and leave a different value on each node. I plan to replace this with another topology check in EntryWrappingInterceptor, just before applying the update, to make sure that if state transfer missed the update, the command will be retried.
Finally, there is another case where we break consistency that I wanted to mention: if the primary owner dies, and there are multiple backup owners (either because numOwners >= 3 or because there is a join in progress), only one of the backups may execute the command. The originator would report a failure, but the cache would still be inconsistent, and retrying the putIfAbsent call wouldn't fix it. The inconsistency can also appear if a command is rejected only on some of the backup owners, and the originator dies before retrying. [~tkimura], I know you mentioned that you ignored the keys for which ISPN reported an error in your tests; I just wanted to make sure that this is acceptable.
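The following is a rough sketch of the backup-owner behaviour described in the comment above. It is plain Java over a Map with hypothetical names, not Infinispan's actual command logic: on a retried putIfAbsent, the backup ignores the previous value only when it equals the value being written, which is why the values need a working equals().
{code}
import java.util.Map;

// Hypothetical sketch, not Infinispan code: how a backup owner could decide whether
// a retried putIfAbsent should be treated as already applied.
public final class BackupPutIfAbsentSketch {
    private BackupPutIfAbsentSketch() {}

    static <K, V> boolean applyOnBackup(Map<K, V> store, K key, V newValue,
                                        boolean retriedAfterTopologyChange) {
        V existing = store.get(key);
        if (existing == null) {
            store.put(key, newValue);   // normal conditional write succeeds
            return true;
        }
        // Ignore the previous value only if it is exactly the value this command
        // wrote on its first attempt; otherwise another operation stored it and
        // overwriting would be wrong. This relies on V implementing equals().
        if (retriedAfterTopologyChange && existing.equals(newValue)) {
            return true;                // treat the retry as already applied
        }
        return false;                   // conditional write fails
    }
}
{code}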
> Insufficient owners with putIfAbsent during rebalance
> -----------------------------------------------------
>
> Key: ISPN-3357
> URL: https://issues.jboss.org/browse/ISPN-3357
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Attachments: 7c29bccb.log, ISPN-3357-full-logs-leave.zip
>
>
> Here is the test scenario:
> * DIST numOwners=2, starting with a 3-node cluster, then 1 node joins during the load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries on each node are:
> * node1: 20074
> * node2: 19888
> * node3: 20114
> * node4: 18885
> The total is 78961, so 1039 entry copies are missing. There were no errors on the HotRod client side, so all 80000 copies (40000 entries x numOwners=2) should be present.
> Let's take a look at an example of a missing entry: hash(thread01key151) = 7c29bccb.
> Current CH: owners(7c29bccb) are [node1, node2]
> Pending CH: owners(7c29bccb) are [node1, node2, node4]
> Balanced CH: owners(7c29bccb) are [node1, node4]
> The sequence of events is:
> * hotrod -> node1
> * node1 -> node2, node4
> * node2 committed entry
> * node4 performed a clustered get before the write, got a value from node2, and did not commit the entry because it considered the entry not changed/created
> * node1 committed entry
> * node2 invalidates the entry because it's no longer an owner
> As a result, the only owner of 7c29bccb is node1; node4 is missing the entry. The entry may be lost completely by a further rebalance in which node4 is the donor for this segment.
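For reference, here is a stripped-down sketch of the client side of the load described above, using the HotRod Java client. The server address, port, and value format are placeholders, the key format and counts are taken from the description; the cluster setup and the node join during load are not shown, and the exact client API may differ between Infinispan versions.
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

// Hypothetical reproduction sketch of the scenario above: 40 threads issuing
// putIfAbsent for 40000 distinct keys through a single RemoteCacheManager.
public class PutIfAbsentLoad {
    public static void main(String[] args) throws Exception {
        ConfigurationBuilder cb = new ConfigurationBuilder();
        cb.addServer().host("node1").port(11222);           // placeholder server address
        RemoteCacheManager rcm = new RemoteCacheManager(cb.build());
        RemoteCache<String, String> cache = rcm.getCache(); // default cache

        int threads = 40;
        int entriesPerThread = 1000;                        // 40 x 1000 = 40000 entries total
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger conflicts = new AtomicInteger();

        for (int t = 0; t < threads; t++) {
            final int threadId = t;
            pool.submit(() -> {
                for (int i = 0; i < entriesPerThread; i++) {
                    String key = String.format("thread%02dkey%d", threadId, i);
                    // putIfAbsent returns the previous value, or null if it stored the entry
                    if (cache.putIfAbsent(key, "value-" + key) != null) {
                        conflicts.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
        System.out.println("putIfAbsent conflicts reported to the client: " + conflicts.get());
        rcm.stop();
    }
}
{code}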