[infinispan-issues] [JBoss JIRA] (ISPN-2376) KeyAffinityServiceImpl.getKeyForAddress() seems to loop forever when DefaultConsistentHash is created for the non-local node owner

Dan Berindei (JIRA) jira-events at lists.jboss.org
Wed Oct 10 10:20:03 EDT 2012


    [ https://issues.jboss.org/browse/ISPN-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725359#comment-12725359 ] 

Dan Berindei commented on ISPN-2376:
------------------------------------

Quoting myself from the dev list:

I missed it earlier, but you have configured numSegments = 1, which means there will only be one primary owner for all the keys in the cache. Since node-1 is still alive, it will remain the primary owner for the single segment, and node-0 will never become primary owner for any key. (It will be a backup owner, but KeyAffinityService only looks for primary owners.)
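
To make the above concrete, here is a minimal, self-contained sketch of the ownership table from the report (segmentOwners={0: 0 1} with members=[node-1/web, node-0/web]). It does not use Infinispan classes. SingleSegmentModel, primaryOwner and primaryOwners are hypothetical names, and the key-to-segment mapping is a simplifying assumption (the real DefaultConsistentHash uses its own hash function). It only illustrates that with one segment every key resolves to the same primary owner.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a consistent hash; NOT the Infinispan implementation.
final class SingleSegmentModel {

    // segmentOwners.get(s) lists the owners of segment s; element 0 is the primary owner.
    static String primaryOwner(List<List<String>> segmentOwners, Object key) {
        // Assumption: a plain non-negative hashCode modulo the segment count.
        int segment = (key.hashCode() & Integer.MAX_VALUE) % segmentOwners.size();
        return segmentOwners.get(segment).get(0);
    }

    // Collects every node that is the primary owner of at least one segment.
    static Set<String> primaryOwners(List<List<String>> segmentOwners) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> owners : segmentOwners) {
            result.add(owners.get(0));
        }
        return result;
    }

    public static void main(String[] args) {
        // numSegments=1, numOwners=2: node-1/web is primary, node-0/web is backup.
        List<List<String>> segmentOwners =
                Arrays.asList(Arrays.asList("node-1/web", "node-0/web"));

        // Whatever key is tried, the single segment maps it to node-1/web.
        for (int i = 0; i < 5; i++) {
            System.out.println("key-" + i + " -> " + primaryOwner(segmentOwners, "key-" + i));
        }
        // node-0/web never shows up here, so a search for a key it owns as primary cannot finish.
        System.out.println("primary owners: " + primaryOwners(segmentOwners));
    }
}
```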

We probably need to add a check to the KeyAffinityServiceImpl constructor and abort if there are not enough segments for each node to be a primary owner (i.e. numSegments < numNodes). In the meantime, I think you can just increase numSegments in your configuration so that it's greater than the number of nodes and it should work.
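
A rough sketch of the kind of check meant here, written as standalone Java rather than as a patch to KeyAffinityServiceImpl. The class name SegmentCountCheck, the method requireEnoughSegments and the exception type are all hypothetical; the actual fix may well look different.

```java
// Hypothetical fail-fast guard; not taken from the Infinispan code base.
final class SegmentCountCheck {

    // Aborts when there are fewer segments than nodes, because in that case
    // at least one node cannot be the primary owner of any segment.
    static void requireEnoughSegments(int numSegments, int numNodes) {
        if (numSegments < numNodes) {
            throw new IllegalStateException(
                    "numSegments=" + numSegments + " is smaller than the number of nodes ("
                    + numNodes + "); some nodes can never be primary owners");
        }
    }

    public static void main(String[] args) {
        // The configuration from the report: one segment, two nodes.
        try {
            requireEnoughSegments(1, 2);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        // A segment count above the node count passes the check.
        requireEnoughSegments(60, 2);
        System.out.println("numSegments=60 with 2 nodes accepted");
    }
}
```

Failing at construction time would turn the endless loop in getKeyForAddress() into an immediate, explainable error.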
                
> KeyAffinityServiceImpl.getKeyForAddress() seems to loop forever when DefaultConsistentHash is created for the non-local node owner
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ISPN-2376
>                 URL: https://issues.jboss.org/browse/ISPN-2376
>             Project: Infinispan
>          Issue Type: Feature Request
>          Components: Core API
>    Affects Versions: 5.2.0.Beta1
>            Reporter: Scott Marlow
>            Assignee: Mircea Markus
>             Fix For: 5.2.0.CR1
>
>
> I instrumented KeyAffinityServiceImpl and DefaultConsistentHash to show why KeyAffinityServiceImpl is looping forever when running the AS7 clustered tests with some recent changes that aren't committed yet.  We are hoping to get through this failure so we can get clustered tests running again more completely on our continuous test server (lightning).
> We have two nodes running in the AS cluster, node-0/web and node-1/web.  
> In my recent test run, I stopped the test after it was stuck for a while.  Below is some of the instrumented logging output.
> {quote}
> KeyAffinityServiceImpl interestedInAddress() check, for address: node-1/web, filter.contains(address) returns false, filter contents [node-0/web]
> .
> .
> .
> KeyAffinityServiceImpl.getKeyForAddress() loop # 1455775 will loop again since result is null, queue [], address node-0/web
> {quote}
> We are using address "node-1/web" because that is passed into the DefaultConsistentHash constructor segmentOwners parameter (element zero). 
> Later, address=node-1/web is the primary owner of the consistent hash (hash=DefaultConsistentHash{numSegments=1, numOwners=2, members=[node-1/web, node-0/web], segmentOwners={0: 0 1}).
> I'm still collecting information and want to get a little more.  
> Let me know if there is anything that you would like to see.
