[jboss-jira] [JBoss JIRA] (JGRP-1957) S3_PING: Nodes never removed from .list file

Nick Sawadsky (JIRA) issues at jboss.org
Wed Sep 2 13:34:05 EDT 2015


    [ https://issues.jboss.org/browse/JGRP-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104693#comment-13104693 ] 

Nick Sawadsky commented on JGRP-1957:
-------------------------------------

Yes, I've tested this workaround and it seems to work:

- reduce the logical_addr_cache_expiration to 1 second
- reduce the logical_addr_cache_reaper_interval to 10 seconds
- increase the min_interval and max_interval for MERGE3 to 30 and 60 seconds, respectively
- set remove_all_files_on_view_change to true

With these settings, the expired nodes do seem to get removed from the file as expected. It would be nice to have a fix that did not require users to make these config changes, but I'm fine with pushing the issue to 3.6.6.
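
For anyone wanting to try the same workaround, the four settings listed above might look roughly like the following in a JGroups 3.6.x XML stack. This is only a sketch of the relevant fragment: it assumes the first two attributes are set on the TCP transport and remove_all_files_on_view_change on S3_PING, that all time values are given in milliseconds, and it leaves out every other protocol and attribute (bucket location, credentials, and so on).

```xml
<!-- Fragment only: the rest of the stack and all unrelated attributes are omitted. -->
<!-- Assumption: time values are in milliseconds, so 1 s = 1000 and 10 s = 10000. -->
<TCP
    logical_addr_cache_expiration="1000"
    logical_addr_cache_reaper_interval="10000" />

<S3_PING
    remove_all_files_on_view_change="true" />

<!-- 30 s and 60 s, so MERGE3 triggers discovery far less often than the reaper runs. -->
<MERGE3
    min_interval="30000"
    max_interval="60000" />
```

The idea behind these values is that removable cache entries expire and get reaped well before MERGE3 next triggers a discovery that reads the .list file back in.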


> S3_PING: Nodes never removed from .list file
> --------------------------------------------
>
>                 Key: JGRP-1957
>                 URL: https://issues.jboss.org/browse/JGRP-1957
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 3.6.4
>         Environment: JGroups client running on Mac OS X - Yosemite
> JDK 1.7.71
>            Reporter: Nick Sawadsky
>            Assignee: Bela Ban
>            Priority: Minor
>             Fix For: 3.6.5
>
>
> I'm not 100% sure, but it seems like there might be a defect here.
> I'm using TCP, S3_PING, and MERGE3. 
> I've set logical_addr_cache_max_size to 2 for testing purposes, although I don't think the value of this setting affects my test results.
> I start a single node, node A. Then I start a second node, node B.
> I then repeatedly shut down and restart node B.
> Each time node B starts, a new row is added to the .list file stored in S3. 
> But even if I continue this process for 15 minutes, old rows are never removed from the .list file, so it continues to grow in size.
> I've read the docs and mailing list threads, so I'm aware that the list is not updated immediately when a member leaves. But I was expecting the following: when a view change occurs, nodes no longer in the view are marked for removal (line 2193 of TP.java); then, once logical_addr_cache_expiration has elapsed and the reaper has run, the expired cache entries are purged from the file the next time a new node joins.
> I dug into the code a bit, and what seems to be happening is that the MERGE3 protocol periodically generates a FIND_MBRS event. In response, S3_PING retrieves the membership from the .list file, which still includes the expired nodes, and all of these members are then re-added to the logical address cache (line 157 of S3_PING.java, line 533 of Discovery.java, line 2263 of TP.java).
> So expired nodes are continually re-added to the logical address cache, preventing them from ever being reaped.


