[jboss-jira] [JBoss JIRA] (JGRP-2050) S3_PING: Nodes never removed from .list file

Bela Ban (JIRA) issues at jboss.org
Mon Apr 18 06:59:00 EDT 2016


     [ https://issues.jboss.org/browse/JGRP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bela Ban resolved JGRP-2050.
----------------------------
    Fix Version/s: 4.0
       Resolution: Rejected


Use of {{logical_addr_cache_max_size}} fixes this, see my last comment on JGRP-1957. Please reopen if the issue still persists.
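
For illustration, the attribute goes on the transport element (TCP in the reporter's configuration below). This is a minimal sketch, not the exact setting from JGRP-1957: the value 20 is an assumed placeholder, and the remaining TCP attributes are omitted for brevity. As far as I understand it, the cache only starts evicting entries once it grows past this size, so a value close to the expected cluster size is the thing to experiment with.

```xml
<!-- Sketch only: logical_addr_cache_max_size="20" is an assumed example value. -->
<!-- The two other cache attributes are copied from the reporter's config. -->
<TCP
    bind_port="7800"
    logical_addr_cache_max_size="20"
    logical_addr_cache_expiration="1000"
    logical_addr_cache_reaper_interval="10000"
/>
```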

> S3_PING: Nodes never removed from .list file
> --------------------------------------------
>
>                 Key: JGRP-2050
>                 URL: https://issues.jboss.org/browse/JGRP-2050
>             Project: JGroups
>          Issue Type: Feature Request
>    Affects Versions: 3.6.8
>            Reporter: Mitchell Ackerman
>            Assignee: Bela Ban
>            Priority: Minor
>             Fix For: 4.0
>
>
> Unfortunately I seem to be running into the same or similar issue as JGRP-1957, even though I've updated to JGroups 3.6.8 and am using the settings you suggest in that (and other) posts.
> I'm running in AWS using S3_PING, JDK 1.8.0_66, JGroups 3.6.8, Tomcat 8.0.28.
> After terminating servers, mostly non-coordinators, I'm left with an S3 bucket with lots of zombies (there are only 2 active members). Below are the file after the system has been stable for over an hour, and my JGroups config file.
> Stepping through the code, I have confirmed that the scenario is the same as described in JGRP-1957. Upon a view change, the new (correct) member list is written to S3, but it is then overwritten with all the old members. When the old members are added back to the logical_addr_cache, they all have their removable field set to false, so all subsequent evictions skip these members and they are never removed.
> thanks, Mitchell
> ip-10-89-1-26-8729 72597f74-8a10-04fb-b397-22a3ed35da84 10.89.1.26:7800 F
> ip-10-89-0-18-38996 a5325932-e9cd-b281-b367-e2d86845aa75 10.89.0.18:7800 F
> ip-10-89-1-62-4868 ef73921a-2265-50a8-95d4-ebb8cae96944 10.89.1.62:7800 T
> ip-10-89-1-27-11915 5a0b4a26-b542-56f2-801a-420b5d7dbf34 10.89.1.27:7800 F
> ip-10-89-1-19-2542 c30c294d-69b0-b6ca-7010-bf89d1eb8f6f 10.89.1.19:7800 F
> ip-10-89-0-62-56914 fa2262c3-9097-7101-b225-24d8a52d905e 10.89.0.62:7800 F
> ip-10-89-0-28-32680 5d03124f-b061-becb-d793-6067bf0d7945 10.89.0.28:7800 F
> ip-10-89-1-26-51248 07cc18aa-381b-fb5d-0ad6-0612f7a5e9bb 10.89.1.26:7800 F
> ip-10-89-1-27-39755 1f9be940-2228-2181-ef80-4a83d319a2b3 10.89.1.27:7800 F
> ip-10-89-0-28-41919 4ab543f9-712e-645d-2f20-05304c98a23b 10.89.0.28:7800 F
> ip-10-89-1-27-10428 d5b0cb38-75e0-b3e1-c053-66b053b0fb05 10.89.1.27:7800 F
> my JGroups config file is:
> <?xml version="1.0" encoding="UTF-8"?>
> <config>
> <TCP
> bind_port="7800"
> port_range="30"
> recv_buf_size="20000000"
> send_buf_size="1000000"
> max_bundle_size="64000"
> max_bundle_timeout="1000"
> sock_conn_timeout="2000"
> enable_diagnostics="false"
> timer_type="new"
> timer.min_threads="4"
> timer.max_threads="10"
> timer.keep_alive_time="3000"
> timer.queue_max_size="1000"
> timer.wheel_size="200"
> timer.tick_time="50"
> thread_pool.enabled="true"
> thread_pool.min_threads="2"
> thread_pool.max_threads="100"
> thread_pool.keep_alive_time="60000"
> thread_pool.queue_enabled="true"
> thread_pool.queue_max_size="100000"
> thread_pool.rejection_policy="discard"
> oob_thread_pool.enabled="true"
> oob_thread_pool.min_threads="10"
> oob_thread_pool.max_threads="100"
> oob_thread_pool.keep_alive_time="60000"
> oob_thread_pool.queue_enabled="false"
> oob_thread_pool.queue_max_size="100"
> oob_thread_pool.rejection_policy="discard"
> logical_addr_cache_expiration="1000"
> logical_addr_cache_reaper_interval="10000"
> />
> <S3_PING location="bob-s3-ping-dev" remove_all_files_on_view_change="true" remove_old_coords_on_view_change="true"/>
> <MERGE3 max_interval="60000" min_interval="30000"/>
> <FD_SOCK/>
> <FD timeout="3000" max_tries="5"/>
> <VERIFY_SUSPECT timeout="2000"/>
> <pbcast.NAKACK use_mcast_xmit="false" retransmit_timeout="300,600,1200,2400,4800" discard_delivered_msgs="true"/>
> <UNICAST3/>
> <pbcast.STABLE stability_delay="1500" desired_avg_gossip="50000" max_bytes="2m"/>
> <pbcast.GMS print_local_addr="false" join_timeout="2500" max_bundling_time="50" view_bundling="true" max_join_attempts="${jgroups_max_join_attempts}"/>
> <pbcast.STATE_TRANSFER />
> <!-- top -->
> <!-- /\ down -->
> <!-- \/ up -->
> </config>


