[jboss-jira] [JBoss JIRA] Created: (JGRP-1155) Socket Exceptions on Coordinator after adding start_port in FD_SOCK

Ken Michie (JIRA) jira-events at lists.jboss.org
Tue Feb 16 18:15:10 EST 2010


Socket Exceptions on Coordinator after adding start_port in FD_SOCK
-------------------------------------------------------------------

                 Key: JGRP-1155
                 URL: https://jira.jboss.org/jira/browse/JGRP-1155
             Project: JGroups
          Issue Type: Bug
    Affects Versions: 2.6.13, 2.6.10
         Environment: RHEL 5.2.
135.9.147.170 is the coordinator, 135.9.147.158 is the other member
Connection is over TCP
JGroups channel is on port 7802
FD_SOCK is on port 7803
iptables is running, which is why I need to specify a port; a randomly selected port would otherwise be blocked
            Reporter: Ken Michie
            Assignee: Bela Ban


I was able to reproduce it with the Draw program on both 2.6.10.GA and 2.6.13.GA.  Here is my command:
java -cp .:log4j.properties:log4j.jar:jgroups-all.jar:commons-logging.jar org.jgroups.demos.Draw -props kenTcp.xml

log4j.properties was set to log WARN and above for org.jgroups.

I tailed the jgroups.log file and, ONLY on the coordinator, eventually saw these messages printed every so often:

2010-02-16 15:49:43,144 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:50167 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Broken pipe
2010-02-16 15:49:43,144 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:50167 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Broken pipe
2010-02-16 15:50:07,624 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:38803 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Socket closed
2010-02-16 15:50:07,624 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:38803 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Socket closed
2010-02-16 15:50:33,608 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:44940 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Socket closed
2010-02-16 15:50:33,608 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:44940 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Socket closed
2010-02-16 15:50:33,611 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:55279 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Broken pipe
2010-02-16 15:50:33,611 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:55279 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Broken pipe
2010-02-16 15:50:35,115 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:42406 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Broken pipe
2010-02-16 15:50:35,115 [ConnectionTable.Connection.Sender local_addr=135.9.147.170:7802 [135.9.147.170:42406 - 135.9.147.158:7803],DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - failed sending data to 135.9.147.158:7803: java.net.SocketException: Broken pipe
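
The log lines above all follow the same layout, so the local ephemeral port, the destination, and the exception message can be pulled out mechanically. The following is a minimal sketch, not part of the original report: the class name SendFailureSummary and the regular expression are illustrative assumptions that match only the line format shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SendFailureSummary {
    // One of the log lines from this report, used as sample input.
    static final String SAMPLE =
        "2010-02-16 15:49:43,144 [ConnectionTable.Connection.Sender "
        + "local_addr=135.9.147.170:7802 [135.9.147.170:50167 - 135.9.147.158:7803],"
        + "DrawGroupDemo,135.9.147.170:7802] ERROR org.jgroups.blocks.ConnectionTable  - "
        + "failed sending data to 135.9.147.158:7803: java.net.SocketException: Broken pipe";

    // Matches the "[local:port - remote:port]" pair, then the trailing
    // "failed sending data to <dest>: <exception class>: <message>" part.
    private static final Pattern LINE = Pattern.compile(
        "\\[(\\d+\\.\\d+\\.\\d+\\.\\d+):(\\d+) - (\\d+\\.\\d+\\.\\d+\\.\\d+):(\\d+)\\]"
        + ".*failed sending data to (\\S+): (\\S+): (.*)$");

    /** Returns {local port, remote host:port, exception message}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(2), m.group(3) + ":" + m.group(4), m.group(7) };
    }

    public static void main(String[] args) {
        String[] r = parse(SAMPLE);
        System.out.println("local port " + r[0] + " -> " + r[1] + " (" + r[2] + ")");
    }
}
```

Running parse over every line of jgroups.log would show whether all failures target the same remote port while the local port keeps changing.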

You can see it uses a different ephemeral port each time. The output of "netstat -an | grep 780" on the coordinator (FD_SOCK is on 7803, the channel is on 7802, and 135.9.147.158 is the other member) shows that it is using the wrong socket:

tcp        0      0 ::ffff:135.9.147.170:7802   :::*                        LISTEN
tcp        0      0 ::ffff:135.9.147.170:7803   :::*                        LISTEN
tcp        0      0 ::ffff:135.9.147.170:7803   ::ffff:135.9.147.158:41549  ESTABLISHED
tcp        0      0 ::ffff:135.9.147.170:57478  ::ffff:135.9.147.158:7802   ESTABLISHED
tcp        0      0 ::ffff:135.9.147.170:46929  ::ffff:135.9.147.158:7803   ESTABLISHED
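
To make the netstat output above easier to read, each line can be labelled by which configured port it touches. This is a rough sketch under stated assumptions: the class NetstatClassifier and its labels are hypothetical, the column order is the one shown in the output above, and the classification is by port number only. It cannot tell which JGroups component owns a socket.

```java
class NetstatClassifier {
    static final int CHANNEL_PORT = 7802;  // TCP start_port from the report
    static final int FD_SOCK_PORT = 7803;  // FD_SOCK start_port from the report

    /** The port is whatever follows the last ':' in a netstat address column; "*" maps to -1. */
    static int portOf(String addr) {
        String p = addr.substring(addr.lastIndexOf(':') + 1);
        return p.equals("*") ? -1 : Integer.parseInt(p);
    }

    /** Labels one netstat line by the configured port it involves. */
    static String classify(String line) {
        // Columns: proto, recv-q, send-q, local address, foreign address, state
        String[] f = line.trim().split("\\s+");
        if (f.length < 6) {
            return "unparsed";
        }
        int local = portOf(f[3]);
        int remote = portOf(f[4]);
        if (f[5].equals("LISTEN")) {
            return "listening on " + local;
        }
        if (local == FD_SOCK_PORT) return "inbound to local FD_SOCK port";
        if (local == CHANNEL_PORT) return "inbound to local channel port";
        if (remote == CHANNEL_PORT) return "outbound to remote channel port";
        if (remote == FD_SOCK_PORT) return "outbound to remote FD_SOCK port";
        return "other";
    }

    public static void main(String[] args) {
        String line = "tcp        0      0 ::ffff:135.9.147.170:46929  ::ffff:135.9.147.158:7803   ESTABLISHED";
        System.out.println(classify(line));
    }
}
```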

Protocol stack in XML (on each member, the TCPPING initial_hosts value points to the other JGroups member):
<config>
    <TCP start_port="7802"
         loopback="false"
         recv_buf_size="20000000"
         send_buf_size="640000"
         discard_incompatible_packets="false"
         max_bundle_size="128000"
         max_bundle_timeout="100"
         use_incoming_packet_handler="true"
         enable_bundling="true"
         use_send_queues="true"
         sock_conn_timeout="300"
         skip_suspected_members="true"

         use_concurrent_stack="true"

         thread_pool.enabled="true"
         thread_pool.min_threads="2"
         thread_pool.max_threads="10"
         thread_pool.keep_alive_time="5000"
         thread_pool.queue_enabled="false"
         thread_pool.queue_max_size="1000"
         thread_pool.rejection_policy="run"

         oob_thread_pool.enabled="true"
         oob_thread_pool.min_threads="2"
         oob_thread_pool.max_threads="10"
         oob_thread_pool.keep_alive_time="5000"
         oob_thread_pool.queue_enabled="false"
         oob_thread_pool.queue_max_size="1000"
         oob_thread_pool.rejection_policy="run"/>

    <TCPPING timeout="3000"
             initial_hosts="135.9.147.170[7800]"
             port_range="4"/>
    <MERGE2 max_interval="100000"
              min_interval="20000"/>
    <FD_SOCK start_port="7803"/>
    <FD timeout="10000" max_tries="5" shun="true"/>
    <VERIFY_SUSPECT timeout="1500"  />
    <BARRIER />
    <pbcast.NAKACK
                   use_mcast_xmit="false" gc_lag="0"
                   retransmit_timeout="300,600,1200,2400,4800"
                   discard_delivered_msgs="true"/>
    <UNICAST timeout="300,600,1200" />
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
                   max_bytes="400000"/>
    <VIEW_SYNC avg_send_interval="60000"/>
    <pbcast.GMS print_local_addr="true" join_timeout="3000"
                shun="true"
                view_bundling="true"/>
    <FC max_credits="2000000"
        min_threshold="0.10"/>
    <FRAG2 frag_size="60000"  />
    <pbcast.STREAMING_STATE_TRANSFER/>
    <!-- <pbcast.STATE_TRANSFER/> -->
</config>
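
One thing that may be worth checking in the configuration above is which ports TCPPING discovery can reach. The sketch below is an assumption, not a confirmed diagnosis: it supposes that port_range="4" with initial_hosts port 7800 means four consecutive ports starting at 7800. The exact bounds may differ between JGroups versions, and the class DiscoveryPortCheck is purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class DiscoveryPortCheck {
    /**
     * Ports start .. start+range-1. This half-open reading of port_range is an
     * assumption; check the TCPPING documentation for the version in use.
     */
    static List<Integer> probedPorts(int start, int range) {
        List<Integer> ports = new ArrayList<>();
        for (int p = start; p < start + range; p++) {
            ports.add(p);
        }
        return ports;
    }

    public static void main(String[] args) {
        // Values taken from the TCPPING, TCP and FD_SOCK elements in the report.
        List<Integer> probed = probedPorts(7800, 4);
        System.out.println("probed ports: " + probed);
        System.out.println("includes channel port 7802: " + probed.contains(7802));
        System.out.println("includes FD_SOCK port 7803: " + probed.contains(7803));
    }
}
```

If the probed range does cover the FD_SOCK port, narrowing the range or pointing initial_hosts directly at the channel port would be one way to test whether the errors stop.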



