[infinispan-dev] [Fwd: [Fwd: Sometimes TCP responses not getting through on localhost]]

Galder Zamarreno galder.zamarreno at redhat.com
Wed Jul 8 12:32:52 EDT 2009


Zipped version :)

-------- Original Message --------
Subject: [Fwd: Sometimes TCP responses not getting through on localhost]
Date: Wed, 08 Jul 2009 17:52:14 +0200
From: Galder Zamarreno <galder.zamarreno at redhat.com>
To: infinispan-dev at lists.jboss.org <infinispan-dev at lists.jboss.org>

FYI: not sure if you've seen anything similar, but when running the DIST
tests I randomly get stalls like the one below. I'm sending this to the
rest of the team in case anyone has more info.

Vladimir is planning to have a look at them at some point.

-------- Original Message --------
Subject: Sometimes TCP responses not getting through on localhost
Date: Tue, 07 Jul 2009 09:10:26 +0200
From: Galder Zamarreno <galder.zamarreno at redhat.com>
To: Vladimir Blagojevic <vladimir.blagojevic at jboss.com>

Hi Vladimir,

I'm running one of the Infinispan distribution tests locally and, from
time to time, the test stalls waiting for a response from one of the
nodes. See the attached Infinispan and JGroups TRACE log.

In particular, focus on request id=1246949535122.

First, 47089 sends a clustered get request:
2009-07-07 08:52:15,122 5920  TRACE [org.jgroups.protocols.TCP] (main:)
sending msg to null, src=localhost.localdomain-47089, headers are
MsgDisp: [Header: name=MsgDisp, type=REQ, id=1246949535122,
rsp_expected=true], dest_mbrs=[localhost.localdomain-44649,
localhost.localdomain-15543], NAKACK: [MSG, seqno=1], TCP:
[channel_name=Infinispan-Cluster]

44649 deals with it and replies:
infinispan.log:2749:2009-07-07 08:52:15,124 5922  TRACE
[org.jgroups.blocks.RequestCorrelator]
(Incoming-2,Infinispan-Cluster,localhost.localdomain-44649:) calling
(org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher)
with request 1246949535122
infinispan.log:2761:2009-07-07 08:52:15,125 5923  TRACE
[org.jgroups.blocks.RequestCorrelator]
(Incoming-2,Infinispan-Cluster,localhost.localdomain-44649:) sending rsp
for 1246949535122 to localhost.localdomain-47089
infinispan.log:2765:2009-07-07 08:52:15,126 5924  TRACE
[org.jgroups.protocols.TCP]
(Incoming-2,Infinispan-Cluster,localhost.localdomain-44649:) sending msg
to localhost.localdomain-47089, src=localhost.localdomain-44649, headers
are MsgDisp: [Header: name=MsgDisp, type=RSP, id=1246949535122,
rsp_expected=false], UNICAST: [UNICAST: DATA, seqno=1,
conn_id=1246949535125, first], TCP: [channel_name=Infinispan-Cluster]

15543 deals with it too and replies:
infinispan.log:2788:2009-07-07 08:52:15,129 5927  TRACE
[org.jgroups.blocks.RequestCorrelator]
(Incoming-1,Infinispan-Cluster,localhost.localdomain-15543:) calling
(org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher)
with request 1246949535122
infinispan.log:2800:2009-07-07 08:52:15,129 5927  TRACE
[org.jgroups.blocks.RequestCorrelator]
(Incoming-1,Infinispan-Cluster,localhost.localdomain-15543:) sending rsp
for 1246949535122 to localhost.localdomain-47089
infinispan.log:2804:2009-07-07 08:52:15,130 5928  TRACE
[org.jgroups.protocols.TCP]
(Incoming-1,Infinispan-Cluster,localhost.localdomain-15543:) sending msg
to localhost.localdomain-47089, src=localhost.localdomain-15543, headers
are MsgDisp: [Header: name=MsgDisp, type=RSP, id=1246949535122,
rsp_expected=false], UNICAST: [UNICAST: DATA, seqno=1,
conn_id=1246949535130, first], TCP: [channel_name=Infinispan-Cluster]

47089 notes receiving a response (the entry quoted here shows
src=localhost.localdomain-15543):
infinispan.log:2806:2009-07-07 08:52:15,130 5928  TRACE
[org.jgroups.protocols.TCP]
(OOB-1,Infinispan-Cluster,localhost.localdomain-47089:) message is [dst:
localhost.localdomain-47089, src: localhost.localdomain-15543 (3
headers), size=14 bytes, flags=OOB], headers are MsgDisp: [Header:
name=MsgDisp, type=RSP, id=1246949535122, rsp_expected=false], UNICAST:
[UNICAST: DATA, seqno=1, conn_id=1246949535130, first], TCP:
[channel_name=Infinispan-Cluster]

But after 15 seconds, the response from 15543 is still marked as not received:
2009-07-07 08:52:30,132 20930 TRACE [org.jgroups.blocks.GroupRequest]
(main:) timed out waiting for responses
2009-07-07 08:52:30,133 20931 TRACE [org.jgroups.blocks.GroupRequest]
(main:) call did not execute correctly, request is [req_id=1246949535122
caller=localhost.localdomain-47089
entries:
localhost.localdomain-15543: sender=localhost.localdomain-15543,
retval=null, received=false, suspected=false
localhost.localdomain-44649: sender=localhost.localdomain-44649,
retval=SuccessfulResponse, received=true, suspected=false
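
To cross-check the attached log for this request, a small script along these lines could list which nodes sent a RSP and which RSPs show up as received at the TCP layer. This is only a rough sketch: the two regular expressions are guessed from the excerpts above, it assumes each log entry sits on a single line in the file (the excerpts here are wrapped by the mail client), and the function name `rsp_summary` is made up for illustration.

```python
import re

# Patterns inferred from the TRACE excerpts in this mail; the real log
# layout may differ, so treat both as assumptions.
SENT = re.compile(
    r"sending msg to (?P<dst>\S+), src=(?P<src>[^,\s]+),.*type=RSP, id=(?P<id>\d+)"
)
RECV = re.compile(
    r"message is \[dst: (?P<dst>[^,\s]+), src: (?P<src>\S+) .*type=RSP, id=(?P<id>\d+)"
)


def rsp_summary(lines, req_id):
    """Return (sent, received): sets of node names that logged sending a RSP
    for req_id, and whose RSP was logged as an incoming message."""
    sent, received = set(), set()
    for line in lines:
        m = SENT.search(line)
        if m and m.group("id") == req_id:
            sent.add(m.group("src"))
            continue
        m = RECV.search(line)
        if m and m.group("id") == req_id:
            received.add(m.group("src"))
    return sent, received


if __name__ == "__main__":
    # Hypothetical file name; point it at the unzipped attachment.
    with open("infinispan.log") as f:
        sent, received = rsp_summary(f, "1246949535122")
    print("sent RSP:     ", sorted(sent))
    print("received RSP: ", sorted(received))
    print("sent but never logged as received:", sorted(sent - received))
```

Comparing the last line of output with the `received=false` entries in the GroupRequest dump should show whether a response was lost before the TCP layer on 47089 or somewhere above it.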

Any idea what could be causing this? It happens randomly, not on every
run. Could it be related to the SingletonCacheStore issue we saw last
time around?

Cheers,
-- 
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache


-------------- next part --------------
A non-text attachment was scrubbed...
Name: infinispan.log.zip
Type: application/zip
Size: 75435 bytes
Desc: not available
Url : http://lists.jboss.org/pipermail/infinispan-dev/attachments/20090708/04c8e5e8/attachment-0001.zip 

