[
https://issues.redhat.com/browse/JGRP-2474?page=com.atlassian.jira.plugin...
]
Mirko Streckenbach commented on JGRP-2474:
------------------------------------------
I just mentioned that in the last paragraph (there is no separate issue). I was unsure whether
this is really an issue; it is just something I observed.
The sequence is as follows (using a slightly modified version of the attached program with UUID
addresses):
1. The first channel is created and becomes coordinator
2. The second channel is created and connect() is called on it
3. viewAccepted is called for both with a view containing both channels
4. After a few seconds, disconnect() is called on the second channel
5. viewAccepted is called for the first channel with a new view containing only the first
channel
6. disconnect() returns
After that, the coordinator still tries to send messages to the second channel. I have
several messages in the log of the first channel:
FINER: i1:1609 --> i2:1709: resending(#2)
Mai 07, 2020 6:02:21 PM org.jgroups.protocols.TP down
FINER: i1:1609: sending msg to i2:1709, src=i1:1609, headers are GMS:
GmsHeader[LEAVE_RSP], UNICAST3: DATA, seqno=2, TP: [cluster=cluster]
Mai 07, 2020 6:02:21 PM org.jgroups.protocols.UNICAST3 retransmit
This goes on for some time.
With 4.1.2 this did not happen. After the disconnect() of the second channel, no further
messages were sent. Starting with 4.1.3, these retransmits occur.
I've attached the modified program (JGR2.java) and the output (log2.txt).
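The symptom above (a message resent repeatedly because its ACK is never taken into account) can be illustrated with a toy model. This is only a sketch of the general retransmit-until-ACK idea, not UNICAST3's actual implementation; the class RetransmitModel, the method countSends and the tick-based loop are hypothetical names and simplifications chosen for illustration.

```java
final class RetransmitModel {
    /**
     * Counts how often a sender transmits a single message.
     * The sender (re)sends once per tick until it has processed an ACK
     * or maxTicks is reached. The receiver is assumed to ACK every copy;
     * ackProcessed says whether the sender actually handles that ACK.
     */
    static int countSends(boolean ackProcessed, int maxTicks) {
        int sends = 0;
        boolean acked = false;
        for (int tick = 0; tick < maxTicks && !acked; tick++) {
            sends++;              // initial send or retransmission
            acked = ackProcessed; // ACK arrives, but may be ignored by the sender
        }
        return sends;
    }

    public static void main(String[] args) {
        // ACK is processed: the message is sent once.
        System.out.println("ACK processed: " + countSends(true, 10) + " send(s)");
        // ACK is never processed: the sender keeps resending until it gives up.
        System.out.println("ACK ignored:   " + countSends(false, 10) + " send(s)");
    }
}
```

In this model the second case corresponds to the repeated "resending(#2)" lines in the log of the first channel: the only thing that stops the loop is the upper bound, not the ACK.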
Messages about dropped queued message when using IpAddressUUID
--------------------------------------------------------------
Key: JGRP-2474
URL: https://issues.redhat.com/browse/JGRP-2474
Project: JGroups
Issue Type: Bug
Affects Versions: 4.2.3
Reporter: Mirko Streckenbach
Assignee: Bela Ban
Priority: Major
Attachments: JGR.java, JGR2.java, log-fail-1.txt, log2.txt
We upgraded from 4.0.14 to 4.1.8, and ever since then we have been seeing messages like
{code}
Apr 27, 2020 10:30:54 AM org.jgroups.protocols.UNICAST3 addQueuedMessages
WARNING: i2:1709: dropped queued message i1:1609#2 as its conn_id (0) did not match
(entry.conn_id=1)
{code}
whenever an application is restarted. Our setup is as follows (mostly due to network
restrictions):
* Fixed port numbers
* JDBC_PING
* We use IpAddressUUID in order to have "readable" information in the
jgroupsping table
I could track this down to the step from 4.1.2 to 4.1.3: 4.1.2 works as expected, while from
4.1.3 on I see the effect described above.
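When comparing versions, it can help to scan the logs for the warning quoted above instead of reading them by eye. The following is a minimal sketch using only the Java standard library; the class DroppedMessageScanner and its field names are hypothetical, and the pattern is modeled solely on the single warning shown in this report, so other versions may word the message differently.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DroppedMessageScanner {
    /** Fields extracted from one "dropped queued message" warning. */
    static final class Dropped {
        final String receiver;   // member that logged the warning, e.g. i2:1709
        final String sender;     // member whose message was dropped, e.g. i1:1609
        final long seqno;        // sequence number after the '#'
        final int msgConnId;     // conn_id carried by the message
        final int entryConnId;   // conn_id of the receiver's entry

        Dropped(String receiver, String sender, long seqno, int msgConnId, int entryConnId) {
            this.receiver = receiver;
            this.sender = sender;
            this.seqno = seqno;
            this.msgConnId = msgConnId;
            this.entryConnId = entryConnId;
        }
    }

    // \s+ before "(entry.conn_id=...)" tolerates the line wrap seen in the pasted log.
    private static final Pattern WARNING = Pattern.compile(
        "WARNING: (\\S+): dropped queued message (\\S+)#(\\d+) as its conn_id "
        + "\\((\\d+)\\) did not match\\s+\\(entry\\.conn_id=(\\d+)\\)");

    /** Returns the first matching warning in the given text, if any. */
    static Optional<Dropped> parse(String text) {
        Matcher m = WARNING.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Dropped(
            m.group(1), m.group(2), Long.parseLong(m.group(3)),
            Integer.parseInt(m.group(4)), Integer.parseInt(m.group(5))));
    }

    public static void main(String[] args) {
        String sample = "WARNING: i2:1709: dropped queued message i1:1609#2 as its "
            + "conn_id (0) did not match\n(entry.conn_id=1)";
        parse(sample).ifPresent(d -> System.out.println(
            d.sender + " -> " + d.receiver + " seqno=" + d.seqno
            + " conn_id=" + d.msgConnId + " entry.conn_id=" + d.entryConnId));
    }
}
```

Running parse over each log record of a 4.1.2 run and a 4.1.3 run would show whether the warning appears at all and which conn_id values are involved.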
I attached a simple example that demonstrates the problem: it starts two stacks, shuts down
the second (non-coordinator) and starts it again after a couple of seconds. With 4.1.2
this works as expected (no warnings), but 4.1.3 and more recent versions (including 4.2.3)
produce warnings. The exact behavior is not completely consistent: in most cases, starting
the second app again results in some timeouts and the second app becomes a coordinator
itself and a merge view is established later (log attached). In some cases it only creates
the warnings shown above (this is what we observe in our real application) and in some
cases everything works fine.
I don't have any warnings in the log if I don't set an AddressGenerator, but
I'd like to avoid this.
While running this on higher debug levels, I observed the following: 4.1.2 will not
require an ACK for the LEAVE_RSP message. 4.1.3 will. The second app sends the ACK, but the
coordinator does not seem to receive or process it properly and retransmits the LEAVE_RSP
message again and again. This is independent of the AddressGenerator used,