Re: [infinispan-dev] Missing elements when using JProfiler...

Wednesday, 10 April 2013

A possible explanation for this:

Say we're not finding message 17 in [4 | 15 | 25]: 4 is the lowest 
message we *garbage collected*, 15 the highest we *delivered* and 25 the 
highest we *received* so far.

When different threads sent messages 15 - 18, they could have sent them 
in the order 15 -> 18 -> 16 -> 17 (messages are only ordered at the 
receiver).

If the retransmit task at the receiver kicked in *before* 17 was added 
to the sender's retransmission table, then message #17 would not be 
found in that table. A few microseconds later, #17 would be added, and 
a retransmission request would then succeed, although the receiver is 
unlikely to ask for retransmission of #17 again, as it has probably 
received the message by now.
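
The interleaving above can be replayed deterministically in a small 
sketch. This is not the UNICAST2 code: the class, its map and the 
send() / retransmit() methods are hypothetical stand-ins for the 
sender's retransmission table, assumed here only for illustration.

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class RetransmitRaceSketch {
    // Hypothetical stand-in for the sender's retransmission table: seqno -> payload
    private final ConcurrentSkipListMap<Long, String> table = new ConcurrentSkipListMap<>();

    // A sending thread adds its message to the table under its seqno
    public void send(long seqno) {
        table.put(seqno, "msg-" + seqno);
    }

    // A retransmit request succeeds only if the seqno is already in the table
    public boolean retransmit(long seqno) {
        return table.containsKey(seqno);
    }

    public static void main(String[] args) {
        RetransmitRaceSketch sender = new RetransmitRaceSketch();
        // Threads happen to add messages in the order 15 -> 18 -> 16 (17 is still pending)
        sender.send(15);
        sender.send(18);
        sender.send(16);
        // The receiver's retransmit task asks for #17 too early: "not found" warning
        System.out.println("#17 found before add: " + sender.retransmit(17));
        // A moment later the last thread adds #17; a second request would succeed
        sender.send(17);
        System.out.println("#17 found after add: " + sender.retransmit(17));
    }
}
```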

This is *not* incorrect, but I mitigated it in 3.3.x by dividing message 
gaps at the receiver into 2 groups, old and new. This is something like 
a generational garbage collector: the most recent missing messages are 
not requested the first time they are seen as missing, only once they 
have 'survived' one retransmission round. In other words, with 3.3.0.x, 
you should see far fewer of these warnings!
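
A rough sketch of the old/new idea, assuming the receiver simply 
remembers which seqnos were missing in the previous round. The class 
and method names are made up for illustration and this is not how 
3.3.x is actually implemented.

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class GenerationalGapsSketch {
    // Seqnos that were already missing in the previous retransmit round
    private SortedSet<Long> seenLastRound = new TreeSet<>();

    /**
     * Called once per retransmit round with the currently missing seqnos.
     * Returns only the "old" gaps: those that were also missing last round.
     * "New" gaps are remembered, but not requested yet.
     */
    public SortedSet<Long> nextRequests(Set<Long> missingNow) {
        SortedSet<Long> old = new TreeSet<>(missingNow);
        old.retainAll(seenLastRound);
        seenLastRound = new TreeSet<>(missingNow);
        return old;
    }

    public static void main(String[] args) {
        GenerationalGapsSketch gaps = new GenerationalGapsSketch();
        // Round 1: 16 and 17 are new gaps, so nothing is requested
        System.out.println(gaps.nextRequests(Set.of(16L, 17L)));
        // Round 2: 16 has arrived in the meantime, 17 survived one round and is requested
        System.out.println(gaps.nextRequests(Set.of(17L)));
    }
}
```

With this scheme a message that is merely late, like #17 above, gets 
one round of grace before the receiver asks the sender for it.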

On 4/9/13 5:00 PM, Alan Field wrote:
...
 Hey Bela,

 A couple of weeks ago, I was trying to run the client stress test comparison between JDG
and Coherence under JProfiler. These test runs were not successful, because one of the
nodes in the cluster would always crash. However, I was also seeing missing element log
messages from JGroups, like this:
(https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-61-radargun-jdg-...)

 16:48:09,363 WARN  [org.jgroups.protocols.UNICAST2] (OOB-9,edg-perf01-16155)
edg-perf01-16155: (requester=edg-perf02-56801) message edg-perf02-56801::8723243 not found
in retransmission table of edg-perf02-56801:
 [8722840 | 8722840 | 8723253] (411 elements, 2 missing)
 16:48:12,327 WARN  [org.jgroups.protocols.UNICAST2] (OOB-67,edg-perf01-16155)
edg-perf01-16155: (requester=edg-perf03-22539) message edg-perf03-22539::9484784 not found
in retransmission table of edg-perf03-22539:
 [9484613 | 9484613 | 9484807] (193 elements, 1 missing)
 16:48:12,794 WARN  [org.jgroups.protocols.UNICAST2] (OOB-53,edg-perf01-16155)
edg-perf01-16155: (requester=edg-perf03-22539) message edg-perf03-22539::9484784 not found
in retransmission table of edg-perf03-22539:
 [9484783 | 9484783 | 9484840] (56 elements, 1 missing)

 I don't know if these messages have any relation to running the jobs with JProfiler,
but I wanted to ask you about them. In this job configuration, 4 nodes are used in the
cluster, (edg-perf01 to edg-perf04) but JProfiler is only attached to the JVM on
edg-perf01.

 Thanks,
 Alan

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)
