[infinispan-dev] Missing elements when using JProfiler...

Bela Ban bban at redhat.com
Wed Apr 10 07:05:54 EDT 2013


A possible explanation for this:

Say we're not finding message 17 in [4 | 15 | 25]: 4 is the lowest 
message we *garbage collected*, 15 the highest we *delivered* and 25 the 
highest we *received* so far.
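
To make the three markers concrete, here is a minimal sketch that parses a
dump line like the ones in the warnings quoted below and classifies a
seqno against it. The names parse_dump and classify are made up for
illustration and are not part of JGroups. In the three quoted warnings,
(highest received - low) happens to equal elements + missing (for example
8723253 - 8722840 = 413 = 411 + 2); the sketch treats that as an
observation about these lines, not as a guaranteed invariant.

```python
import re

# Matches e.g. "[8722840 | 8722840 | 8723253] (411 elements, 2 missing)"
DUMP = re.compile(r"\[(\d+) \| (\d+) \| (\d+)\] \((\d+) elements, (\d+) missing\)")


def parse_dump(text):
    """Extract the three markers and the element counts from a dump line."""
    m = DUMP.search(text)
    if m is None:
        raise ValueError("not a retransmission table dump: %r" % text)
    low, hd, hr, elements, missing = map(int, m.groups())
    return {"low": low, "hd": hd, "hr": hr,
            "elements": elements, "missing": missing}


def classify(seqno, low, hd, hr):
    """Place a seqno relative to [low | highest delivered | highest received]."""
    if seqno <= low:
        return "garbage collected"
    if seqno <= hd:
        return "delivered"
    if seqno <= hr:
        return "received or missing"
    return "beyond highest received"


dump = parse_dump("[8722840 | 8722840 | 8723253] (411 elements, 2 missing)")
print(dump["hr"] - dump["low"], dump["elements"] + dump["missing"])
print(classify(17, low=4, hd=15, hr=25))
```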

When different threads sent messages 15 - 18, they could have sent them 
in the order 15 -> 18 -> 16 -> 17 (messages are only ordered at the 
receiver).

If the retransmit task at the receiver kicked in *before* 17 was added 
to the sender's retransmission table, then message #17 would not be 
found in that table. A few microseconds later, #17 would be added and a 
retransmission request would succeed, although the receiver is unlikely 
to ask for retransmission of #17 again, as it has probably received the 
message by then.
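
The interleaving described above can be replayed deterministically. This
is only a sketch under simplifying assumptions: SenderTable, find_gaps and
simulate are hypothetical names, there are no real threads or network, and
delivery is assumed to be instant and loss-free. It is not how JGroups
implements its tables.

```python
class SenderTable:
    """Hypothetical stand-in for the sender's retransmission table."""

    def __init__(self):
        self._msgs = {}

    def add(self, seqno, payload):
        self._msgs[seqno] = payload

    def lookup(self, seqno):
        # None models the "not found in retransmission table" case.
        return self._msgs.get(seqno)


def find_gaps(received):
    """Seqnos missing between the lowest and highest seqno received."""
    return [s for s in range(min(received), max(received)) if s not in received]


def simulate():
    table = SenderTable()
    received = set()
    results = []

    # Sender threads store and send in the order 15 -> 18 -> 16; 17 is late.
    for seqno in (15, 18, 16):
        table.add(seqno, "msg-%d" % seqno)
        received.add(seqno)  # assumption: instant, loss-free delivery

    # The receiver's retransmit task fires now and asks for the gap.
    for seqno in find_gaps(received):
        results.append((seqno, table.lookup(seqno) is not None))

    # A moment later the last sender thread stores and sends 17.
    table.add(17, "msg-17")
    received.add(17)
    results.append((17, table.lookup(17) is not None))

    return results, find_gaps(received)


results, remaining_gaps = simulate()
print(results)
print(remaining_gaps)
```

The first lookup of 17 fails (this is where the warning would be logged),
the second succeeds, and by then the receiver has no gap left to ask about.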

This is *not* incorrect behavior, but I mitigated it in 3.3.x by 
dividing message gaps at the receiver into 2 groups, old and new, 
somewhat like a generational garbage collector: the most recent missing 
messages are not requested the first time the retransmit task sees 
them, only once they have 'survived' one retransmission round. In other 
words, with 3.3.0.x, you should see far fewer of these warnings!
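
The old/new split can be sketched roughly as follows. This is an
illustration of the idea only, not the 3.3.x code: the class
GenerationalRetransmitter and its tick method are invented names, and the
real implementation may track gaps differently.

```python
class GenerationalRetransmitter:
    """Request only gaps that were already missing in the previous round."""

    def __init__(self):
        self._seen_last_round = set()

    def tick(self, missing_now):
        """Run one round of the retransmit task.

        missing_now: seqnos currently missing at the receiver.
        Returns the seqnos to request from the sender in this round.
        """
        missing_now = set(missing_now)
        # "Old" gaps survived one round, so they are worth requesting.
        to_request = missing_now & self._seen_last_round
        # Everything missing now becomes "old" for the next round.
        self._seen_last_round = missing_now
        return sorted(to_request)


task = GenerationalRetransmitter()
print(task.tick([16, 17]))  # both gaps are new, nothing is requested yet
print(task.tick([17, 20]))  # 16 arrived meanwhile; 17 is old, 20 is new
```

A gap caused by the short-lived reordering above usually closes within one
round and is never requested, so the sender never gets asked for a message
it has not stored yet.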




On 4/9/13 5:00 PM, Alan Field wrote:
> Hey Bela,
>
> A couple of weeks ago, I was trying to run the client stress test comparison between JDG and Coherence under JProfiler. These test runs were not successful, because one of the nodes in the cluster would always crash. However, I was also seeing missing element log messages from JGroups, like this: (https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-61-radargun-jdg-vs-coherence-client-stress-test/107/console-edg-perf01/)
>
> 16:48:09,363 WARN  [org.jgroups.protocols.UNICAST2] (OOB-9,edg-perf01-16155) edg-perf01-16155: (requester=edg-perf02-56801) message edg-perf02-56801::8723243 not found in retransmission table of edg-perf02-56801:
> [8722840 | 8722840 | 8723253] (411 elements, 2 missing)
> 16:48:12,327 WARN  [org.jgroups.protocols.UNICAST2] (OOB-67,edg-perf01-16155) edg-perf01-16155: (requester=edg-perf03-22539) message edg-perf03-22539::9484784 not found in retransmission table of edg-perf03-22539:
> [9484613 | 9484613 | 9484807] (193 elements, 1 missing)
> 16:48:12,794 WARN  [org.jgroups.protocols.UNICAST2] (OOB-53,edg-perf01-16155) edg-perf01-16155: (requester=edg-perf03-22539) message edg-perf03-22539::9484784 not found in retransmission table of edg-perf03-22539:
> [9484783 | 9484783 | 9484840] (56 elements, 1 missing)
>
> I don't know if these messages have any relation to running the jobs with JProfiler, but I wanted to ask you about them. In this job configuration, 4 nodes are used in the cluster, (edg-perf01 to edg-perf04) but JProfiler is only attached to the JVM on edg-perf01.
>
> Thanks,
> Alan
>

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)
