[jboss-jira] [JBoss JIRA] Commented: (JGRP-497) Message bundling seems to add latency well beyond max_bundle_timeout

Bela Ban (JIRA) jira-events at lists.jboss.org
Thu May 3 15:27:41 EDT 2007


    [ http://jira.jboss.com/jira/browse/JGRP-497?page=comments#action_12361367 ] 
            
Bela Ban commented on JGRP-497:
-------------------------------

The unit test for this issue is MessageBundlingTest.

> Message bundling seems to add latency well beyond max_bundle_timeout
> --------------------------------------------------------------------
>
>                 Key: JGRP-497
>                 URL: http://jira.jboss.com/jira/browse/JGRP-497
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.4.1 SP3
>            Reporter: Brian Stansberry
>         Assigned To: Bela Ban
>             Fix For: 2.5
>
>
> Short synopsis: with bundling enabled and max_bundle_timeout=30 ms, I'm sometimes seeing a 700 ms delay before the receiver gets a message, leading to transient AS testsuite failures.  Disabling bundling makes the transient failures go away.
> Long discussion:
> The JBoss AS testsuite has been seeing intermittent failures of the asynchronous web session replication tests, particularly the FIELD granularity tests. Basically, a test modifies a session on one node, waits 500 ms, then fails over to the other node, expecting consistent state. The test fails if the session state is not as expected.
> Whenever I investigate the intermittent failure, it's always a case of the asynchronous replication message arriving after the failover request. TRACE logging of JBoss Cache sometimes shows a 700 ms delay between the sender cache sending the replication and the receiver receiving it.  That's just too long!
> Causes I could think of:
> 1) Some up_thread/down_thread set to true, leaving a message sitting in a queue for a while until the OS schedules the thread. We used to see this problem.  Nope -- all threads are set to false.
> 2) Bad luck; full gc happens at the wrong time.  Possible but IMO unlikely; the failures occur too often and it's not like these tests are generating a ton of garbage that's forcing a lot of full gc runs.
> 3) System is under some other load during the relevant period. Unlikely.  The client is sleeping and the servers have nothing else going on.
> 4) Message bundling. It's turned on, but max_bundle_timeout is 30 ms, so the latency it adds to an async RPC should be minimal.  But, I just disabled bundling and have now run the async FIELD tests about 10 times with no failures.  With it enabled I'd get a failure in some test on average nearly once per run.
> Perhaps there is something that's preventing the Bundler task from executing on the expected schedule?
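
One way to check the hypothesis in the last paragraph is to time how long each message sits in a bundler before it is flushed, and compare the worst case against the configured timeout. The following is only a minimal sketch under stated assumptions: it does not use JGroups, it is not the MessageBundlingTest mentioned above, and TimerBundler, send, drain and measure are hypothetical names made up for illustration. It assumes a bundler that starts a flush timer when the first message is queued and sends everything queued when the timer fires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BundlingLatencySketch {

    /** Hypothetical stand-in for a bundler: queues messages, flushes them on a timer. */
    static final class TimerBundler {
        private final List<Long> pending = new ArrayList<>();      // enqueue timestamps (ns)
        private final List<Long> latenciesMs = new ArrayList<>();  // time spent queued (ms)
        private final ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        private final long timeoutMs;
        private boolean flushScheduled;

        TimerBundler(long timeoutMs) {
            this.timeoutMs = timeoutMs;
        }

        /** Queues one message; the first message of a batch starts the flush timer. */
        synchronized void send() {
            pending.add(System.nanoTime());
            if (!flushScheduled) {
                flushScheduled = true;
                timer.schedule(this::flush, timeoutMs, TimeUnit.MILLISECONDS);
            }
        }

        /** "Sends" the batch: records how long each queued message waited. */
        private synchronized void flush() {
            long now = System.nanoTime();
            for (long queuedAt : pending) {
                latenciesMs.add(TimeUnit.NANOSECONDS.toMillis(now - queuedAt));
            }
            pending.clear();
            flushScheduled = false;
        }

        /** Lets an already scheduled flush run, then returns all recorded latencies. */
        List<Long> drain() throws InterruptedException {
            timer.shutdown(); // by default, delayed tasks already scheduled still execute
            timer.awaitTermination(5, TimeUnit.SECONDS);
            synchronized (this) {
                return new ArrayList<>(latenciesMs);
            }
        }
    }

    /** Sends 'messages' messages 'gapMs' apart and returns each one's queueing delay. */
    static List<Long> measure(long timeoutMs, int messages, long gapMs)
            throws InterruptedException {
        TimerBundler bundler = new TimerBundler(timeoutMs);
        for (int i = 0; i < messages; i++) {
            bundler.send();
            Thread.sleep(gapMs);
        }
        return bundler.drain();
    }

    public static void main(String[] args) throws InterruptedException {
        long timeoutMs = 30; // same value as max_bundle_timeout in the report
        List<Long> latencies = measure(timeoutMs, 20, 7);
        System.out.println("messages=" + latencies.size()
                + " max queueing delay (ms)=" + Collections.max(latencies)
                + " configured timeout (ms)=" + timeoutMs);
    }
}
```

In this model the queueing delay should stay close to the timeout, give or take scheduler jitter. If the same kind of measurement against a real transport showed delays far above the timeout, that would point at the flush task not running on schedule rather than at bundling as such.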

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


