Re: [wildfly-dev] Call For Help (Testsuite)

Friday, 14 June 2013

Carlo,

What are the most common problems your team found when running on bare
metal?

Anything standing out?

...
From the analysis we have on TC (a combination of Linux and Windows jobs),
we can see something like this:

Most commonly failing, more than 10% of the time

org.jboss.as.test.integration.osgi.jta.TransactionTestCase.testUserTransaction
(this one fails about 50% of the time)
org.jboss.as.test.integration.osgi.deployment.BundleReplaceTestCase.testDirectBundleReplace

A bit less common, < 10% of the time (in no particular order)

org.jboss.as.test.integration.domain.suites.DeploymentManagementTestCase.testFullReplaceViaHash
org.jboss.as.test.manualmode.messaging.HornetQBackupActivationTestCase.testActiveBackupReload
org.jboss.as.test.manualmode.messaging.HornetQBackupActivationTestCase.testLiveReload
org.jboss.as.test.integration.osgi.jndi.JNDITestCase.testInitialContextFactoryBuilderService
org.jboss.as.test.integration.osgi.jndi.JNDITestCase.testObjectFactoryOSGiService
org.jboss.as.test.manualmode.ejb.shutdown.RemoteCallWhileShuttingDownTestCase.testServerShutdownRemoteCall
org.jboss.as.test.clustering.cluster.ejb2.invalidation.CacheInvalidationTestCase(SYNC-tcp).testCacheInvalidation

plus a few more that fail less commonly.

...
From my point of view, the most annoying one is the OSGi
testUserTransaction. Thomas, can you please take a look at it?

But as the list above shows, most of the intermittent test failures are
either manual mode tests or OSGi; everything else is in the minority, and
we are working on fixing it.
There used to be lots of problems with clustering, but those were mostly fixed.

--
tomaz

On Fri, Jun 14, 2013 at 11:05 AM, Carlo de Wolf <cdewolf(a)redhat.com> wrote:

...
 There are intermittent failures running on bare metal as well. I think
 they are confined to conditions where the machine is stressed.
 I'm especially interested in cases where the GC seems to come through at
 an ill-timed moment, at which point it looks like connections that are
 in use are actually closed asynchronously.

 Carlo

 On 06/12/2013 11:38 PM, Andrig Miller wrote:
 > I recently helped out on a customer case, which came from the Linux side
 > of the house, which had to do with Java when virtualized.  The customer
 > produced a test case that was emulating the application where the
 > performance slowdown was occurring.  It turned out that they were using
 > ReentrantReadWriteLock for some synchronization.  I did some thread dumps
 > of the test case, and looked at the source too, and found out that
 > ReentrantReadWriteLock was using the ...Unsafe.park() native method, which
 > does a futex() system call.  That system call has terrible performance when
 > virtualized, compared to using ReentrantLock, which has been changed with
 > JDK 7 to no longer depend on the native code, and does everything in Java,
 > in user space.  Performance was so much better, and with some virtualization
 > configuration settings was almost equal to bare metal performance.
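For readers following along, here is a minimal sketch of the kind of lock swap described above. GuardedCounter and hammer are hypothetical names and are not taken from the customer's test case. The sketch only shows that code written against the java.util.concurrent.locks.Lock interface can move between the two lock types with a one-line change; it does not measure or demonstrate any performance difference.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockSwapSketch {
    /** Hypothetical shared state; the guarding Lock is injected so it can be swapped. */
    static final class GuardedCounter {
        private final Lock lock;
        private long value;

        GuardedCounter(Lock lock) {
            this.lock = lock;
        }

        void increment() {
            lock.lock();
            try {
                value++;
            } finally {
                lock.unlock();
            }
        }

        long get() {
            lock.lock();
            try {
                return value;
            } finally {
                lock.unlock();
            }
        }
    }

    /** Starts `threads` workers doing `perThread` increments each and returns the final count. */
    static long hammer(Lock lock, int threads, int perThread) throws InterruptedException {
        GuardedCounter counter = new GuardedCounter(lock);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Same workload, two lock implementations; only the constructor argument changes.
        long viaWriteLock = hammer(new ReentrantReadWriteLock().writeLock(), 4, 10_000);
        long viaPlainLock = hammer(new ReentrantLock(), 4, 10_000);
        System.out.println(viaWriteLock + " " + viaPlainLock);
    }
}
```

Timing the two hammer calls on a virtualized host and on bare metal would be one way to check whether the lock type matters in a given environment.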
 >
 > That's a long story, just to say, if we have intermittent test suite
 > failures only when run virtualized, perhaps we have some Java code that is
 > using native code that calls futex() in the OS.  If so, it will perform
 > very poorly compared to bare metal, and could be part of the problem with
 > intermittent test failures, especially if there are any timing issues in
 > the test cases.
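One way to look for this from inside a JVM, sketched under assumptions: the class below takes an in-process thread dump through the standard ThreadMXBean API and lists threads whose top stack frame is Unsafe.park. ParkedThreadScan, parkedThreadNames, waitUntilParked and demo are hypothetical names. The match on a class name ending in ".Unsafe" is an assumption meant to cover both sun.misc.Unsafe and jdk.internal.misc.Unsafe. A parked thread is normal for an idle pool, so a hit is only a pointer to where to look, not evidence of a problem.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ParkedThreadScan {
    /** Returns the names of live threads whose top stack frame is Unsafe.park. */
    static List<String> parkedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            if (stack.length == 0) {
                continue;
            }
            StackTraceElement top = stack[0];
            // Assumption: the Unsafe class lives in sun.misc or jdk.internal.misc,
            // depending on the JDK version, so only the simple name is matched.
            if (top.getClassName().endsWith(".Unsafe") && "park".equals(top.getMethodName())) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    /** Polls the thread dump until a thread with the given name is parked or the timeout expires. */
    static boolean waitUntilParked(String threadName, long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() < deadline) {
            if (parkedThreadNames().contains(threadName)) {
                return true;
            }
            Thread.sleep(10);
        }
        return false;
    }

    /** Blocks a reader behind a held write lock and reports whether the scan sees it parked. */
    static boolean demo() throws InterruptedException {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        rw.writeLock().lock(); // the calling thread holds the write lock
        Thread reader = new Thread(() -> {
            rw.readLock().lock(); // blocks until the write lock is released
            rw.readLock().unlock();
        }, "blocked-reader");
        reader.setDaemon(true);
        reader.start();
        try {
            return waitUntilParked("blocked-reader", 5_000);
        } finally {
            rw.writeLock().unlock();
            reader.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader seen parked: " + demo());
        System.out.println("currently parked: " + parkedThreadNames());
    }
}
```

Calling parkedThreadNames() from a test's failure handler, or running jstack against the server process, would give a similar picture at the moment an intermittent failure occurs.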
 >
 > Just an FYI for something to look out for.
 >
 > Andy
 >
 > ----- Original Message -----
 >> From: "Jason Greene" <jason.greene(a)redhat.com>
 >> To: wildfly-dev(a)lists.jboss.org
 >> Sent: Wednesday, June 12, 2013 1:01:10 PM
 >> Subject: [wildfly-dev] Call For Help (Testsuite)
 >>
 >> We still have a number of intermittent test failures that have been
 >> around for over a year now. I'm asking for everyone's help in doing
 >> what we can to make them stable. If you submit a PR, and you see
 >> what looks like an intermittent failure, can you do some
 >> investigation and report your findings even if it is not your area?
 >> It would be awesome if you can report what you find to the mailing
 >> list, and rope in help.
 >>
 >> Nearly all of these seem to only occur when virtualization is
 >> involved, so if need be we can work out a plan to create either a
 >> special run to capture diagnostic info, or I can give access to a
 >> dedicated slave.
 >>
 >> If anyone has any further ideas on how to tackle these issues I am
 >> all ears.
 >>
 >> --
 >> Jason T. Greene
 >> WildFly Lead / JBoss EAP Platform Architect
 >> JBoss, a division of Red Hat
 >>
 >>
 >> _______________________________________________
 >> wildfly-dev mailing list
 >> wildfly-dev(a)lists.jboss.org
 >> https://lists.jboss.org/mailman/listinfo/wildfly-dev
 >>
