<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">On 06/14/2013 11:25 AM, Tomaž Cerar
      wrote:<br>
    </div>
    <blockquote
cite="mid:CAMquZP4+khMF0r9wvi3o=EfE2ycwdowA7fhd90SBRXFijwH2-g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div>
            <div>
              <div>Carlo,<br>
                <br>
              </div>
              What are the most common problems your team has found when
              running on bare metal?<br>
              <br>
            </div>
            Anything standing out?<br>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    The one problem I see happening a lot is a channel being closed
    prematurely. It affects different tests in different ways.<br>
    It also lies at the very core of our functional offering, so if
    possible I would like to see that one tackled.<br>
    <br>
    Carlo<br>
    <br>
    <blockquote
cite="mid:CAMquZP4+khMF0r9wvi3o=EfE2ycwdowA7fhd90SBRXFijwH2-g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>
          <div><br>
          </div>
          From the analysis we have on TC (a combination of Linux &amp;
          Windows jobs), we can see something like this:<br>
          <br>
          Most commonly failing, more than 10% of the time<br>
          <br>
          org.jboss.as.test.integration.osgi.jta.TransactionTestCase.testUserTransaction
          (this one fails about 50% of the time)<br>
org.jboss.as.test.integration.osgi.deployment.BundleReplaceTestCase.testDirectBundleReplace<br>
          <br>
          A bit less common, &lt; 10% of the time (in no particular order)<br>
          <br>
org.jboss.as.test.integration.domain.suites.DeploymentManagementTestCase.testFullReplaceViaHash<br>
org.jboss.as.test.manualmode.messaging.HornetQBackupActivationTestCase.testActiveBackupReload<br>
org.jboss.as.test.manualmode.messaging.HornetQBackupActivationTestCase.testLiveReload<br>
          org.jboss.as.test.integration.osgi.jndi.JNDITestCase.testInitialContextFactoryBuilderService
              <br>
org.jboss.as.test.integration.osgi.jndi.JNDITestCase.testObjectFactoryOSGiService<br>
org.jboss.as.test.manualmode.ejb.shutdown.RemoteCallWhileShuttingDownTestCase.testServerShutdownRemoteCall<br>
org.jboss.as.test.clustering.cluster.ejb2.invalidation.CacheInvalidationTestCase(SYNC-tcp).testCacheInvalidation<br>
          <br>
        </div>
        plus a few more that fail less often.<br>
        <div><br>
          <div>From my point of view, the most annoying one is
            OSGi's testUserTransaction. Thomas, can you please take a
            look at it?<br>
            <br>
          </div>
          <div>But as can be seen from the list above, most of the
            intermittent test failures are in either manual mode tests or
            OSGi; everything else is in the minority, and we are working on
            fixing it.<br>
          </div>
          <div>There used to be lots of problems with clustering, but
            those were mostly fixed.<br>
          </div>
          <div><br>
            --<br>
          </div>
          <div>tomaz<br>
          </div>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Fri, Jun 14, 2013 at 11:05 AM, Carlo
          de Wolf <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:cdewolf@redhat.com" target="_blank">cdewolf@redhat.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">There are
            intermittent failures running on bare metal as well. I think<br>
            they are confined to conditions where the machine is
            stressed.<br>
            I'm especially interested in cases where the GC seems to
            come through at<br>
            an ill-timed moment, at which point it looks like
            connections that are<br>
            in use are actually closed asynchronously.<br>
            <span class="HOEnZb"><font color="#888888"><br>
                Carlo<br>
              </font></span>
            <div class="HOEnZb">
              <div class="h5"><br>
                On 06/12/2013 11:38 PM, Andrig Miller wrote:<br>
                &gt; I recently helped out on a customer case, which came
                from the Linux side of the house, which had to do with
                Java when virtualized.  The customer produced a test
                case that was emulating the application where the
                performance slowdown was occurring.  It turned out that
                they were using ReentrantReadWriteLock for some
                synchronization.  I did some thread dumps of the test
                case, and looked at the source too, and found out that
                ReentrantReadWriteLock was using the ...Unsafe.park()
                native method, which does a futex() system call.  That
                system call has terrible performance when virtualized,
                compared to using ReentrantLock, which has been changed
                with JDK 7 to no longer depend on the native code, and
                does everything in Java, in user space.  Performance was
                so much better, and with some virtualization
                configuration settings was almost equal to bare metal
                performance.<br>
                &gt;<br>
                &gt; That's a long story, just to say, if we have
                intermittent test suite failures only when run
                virtualized, perhaps we have some Java code that is
                using native code that calls futex() in the OS.  If so,
                it will perform very poorly compared to bare metal, and
                could be part of the problem with intermittent test
                failures, especially if there are any timing issues in
                the test cases.<br>
                &gt;<br>
                &gt; Just an FYI for something to look out for.<br>
                &gt;<br>
                &gt; Andy<br>
                &gt;<br>
                &gt; ----- Original Message -----<br>
                &gt;&gt; From: "Jason Greene" &lt;<a
                  moz-do-not-send="true"
                  href="mailto:jason.greene@redhat.com">jason.greene@redhat.com</a>&gt;<br>
                &gt;&gt; To: <a moz-do-not-send="true"
                  href="mailto:wildfly-dev@lists.jboss.org">wildfly-dev@lists.jboss.org</a><br>
                &gt;&gt; Sent: Wednesday, June 12, 2013 1:01:10 PM<br>
                &gt;&gt; Subject: [wildfly-dev] Call For Help
                (Testsuite)<br>
                &gt;&gt;<br>
                &gt;&gt; We still have a number of intermittent test
                failures that have been<br>
                &gt;&gt; around for over a year now. I'm asking for
                everyone's help in doing<br>
                &gt;&gt; what we can to make them stable. If you submit
                a PR, and you see<br>
                &gt;&gt; what looks like an intermittent failure, can
                you do some<br>
                &gt;&gt; investigation and report your findings even if
                it is not your area?<br>
                &gt;&gt; It would be awesome if you can report what you
                find to the mailing<br>
                &gt;&gt; list, and rope in help.<br>
                &gt;&gt;<br>
                &gt;&gt; Nearly all of these seem to only occur when
                virtualization is<br>
                &gt;&gt; involved, so if need be we can work out a plan
                to create either a<br>
                &gt;&gt; special run to capture diagnostic info, or I
                can give access to a<br>
                &gt;&gt; dedicated slave.<br>
                &gt;&gt;<br>
                &gt;&gt; If anyone has any further ideas on how to
                tackle these issues I am<br>
                &gt;&gt; all ears.<br>
                &gt;&gt;<br>
                &gt;&gt; --<br>
                &gt;&gt; Jason T. Greene<br>
                &gt;&gt; WildFly Lead / JBoss EAP Platform Architect<br>
                &gt;&gt; JBoss, a division of Red Hat<br>
                &gt;&gt;<br>
                &gt;&gt;<br>
                &gt;&gt; _______________________________________________<br>
                &gt;&gt; wildfly-dev mailing list<br>
                &gt;&gt; <a moz-do-not-send="true"
                  href="mailto:wildfly-dev@lists.jboss.org">wildfly-dev@lists.jboss.org</a><br>
                &gt;&gt; <a moz-do-not-send="true"
                  href="https://lists.jboss.org/mailman/listinfo/wildfly-dev"
                  target="_blank">https://lists.jboss.org/mailman/listinfo/wildfly-dev</a><br>
                &gt;&gt;<br>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
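    The thread-dump step Andy describes in the quoted message can be
    approximated in code. This is only a minimal sketch, not the
    customer's test case: the class name ParkedThreadFinder and both
    method names are hypothetical. It reports how many threads are
    currently blocked in LockSupport.park; it does not measure the cost
    of futex() and does not show any difference between lock
    implementations.<br>
    <pre>
```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper: counts threads whose stack shows a LockSupport.park* frame.
class ParkedThreadFinder {

    // Takes a JVM-wide thread dump and counts threads parked via LockSupport.
    static int countParkedThreads() {
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        int parked = 0;
        for (ThreadInfo info : infos) {
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getClassName().equals("java.util.concurrent.locks.LockSupport")) {
                    if (frame.getMethodName().startsWith("park")) {
                        parked++;
                        break; // count each thread once
                    }
                }
            }
        }
        return parked;
    }

    // Holds a write lock so that a second thread has to block on it, then
    // reports how many parked threads the dump shows at that moment.
    // Returns -1 if the calling thread is interrupted while waiting.
    static int parkedWhileContended() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.writeLock().lock();
        Thread waiter = new Thread(() -> {
            lock.writeLock().lock();
            lock.writeLock().unlock();
        }, "waiter");
        waiter.start();
        try {
            // Poll until the waiter has actually parked on the lock.
            while (waiter.getState() != Thread.State.WAITING) {
                Thread.sleep(10);
            }
            return countParkedThreads();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            lock.writeLock().unlock(); // lets the waiter finish
        }
    }

    public static void main(String[] args) {
        System.out.println("threads parked under contention: " + parkedWhileContended());
    }
}
```
    </pre>
    Calling something like countParkedThreads() at the point where a
    test times out might help show whether the run was stalled in park
    at that moment; wiring that into the testsuite is left out here.<br>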
  </body>
</html>