On 6 Dec 2009, at 13:39, Sanne Grinovero wrote:
Hello,
Looking into hudson's tests on
http://hudson.jboss.org/hudson/view/Infinispan/job/Infinispan-trunk-JDK6-tcp
it appears the number of failed tests changes from build to build, even
after documentation changes or other unrelated changes.
Two examples:
Build #1031 :
Changes: Typos in javadocs
Test Result (4 failures / -1)
So a javadoc change fixed a test?
Build #1033 :
Changes: [ISPN-301] (Closing the Lucene Directory will close the cache too)
Test Result (8 failures / +2)
So while I only changed something related to Lucene, the failures in
core increased?
In some of the test errors you can find evidence of communication
between test scenarios, like these:
http://hudson.jboss.org/hudson/view/Infinispan/job/Infinispan-trunk-JDK6-...
http://hudson.jboss.org/hudson/view/Infinispan/job/Infinispan-trunk-JDK6-...
first one from the Lucene module, second from the Tree module: they
both have the node vmg22.mw.lab.eng.bos.redhat.com-42912
and are complaining about an unexpected number of participants.
Looking into the other errors, it always looks as if "someone else"
changed the cache, but there's no evidence, so I think we should solve
the isolation problem first?
Both stacktraces show that they're using
org.infinispan.test.MultipleCacheManagersTest.createClusteredCaches(MultipleCacheManagersTest.java:137)
to set up the caches, so it doesn't appear to be a problem with these
two testcases.
This kind of interaction doesn't seem to happen inside a single module;
could it be a classloader problem? A static ThreadLocal defines the
JGroups port to use, but it's "static" in a per-module world, instead
of globally static?
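
To make the hypothesis above concrete, here is a minimal sketch of how
per-classloader "static" state could hand out the same port twice. It
does not reproduce the real JGroupsConfigBuilder code: the class names,
the start port of 7900 and the step of 100 are assumptions for
illustration, and each PerLoaderPorts instance merely stands in for the
copy of the static fields that one isolated classloader would hold.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PortCollisionSketch {

    /** Hypothetical stand-in for the static state held by ONE classloader. */
    static final class PerLoaderPorts {
        // Assumed start port and step; not taken from the real code.
        private final AtomicInteger next = new AtomicInteger(7900);
        // Each thread gets its own port, drawn from this loader's counter.
        private final ThreadLocal<Integer> threadPort =
                ThreadLocal.withInitial(() -> next.getAndAdd(100));

        int portForCurrentThread() {
            return threadPort.get();
        }
    }

    /** Asks for a port from a brand-new thread, as a test worker would. */
    static int portSeenByNewThread(PerLoaderPorts ports) {
        int[] result = new int[1];
        Thread worker = new Thread(() -> result[0] = ports.portForCurrentThread());
        worker.start();
        try {
            worker.join(); // join() makes the worker's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        PerLoaderPorts luceneModule = new PerLoaderPorts(); // "classloader" 1
        PerLoaderPorts treeModule = new PerLoaderPorts();   // "classloader" 2

        // Within one loader, different threads get different ports.
        System.out.println(portSeenByNewThread(luceneModule)); // 7900
        System.out.println(portSeenByNewThread(luceneModule)); // 8000
        // A second loader restarts its counter and reuses the first port.
        System.out.println(portSeenByNewThread(treeModule));   // 7900
    }
}
```

If the real allocator behaves like this, uniqueness only holds inside one
classloader, which would match the observation that the interference
shows up across modules but not inside a single one.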
I've added some logging to
org.infinispan.test.fwk.JGroupsConfigBuilder, 2 snippets of the
result:
[org.infinispan.test.fwk.JGroupsConfigBuilder] (pool-1-thread-1) TCP bind_port:7900 ClassLoder:org.apache.maven.surefire.booter.IsolatedClassLoader@2e93d13f
[...many lines..]
[org.infinispan.test.fwk.JGroupsConfigBuilder] (pool-1509-thread-10) TCP bind_port:7900 ClassLoder:org.apache.maven.surefire.booter.IsolatedClassLoader@2d1a2259
I see the two classloaders being different, and while the threads are
different they are sharing the same bind_port 7900.
Wow, great detective work tracking this down! Thanks for looking into this!
Interestingly though, isn't each module test suite run sequentially, one after
another? So does it matter that they use separate class loaders and hence separate
contents of static variables?
Looking into
http://maven.apache.org/plugins/maven-surefire-plugin/examples/class-load...
It looks like, from Surefire 2.4.3 onwards, the default is to use a shared
system classloader, but there's a small warning at the bottom of the page
saying that it is not possible to avoid isolating the classloader while
using forkMode=none.
Ideas?
P.S. Where can I get the sources of maven-surefire-plugin version 2.4.3-JBOSS?
https://svn.jboss.org/repos/maven/plugins/jboss/trunk/maven-jboss-surefire
or
http://anonsvn.jboss.org/repos/maven/plugins/jboss/trunk/maven-jboss-sure...
Cheers
Manik
--
Manik Surtani
manik(a)jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org