On Dec 10, 2009, at 10:40 AM, Manik Surtani wrote:
On 9 Dec 2009, at 23:07, Sanne Grinovero wrote:
> Hi Manik,
> sorry for the late answer, this was stuck in my drafts folder for some days:
>
> You're right: as the modules are built and tested sequentially it
> shouldn't matter. Still, it looks like they interact and I have no
> idea how that's possible.
> Maybe the cause is simpler, like some test leaving behind a cache
> that was not properly terminated?
> It could be that the shutdown hook of CacheManager only runs
> at the end of the last test, as there's no real shutdown between
> modules, so a test that doesn't clean up properly
> could explain the higher-than-expected number of nodes. I will try
> debugging a bit more.
Great, thanks! Feel free to ping myself or mmarkus on IRC if you want to chat about this
some more/bounce ideas around.
I wasn't able to look at the logs as hudson is
"temporarily" unavailable.
I think I know what the problem might be, though: if a test fails without
cleaning up its resources properly (in this case, destroying the cache managers), then
the next test running on the same thread will see the already existing cluster, and the
blockUntil.. methods will fail.
Fixing the original test (i.e. making it destroy the CM correctly) should fix the
problem, but I think the test FWK should be more accurate in reporting the issue (I've
already seen this raised several times), so I've created a JIRA to fix it.
https://jira.jboss.org/jira/browse/ISPN-314
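To illustrate the kind of reporting I have in mind, here is a minimal sketch. It does not use the real Infinispan test classes: FakeCacheManager, Registry and the test names are hypothetical stand-ins, and the actual fix for the JIRA above may look quite different. The idea is only that the framework remembers which test created each manager, so that a leftover is stopped and blamed on the test that leaked it rather than surfacing as a failure in the next test.

```java
import java.util.ArrayList;
import java.util.List;

public class LeakCheckSketch {

    /** Hypothetical stand-in for a cache manager; NOT the Infinispan API. */
    public static final class FakeCacheManager {
        private final String owner;
        private boolean running = true;

        FakeCacheManager(String owner) { this.owner = owner; }

        public void stop() { running = false; }
        public boolean isRunning() { return running; }
        public String owner() { return owner; }
    }

    /** Hypothetical registry: records which test created each manager. */
    public static final class Registry {
        private final List<FakeCacheManager> created = new ArrayList<>();

        public FakeCacheManager create(String testName) {
            FakeCacheManager cm = new FakeCacheManager(testName);
            created.add(cm);
            return cm;
        }

        /** Stops every manager still running and returns the names of the tests that leaked one. */
        public List<String> stopLeftovers() {
            List<String> leaked = new ArrayList<>();
            for (FakeCacheManager cm : created) {
                if (cm.isRunning()) {
                    leaked.add(cm.owner());
                    cm.stop();
                }
            }
            created.clear();
            return leaked;
        }
    }

    /** Simulates one clean test and one test that fails before its cleanup code runs. */
    public static List<String> demo() {
        Registry registry = new Registry();

        // Well-behaved test: the manager is stopped in a finally block.
        FakeCacheManager ok = registry.create("cleanTest");
        try {
            // test body would go here
        } finally {
            ok.stop();
        }

        // Misbehaving test: it throws before stopping its manager.
        try {
            registry.create("failingTest");
            throw new IllegalStateException("assertion failed before cleanup");
        } catch (IllegalStateException expected) {
            // a real framework would record the test failure here
        }

        // The framework sweeps up and names the culprit.
        return registry.stopLeftovers();
    }

    public static void main(String[] args) {
        System.out.println("Tests that leaked a manager: " + demo());
    }
}
```

With a check like this running after every test, the report would point at the test that forgot to clean up, and the following test would start without a leftover cluster member.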
>
> Cheers,
> Sanne
>
>
> 2009/12/7 Manik Surtani <manik(a)jboss.org>:
>>
>> On 6 Dec 2009, at 13:39, Sanne Grinovero wrote:
>>
>>> Hello,
>>> Looking into hudson's tests on
>>> http://hudson.jboss.org/hudson/view/Infinispan/job/Infinispan-trunk-JDK6-tcp
>>> it appears the number of failed tests changes from build to build, even
>>> with documentation changes or other unrelated changes.
>>> Two examples:
>>>
>>> Build #1031 :
>>> Changes: Typos in javadocs
>>> Test Result (4 failures / -1)
>>> a javadoc change fixed a test?
>>>
>>> Build #1033 :
>>> Changes: [ISPN-301] (Closing the Lucene Directory will close the cache too)
>>> Test Result (8 failures / +2)
>>> So while I only changed something related to Lucene, the failures in
>>> core increased?
>>>
>>> In some of the test errors you can find evidence of communication
>>> between test scenarios, like these:
>>> http://hudson.jboss.org/hudson/view/Infinispan/job/Infinispan-trunk-JDK6-...
>>> http://hudson.jboss.org/hudson/view/Infinispan/job/Infinispan-trunk-JDK6-...
>>>
>>> first one from the Lucene module, second from the Tree module: they
>>> both have the node vmg22.mw.lab.eng.bos.redhat.com-42912
>>> and are complaining about an unexpected number of participants.
>>> Looking into the other errors, it always looks as if "someone else"
>>> changed the cache, but there's no evidence, so I think we should solve
>>> the isolation problem first?
>>>
>>> Both stacktraces show that they're using
>>> org.infinispan.test.MultipleCacheManagersTest.createClusteredCaches(MultipleCacheManagersTest.java:137)
>>> to set up the caches, so it doesn't appear to be a problem with these
>>> two testcases.
>>>
>>> These kinds of interactions don't seem to happen inside a single module;
>>> could it be a classloader problem? A static threadlocal defines the
>>> jgroups port to use, but maybe it's "static" only per module, instead
>>> of globally static?
>>> I've added some logging to
>>> org.infinispan.test.fwk.JGroupsConfigBuilder, 2 snippets of the
>>> result:
>>> [org.infinispan.test.fwk.JGroupsConfigBuilder] (pool-1-thread-1) TCP
>>> bind_port:7900
>>> ClassLoder:org.apache.maven.surefire.booter.IsolatedClassLoader@2e93d13f
>>> [...many lines..]
>>> [org.infinispan.test.fwk.JGroupsConfigBuilder] (pool-1509-thread-10)
>>> TCP bind_port:7900
>>> ClassLoder:org.apache.maven.surefire.booter.IsolatedClassLoader@2d1a2259
>>>
>>> I see the two classloaders being different, and while the threads are
>>> different they are sharing the same bind_port 7900.
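A small model of the effect described above, as a sketch only: it does not load anything through real classloaders and does not use JGroupsConfigBuilder. Each PortCounter object (a hypothetical name) stands for one copy of the "static" counter, on the assumption that every isolated classloader gets its own copy starting at 7900, the value seen in the log lines.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PortCounterSketch {

    /** Models one copy of the "static" port counter; one copy per classloader is assumed. */
    public static final class PortCounter {
        private final AtomicInteger next = new AtomicInteger(7900);

        public int nextPort() {
            return next.getAndIncrement();
        }
    }

    /**
     * Returns the first port handed out to each of two modules.
     * sharedCounter=true models one globally static counter,
     * sharedCounter=false models one counter per isolated classloader.
     */
    public static int[] firstPorts(boolean sharedCounter) {
        PortCounter moduleA = new PortCounter();
        PortCounter moduleB = sharedCounter ? moduleA : new PortCounter();
        return new int[] { moduleA.nextPort(), moduleB.nextPort() };
    }

    public static void main(String[] args) {
        int[] isolated = firstPorts(false);
        int[] shared = firstPorts(true);
        System.out.println("per-classloader counters: " + isolated[0] + ", " + isolated[1]);
        System.out.println("single shared counter:    " + shared[0] + ", " + shared[1]);
    }
}
```

In this model both modules start from the same port when the counters are separate, which on its own is harmless if the modules run one after another, and only becomes a problem if a node from the earlier module is still alive.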
>>
>> Wow, great detective work tracking this down! Thanks for looking into this!
>>
>> Interestingly though, isn't each module test suite run sequentially, one
>> after another? So does it matter that they use separate class loaders and
>> hence separate contents of static variables?
>>
>>> Looking into
>>> http://maven.apache.org/plugins/maven-surefire-plugin/examples/class-load...
>>> it looks like from Surefire 2.4.3 the default is to use a shared
>>> system classloader, but there's a little warning at the bottom of the page
>>> saying it is not possible to avoid isolating the classloader when using
>>> forkMode=none
>>>
>>> ideas?
>>>
>>> P.S. where can I get the sources of maven-surefire-plugin version 2.4.3-JBOSS?
>>
>> https://svn.jboss.org/repos/maven/plugins/jboss/trunk/maven-jboss-surefire
>>
>> or
>>
>> http://anonsvn.jboss.org/repos/maven/plugins/jboss/trunk/maven-jboss-sure...
>>
>> Cheers
>> Manik
>> --
>> Manik Surtani
>> manik(a)jboss.org
>> Lead, Infinispan
>> Lead, JBoss Cache
>> http://www.infinispan.org
>> http://www.jbosscache.org
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev(a)lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>