[infinispan-dev] The reason(s) why we have so many threads are spawned in the testsuite

Galder Zamarreno galder.zamarreno at redhat.com
Tue Oct 6 10:59:20 EDT 2009


This is a rather long but please bear with me:

So, I've been looking at the number of threads created in our testsuite 
because quite often I'm getting OOM "unable to create thread" and note 
that tests have evolved to be quite complex with several CacheManagers, 
Caches...etc.

In summary, I see plenty of maxThread=1 ThreadPoolExecutor (TPE) 
instances created that then are either useless (i.e. they get overriden 
after rewiring) or are there even though they're not gonna be used in 
the test (i.e. async listener notifier). Side note: Note that maxThread 
is used for the core pool size which means that this means that there's 
1 thread available.

For the useless TPEs, we need to avoid creating them in the 1st place. 
For the useful but not used in test, we should have an option that 
allows for such executors to be created lazily in a testsuite env. I see 
the point of creating these in advance in a prod or user env but when 
running the testsuite, it'd be better and would safe resources, 
specially wrt to threads if we created them lazily.

Examples:

For each CacheManager created, 2 x TPEs for ASYNC_NOTIFICATION_EXECUTOR 
are created. The second one during the re-wiring process. The result is 
that the 1st one is never used any more and it's left sitting there. Why 
is a 2nd one created? Something to do with non-volatility? Indeed it is. 
The j.u.e.ExecutorService instance is a j.u.c.TPE which has no 
NonVolatile annotation and hence it gets recreated. Note that this 
happens for all TPE usages (ASYNC_NOTIFICATION_EXECUTOR, 
ASYNC_TRANSPORT_EXECUTOR, EVICTION_SCHEDULED_EXECUTOR, 
ASYNC_REPLICATION_QUEUE_EXECUTOR)! For each one created, there's another 
one left there taking up resources!! So, what are the options? Two:

- A question that seems to come up more and more, get rid of the 
Volatility/NonVolatility difference and treat everything as NonVolatile? 
Apart from simplyfying the code, it'd make easier to debug cos figuring 
out these type of issues are a royal PITA. I remember hearing Manik that 
since create and start phases were gone, this should be no problem.

- Or provide our own ExecutorService and TPE facades that allows use to 
annotate TPE as @NonVolatile.

For each Cache created, the same thing, 2 x TPE for 
ASYNC_NOTIFICATION_EXECUTOR are created.

The lazyness I was refering about earlier comes into play here in the 
sense that, why not start the TPE the 1st time we register an async 
Listener? If we did so, we'd be saving resources if no async listeners 
are enabled (This looks to me the most common case right now). I can see 
the point of having this ready for normal use but certainly for 
testsuite use, this is waste when we're creating so many CacheManagers, 
Caches instances and have several threads running different tests at the 
same time.

Would it make sense to apply such lazyness to ASYNC_TRANSPORT_EXECUTOR? 
Hmmmm, even in a configured SYNC configuration, there might be stuff we 
wanna send asynchronously, i.e. PFERs, so don't think so.

With regards to EVICTION_SCHEDULED_EXECUTOR, this is a cache component. 
Currently, a TPE is created regardless of whether eviction is enabled or 
not and eviction is not a dynamic property. Now, if eviction is set to 
NONE, neither eviction nor expiration happens, correct? In that case, I 
don't see why TPE should be maintained if eviction is disabled.

To finish with the lazy TPE startup topic, with 
ASYNC_REPLICATION_QUEUE_EXECUTOR, we're certainly doing a good job. In 
ReplicationQueueFactory, we only construct a ReplicationQueue and 
corresponding TPE if replication queue is enabled

More comments now. Since there are 4 different usages of 
ThreadPoolExecutor each should have, *by default*, a different thread 
name prefix. Currently, there's no such thing which means that figuring 
out what a particular thread belongs to is difficult unless 
threadNamePrefix are set in each set of properties.

Another thing that is very confusing is the 
DefaultExecutorFactory.counter. What is it trying to represent? For 
example, in the testsuite thread dumps, I was seeing thread names like: 
asyncThread-150 and this is very confusing when taking in account that 
maxThreads in each ThreadPoolExecutor in the testsuite is 1.

Finally, KnownComponentNames should probably be an enum, shouldn't it?

If you made it this far, thx!!
-- 
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



More information about the infinispan-dev mailing list