[infinispan-dev] The reason(s) why we have so many threads are spawned in the testsuite

Vladimir Blagojevic vblagoje at redhat.com
Tue Oct 6 12:00:57 EDT 2009


Galder,

I agree. I am also experiencing similar problems and looking into thread 
dumps does not help a lot. We have gone overboard with TPE creation, 
should be created only if needed. TPE threads names should be composed 
as: node name + thread pool name + counter.

I can see what can be done about JGroups threads.

Cheers


On 09-10-06 10:59 AM, Galder Zamarreno wrote:
> This is a rather long but please bear with me:
>
> So, I've been looking at the number of threads created in our testsuite
> because quite often I'm getting OOM "unable to create thread" and note
> that tests have evolved to be quite complex with several CacheManagers,
> Caches...etc.
>
> In summary, I see plenty of maxThread=1 ThreadPoolExecutor (TPE)
> instances created that then are either useless (i.e. they get overriden
> after rewiring) or are there even though they're not gonna be used in
> the test (i.e. async listener notifier). Side note: Note that maxThread
> is used for the core pool size which means that this means that there's
> 1 thread available.
>
> For the useless TPEs, we need to avoid creating them in the 1st place.
> For the useful but not used in test, we should have an option that
> allows for such executors to be created lazily in a testsuite env. I see
> the point of creating these in advance in a prod or user env but when
> running the testsuite, it'd be better and would safe resources,
> specially wrt to threads if we created them lazily.
>
> Examples:
>
> For each CacheManager created, 2 x TPEs for ASYNC_NOTIFICATION_EXECUTOR
> are created. The second one during the re-wiring process. The result is
> that the 1st one is never used any more and it's left sitting there. Why
> is a 2nd one created? Something to do with non-volatility? Indeed it is.
> The j.u.e.ExecutorService instance is a j.u.c.TPE which has no
> NonVolatile annotation and hence it gets recreated. Note that this
> happens for all TPE usages (ASYNC_NOTIFICATION_EXECUTOR,
> ASYNC_TRANSPORT_EXECUTOR, EVICTION_SCHEDULED_EXECUTOR,
> ASYNC_REPLICATION_QUEUE_EXECUTOR)! For each one created, there's another
> one left there taking up resources!! So, what are the options? Two:
>
> - A question that seems to come up more and more, get rid of the
> Volatility/NonVolatility difference and treat everything as NonVolatile?
> Apart from simplyfying the code, it'd make easier to debug cos figuring
> out these type of issues are a royal PITA. I remember hearing Manik that
> since create and start phases were gone, this should be no problem.
>
> - Or provide our own ExecutorService and TPE facades that allows use to
> annotate TPE as @NonVolatile.
>
> For each Cache created, the same thing, 2 x TPE for
> ASYNC_NOTIFICATION_EXECUTOR are created.
>
> The lazyness I was refering about earlier comes into play here in the
> sense that, why not start the TPE the 1st time we register an async
> Listener? If we did so, we'd be saving resources if no async listeners
> are enabled (This looks to me the most common case right now). I can see
> the point of having this ready for normal use but certainly for
> testsuite use, this is waste when we're creating so many CacheManagers,
> Caches instances and have several threads running different tests at the
> same time.
>
> Would it make sense to apply such lazyness to ASYNC_TRANSPORT_EXECUTOR?
> Hmmmm, even in a configured SYNC configuration, there might be stuff we
> wanna send asynchronously, i.e. PFERs, so don't think so.
>
> With regards to EVICTION_SCHEDULED_EXECUTOR, this is a cache component.
> Currently, a TPE is created regardless of whether eviction is enabled or
> not and eviction is not a dynamic property. Now, if eviction is set to
> NONE, neither eviction nor expiration happens, correct? In that case, I
> don't see why TPE should be maintained if eviction is disabled.
>
> To finish with the lazy TPE startup topic, with
> ASYNC_REPLICATION_QUEUE_EXECUTOR, we're certainly doing a good job. In
> ReplicationQueueFactory, we only construct a ReplicationQueue and
> corresponding TPE if replication queue is enabled
>
> More comments now. Since there are 4 different usages of
> ThreadPoolExecutor each should have, *by default*, a different thread
> name prefix. Currently, there's no such thing which means that figuring
> out what a particular thread belongs to is difficult unless
> threadNamePrefix are set in each set of properties.
>
> Another thing that is very confusing is the
> DefaultExecutorFactory.counter. What is it trying to represent? For
> example, in the testsuite thread dumps, I was seeing thread names like:
> asyncThread-150 and this is very confusing when taking in account that
> maxThreads in each ThreadPoolExecutor in the testsuite is 1.
>
> Finally, KnownComponentNames should probably be an enum, shouldn't it?
>
> If you made it this far, thx!!
>    




More information about the infinispan-dev mailing list