[infinispan-dev] The reason(s) why we have so many threads are spawned in the testsuite

Manik Surtani manik at jboss.org
Thu Oct 8 07:20:38 EDT 2009


On 6 Oct 2009, at 15:59, Galder Zamarreno wrote:

> This is a rather long but please bear with me:
>
> So, I've been looking at the number of threads created in our  
> testsuite
> because quite often I'm getting OOM "unable to create thread" and note
> that tests have evolved to be quite complex with several  
> CacheManagers,
> Caches...etc.
>
> In summary, I see plenty of maxThread=1 ThreadPoolExecutor (TPE)
> instances created that then are either useless (i.e. they get  
> overriden
> after rewiring) or are there even though they're not gonna be used in
> the test (i.e. async listener notifier). Side note: Note that  
> maxThread
> is used for the core pool size which means that this means that  
> there's
> 1 thread available.
>
> For the useless TPEs, we need to avoid creating them in the 1st place.

True.  We could do this lazily via a layer of indirection.

> For the useful but not used in test, we should have an option that
> allows for such executors to be created lazily in a testsuite env. I  
> see
> the point of creating these in advance in a prod or user env but when
> running the testsuite, it'd be better and would safe resources,
> specially wrt to threads if we created them lazily.

+1.  I'll have a think about this and re-post later today.

> Examples:
>
> For each CacheManager created, 2 x TPEs for  
> ASYNC_NOTIFICATION_EXECUTOR
> are created. The second one during the re-wiring process. The result  
> is
> that the 1st one is never used any more and it's left sitting there.  
> Why
> is a 2nd one created? Something to do with non-volatility? Indeed it  
> is.
> The j.u.e.ExecutorService instance is a j.u.c.TPE which has no
> NonVolatile annotation and hence it gets recreated. Note that this
> happens for all TPE usages (ASYNC_NOTIFICATION_EXECUTOR,
> ASYNC_TRANSPORT_EXECUTOR, EVICTION_SCHEDULED_EXECUTOR,
> ASYNC_REPLICATION_QUEUE_EXECUTOR)! For each one created, there's  
> another
> one left there taking up resources!! So, what are the options? Two:
>
> - A question that seems to come up more and more, get rid of the
> Volatility/NonVolatility difference and treat everything as  
> NonVolatile?
> Apart from simplyfying the code, it'd make easier to debug cos  
> figuring
> out these type of issues are a royal PITA. I remember hearing Manik  
> that
> since create and start phases were gone, this should be no problem.

Again, I will revisit this today - about time.  :-)

> - Or provide our own ExecutorService and TPE facades that allows use  
> to
> annotate TPE as @NonVolatile.
>
> For each Cache created, the same thing, 2 x TPE for
> ASYNC_NOTIFICATION_EXECUTOR are created.
>
> The lazyness I was refering about earlier comes into play here in the
> sense that, why not start the TPE the 1st time we register an async
> Listener? If we did so, we'd be saving resources if no async listeners
> are enabled (This looks to me the most common case right now). I can  
> see
> the point of having this ready for normal use but certainly for
> testsuite use, this is waste when we're creating so many  
> CacheManagers,
> Caches instances and have several threads running different tests at  
> the
> same time.
>
> Would it make sense to apply such lazyness to  
> ASYNC_TRANSPORT_EXECUTOR?
> Hmmmm, even in a configured SYNC configuration, there might be stuff  
> we
> wanna send asynchronously, i.e. PFERs, so don't think so.
>
> With regards to EVICTION_SCHEDULED_EXECUTOR, this is a cache  
> component.
> Currently, a TPE is created regardless of whether eviction is  
> enabled or
> not and eviction is not a dynamic property. Now, if eviction is set to
> NONE, neither eviction nor expiration happens, correct? In that  
> case, I
> don't see why TPE should be maintained if eviction is disabled.
>
> To finish with the lazy TPE startup topic, with
> ASYNC_REPLICATION_QUEUE_EXECUTOR, we're certainly doing a good job. In
> ReplicationQueueFactory, we only construct a ReplicationQueue and
> corresponding TPE if replication queue is enabled
>
> More comments now. Since there are 4 different usages of
> ThreadPoolExecutor each should have, *by default*, a different thread
> name prefix. Currently, there's no such thing which means that  
> figuring
> out what a particular thread belongs to is difficult unless
> threadNamePrefix are set in each set of properties.

Yes, there should be a way of handling this - perhaps related to the  
component name if it isn't overridden with a property.  I'll have a  
look at this too.

> Another thing that is very confusing is the
> DefaultExecutorFactory.counter. What is it trying to represent?

Restarts of the cache in the same VM.  It would look confusing now,  
but with proper naming (as suggested above) it should be clearer.

> For
> example, in the testsuite thread dumps, I was seeing thread names  
> like:
> asyncThread-150 and this is very confusing when taking in account that
> maxThreads in each ThreadPoolExecutor in the testsuite is 1.
>
> Finally, KnownComponentNames should probably be an enum, shouldn't it?

+1.

> If you made it this far, thx!!

Thanks for the detailed analysis - useful stuff.  :)

Cheers
--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org







More information about the infinispan-dev mailing list