On 06.08.2014 at 19:21, Andrig Miller <> wrote:

----- Original Message -----
From: "Jason Greene" <>
To: "Andrig Miller" <>
Cc: "Bill Burke" <>,
Sent: Wednesday, August 6, 2014 11:08:02 AM
Subject: Re: [wildfly-dev] Pooling EJB Session Beans per default

On Aug 6, 2014, at 10:49 AM, Andrig Miller <>

----- Original Message -----
From: "Bill Burke" <>
Sent: Wednesday, August 6, 2014 9:30:06 AM
Subject: Re: [wildfly-dev] Pooling EJB Session Beans per default

On 8/6/2014 10:50 AM, Andrig Miller wrote:

----- Original Message -----
From: "Radoslaw Rodak" <>
Sent: Tuesday, August 5, 2014 6:51:03 PM
Subject: Re: [wildfly-dev] Pooling EJB Session Beans per default

On 06.08.2014 at 00:36, Bill Burke <> wrote:

On 8/5/2014 3:54 PM, Andrig Miller wrote:
It's a horrible theory. :)  How many EJB instances of a give
created per request?  Generally only 1.  1 instance of one
type!  My $5 bet is that if you went into EJB code and
how many object allocations were made per request, you'd lose
quickly.   Better yet, run a single remote EJB request
tool and let it count the number of allocations for you.  It
greater than 1.  :)
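One rough way to get such a number without a full profiler is the per-thread allocation counter that HotSpot exposes through com.sun.management.ThreadMXBean. The sketch below is only an illustration under that assumption: it reports bytes allocated by the calling thread rather than an object count, and AllocationProbe is a hypothetical helper name, not something from WildFly or the EJB container.

```java
import java.lang.management.ManagementFactory;

final class AllocationProbe {
    /**
     * Returns the bytes allocated by the current thread while running the
     * given work, or -1 if this JVM does not support or enable the counter.
     */
    static long allocatedBytes(Runnable work) {
        java.lang.management.ThreadMXBean plain = ManagementFactory.getThreadMXBean();
        if (!(plain instanceof com.sun.management.ThreadMXBean)) {
            return -1; // not a HotSpot-style bean (assumption: counter unavailable)
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) plain;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(id);
        work.run();
        return bean.getThreadAllocatedBytes(id) - before;
    }

    public static void main(String[] args) {
        byte[][] sink = new byte[1][]; // keeps the array reachable
        long bytes = allocatedBytes(() -> sink[0] = new byte[1 << 20]);
        System.out.println("allocated bytes: " + bytes);
    }
}
```

Wrapping a single local bean invocation in allocatedBytes(...) would only cover the calling thread. Anything a remote invocation allocates on other server threads is not seen by this probe.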

Maybe the StrictMaxPool has an effect on performance because
a global synchronization bottleneck.  Throughput is less and
having less concurrent per-request objects being allocated

The number per request, while relevant is only part of the
The number of concurrent requests happening in the server
dictates the object allocation rate.  Given enough
even a very small number of object allocations per request can
create an object allocation rate that can no longer be

I'm saying that the number of concurrent requests might not
object allocation rate.  There are probably a number of
happen after the EJB instance is obtained.  i.e. interception
contexts, etc.   If StrictMaxPool blocks until a new instance
available, then there would be less allocations per request as
threads would be serialized.

Scenario 1)
Let's say we have a pool of 100 Stateless EJBs and a constant Load
50 Requests per second  proceeded by 50 EJBs from the pool in
After 1000 seconds how many new EJB Instances will be created
a pool? answer 0 new EJBs  worst case 100 EJB’s in pool… of
object allocation is much higher as of course 1 EJB call leads
many objects from one EJB, but… let's see the situation without a pool.

50 requests/s * 1000 seconds = worst case 50,000 EJB instances
Java heap where 1 EJB might have many objects…   as long as
Collection was not triggered… which sounds to me like faster
JVM heap and more frequent GC, probably depending on the GC strategy.

Scenario 2)
Same as before, load is still 50 requests per second BUT EJB
call takes 10s.
after 10s we have 500 EJB instances without pool, after 11s  550
= 540 EJB instances, after 12s  580 EJBs … after some time very
perf…full GC …and maybe OutOfMemory..

So… the performance advantage could also turn into a disadvantage :-)
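The arithmetic behind the two scenarios can be sketched as follows. This is a simplification that assumes a steady arrival rate, one new instance per request when there is no pool, and no garbage collection during the window. The class and method names are made up for illustration.

```java
final class InstanceChurnEstimate {
    /** Without a pool, every request creates one new bean instance. */
    static long instancesCreatedWithoutPool(long requestsPerSecond, long seconds) {
        return requestsPerSecond * seconds;
    }

    /**
     * Little's Law: average number of requests in flight equals
     * arrival rate multiplied by the time each request spends in the system.
     */
    static long instancesInFlight(long requestsPerSecond, long callDurationSeconds) {
        return requestsPerSecond * callDurationSeconds;
    }

    public static void main(String[] args) {
        // Scenario 1: 50 requests/s for 1000 s
        System.out.println(instancesCreatedWithoutPool(50, 1000)); // 50000
        // Scenario 2: 50 requests/s, each call takes 10 s
        System.out.println(instancesInFlight(50, 10)); // 500
    }
}
```

In steady state the second number is the count of instances that are live at once. Instances that have finished their call only add to the heap until the next collection reclaims them.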

Whoever is investigating StrictMaxPool, or EJB pooling in
stop.  It's pointless.

Agree, pools are outdated…. but something like WorkManager for
max threads or, even better, never fewer than X idle threads
be useful :-)


The scenarios above are what is outdated.  Fifty requests per
second isn't any load at all!  We have 100's of thousands of
clients that we have to scale to, and lots more than 50 requests
per second.

What you mean to say is that you need to scale to 100's of
clients on meaningless no-op benchmarks. :)  I do know that the
SpecJ Java EE benchmarks artificially made EJB pooling important as
process intensive calculation results were cached in these
But real-world apps don't use this feature/anti-pattern.

I am not talking about a meaningless no-op benchmark, but a
benchmark that does lots of work.  We don't use meaningless no-op
benchmarks on the performance team, with some exception for
microbenchmarks that we have carefully crafted that model the
interactions for a specific component within the context of how it
is actually used for a real application.

Also however crappy it was, I did implement an EJB container at
in my career.  :)  I know for a fact that there are a number of
per-request internal support objects that need to be allocated.

* The argument array (for reflection)
* Each argument of the method call
* The response object
* Interceptor context object
* The interceptor context attribute map
* EJBContext
* Subject, Principal, role mappings
* Transaction context
* The message object(s) specific to the remote EJB protocol

Starts to add up huh?   I'm probably missing a bunch more.  Throw
interaction with JPA and you end up with even more per-request
being allocated.  You still believe pooling one EJB instance

See John O'Hara's post which shows our non-meaningless benchmark
and the difference that pooling makes vs. non-pooling.  It is a
dramatic difference to say the least.

There is certainly a correlation identified between the results of
this benchmark and the use of pooling. However the underlying cause
of the resulting difference is still unknown. If we knew
definitively how and why this happens it would help in optimizing
this further. As an example, if it turned out to be some secondary
factor, like the throttling aspect of the pool, then eliminating
these allocations (and others) with a zero-tuning approach, like
thread local pooling would offer little to no improvement. If we
discovered it is indeed extreme object allocation, and that it came
from thousands of nested calls in a request, then having a temporary
per-request thread local cache would dramatically improve the
results, and be cheap/quick to implement vs a full thread local
solution. If there is a bug in our code somewhere where under
certain situations we create hundreds of objects, when we should be
creating 10s, and the pool covers that up, fixing that bug and
removing the pool could lead to better results. If it turns out
there is only 3% extra churn but that extra churn causes a 10x perf
reduction in GC, then we better understand those limits and
potentially work with the openjdk team in that area.
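For readers unfamiliar with the idea, thread local pooling could look roughly like the sketch below. It is not WildFly code. ThreadLocalPoolSketch is a hypothetical name, and the design (one idle stack per thread, so nested calls on the same thread still get distinct instances) is just one plausible way to do it. There is no size limit and no cross-thread synchronization, which is why it needs no tuning but also provides no throttling.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

final class ThreadLocalPoolSketch<T> {
    private final Supplier<T> factory;
    // Each thread keeps its own stack of idle instances, so no locking is needed.
    private final ThreadLocal<ArrayDeque<T>> idle = ThreadLocal.withInitial(ArrayDeque::new);

    ThreadLocalPoolSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Reuses an idle instance owned by this thread, or creates a new one. */
    T borrow() {
        T instance = idle.get().pollFirst();
        return instance != null ? instance : factory.get();
    }

    /** Returns an instance to the calling thread's idle stack. */
    void release(T instance) {
        idle.get().addFirst(instance);
    }

    public static void main(String[] args) {
        ThreadLocalPoolSketch<StringBuilder> pool = new ThreadLocalPoolSketch<>(StringBuilder::new);
        StringBuilder first = pool.borrow();
        pool.release(first);
        System.out.println(pool.borrow() == first); // true: same thread reuses its instance
    }
}
```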

This conversation is a perfect example of misinformation that
causes us performance and scalability problems within our code

It’s just a surprising result. The pool saves a few allocations, but
it also has the cost of concurrency usage which can trigger
blocking, additional barriers, and busy looping on CAS. You also
still have object churn in the underlying pool data structures that
occurs per invocation since every invocation is a check-out and a
check-in (requires a new node object instance), and if the semaphore
blocks you have additional allocation for the entry in the wait
queue. You factor in the remaining allocation savings relative to
other allocations that are required for the invocation, and it
should be a very small percentage. For that very small percentage to
lead to several times a difference in performance to me hints at
other factors being involved.
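To make those costs concrete, here is a minimal sketch of a strictly bounded pool built from a Semaphore and a ConcurrentLinkedQueue. It is an assumption-laden illustration, not the actual StrictMaxPool implementation, and BoundedPoolSketch is a hypothetical name. Every check-in allocates a queue node, every check-out and check-in goes through CAS operations, and a caller that finds no permit is parked in the semaphore's wait queue.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class BoundedPoolSketch<T> {
    private final Semaphore permits;
    private final ConcurrentLinkedQueue<T> idle = new ConcurrentLinkedQueue<>();
    private final Supplier<T> factory;

    BoundedPoolSketch(int maxSize, Supplier<T> factory) {
        this.permits = new Semaphore(maxSize, true); // fair: waiters are served in order
        this.factory = factory;
    }

    /** Blocks up to the timeout for a permit, then reuses or creates an instance. */
    T checkOut(long timeout, TimeUnit unit) {
        try {
            // A blocked caller is enqueued in the semaphore's internal wait queue.
            if (!permits.tryAcquire(timeout, unit)) {
                throw new IllegalStateException("no instance available within timeout");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for an instance", e);
        }
        T instance = idle.poll();
        return instance != null ? instance : factory.get();
    }

    /** Returns the instance and frees its permit. */
    void checkIn(T instance) {
        idle.offer(instance); // allocates a new queue node on every call
        permits.release();
    }

    public static void main(String[] args) {
        BoundedPoolSketch<Object> pool = new BoundedPoolSketch<>(2, Object::new);
        Object bean = pool.checkOut(100, TimeUnit.MILLISECONDS);
        pool.checkIn(bean);
        System.out.println(pool.checkOut(100, TimeUnit.MILLISECONDS) == bean); // true
    }
}
```

The semaphore is also what gives such a pool its throttling effect: with maxSize permits, at most maxSize invocations run at once, regardless of how many request threads arrive.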

All logically thought through.  At a 15% lower transaction rate than we are doing now, we saw 4 gigabytes per second of object allocation.  We, with Sanne doing most of the work, managed to get that down to 3 gigabytes per second (I would have loved to get it to 2).  Much of that was Hibernate allocations, and of course that was with pooling on.  We have not spent the time to pinpoint the exact differences, memory and other, between having pooling on vs. off.  Our priority has been to continue scaling the workload and fix any problems we see as a result.  We have managed to increase the transaction rate another 15% in the last couple of months, but still have another 17+% to go on a single JVM before we start looking at two JVMs for the testing.

Once we get to our goal, I would love to put this on our list of tasks, so we can get the specific facts, and instead of talking theory, we will know exactly what can and cannot be done, and whether no pooling could ever match pooled.

I’m lucky my employer paid for this APM tool... it saved me many weeks of work!


Jason T. Greene
WildFly Lead / JBoss EAP Platform Architect
JBoss, a division of Red Hat

wildfly-dev mailing list