Definitely worth investigating. I’d like to have a really good understanding of why it has
the benefits it has, so we can see if this is the best way to get them or if something
else is better.
This kicks in just before the ModelController starts and begins parsing the config. The
config parsing quickly gets into parallel work; as soon as the extension elements are
reached, the extension modules are loaded concurrently. Then, once the parsing is done, each
subsystem is installed concurrently, so there are lots of threads doing concurrent classloading.
So why does adding two more make such a big difference?
Is it that they get lots of work done in that time when the regular boot thread is not
doing concurrent work, i.e. the parsing and the non-parallel bits of operation execution?
Is it that these threads are just chugging along doing classloading efficiently while the
parallel threads are running along inefficiently getting scheduled and unscheduled?
The latter doesn’t make sense to me as there’s no reason why these threads would be any
more efficient than the others.
- Brian
On May 14, 2017, at 6:36 PM, Stuart Douglas
<stuart.w.douglas(a)gmail.com> wrote:
When JIRA was being screwy on Friday I used the time to investigate an idea I have had
for a while about improving our boot time performance. According to YourKit, the majority
of our time is spent in class loading. It seems very unlikely that we will be able to
reduce the number of classes we load on boot (or at the very least it would be a massive
amount of work) so I investigated a different approach.
I modified ModuleClassLoader to spit out the name and module of every class that is
loaded at boot time, and stored this in a properties file. I then created a simple Service
that starts immediately and uses two threads to eagerly load every class on this list (I
used two threads because that seemed to work well on my laptop; I think
Runtime.availableProcessors()/4 is probably the best number, but that assumption would
need to be tested on different hardware).
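For anyone who wants the gist without digging through the branch, the shape of it is roughly
the following. This is a minimal, self-contained sketch rather than the actual code in the
branch; the file location and the plain Class.forName call are simplifying assumptions (the
real thing records the owning module and loads through that module's class loader):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Minimal sketch of an eager class pre-loader. Reads a list of class names
 * captured on a previous boot and loads them on background threads so they
 * are already defined by the time the boot threads ask for them.
 */
public class BootClassPreloader {

    // Hypothetical location of the captured list; the hack stores
    // module/class pairs in a properties file instead.
    private static final Path CLASS_LIST = Paths.get("standalone/tmp/boot-classes.txt");

    public static void preload(ClassLoader loader, int threads) throws IOException {
        List<String> classNames = Files.readAllLines(CLASS_LIST);
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (String name : classNames) {
            executor.submit(() -> {
                try {
                    // Load and initialize eagerly; a stale entry must never
                    // break boot, so CNFE and linkage problems are swallowed.
                    Class.forName(name, true, loader);
                } catch (ClassNotFoundException | LinkageError ignored) {
                }
            });
        }
        // Let the pool drain in the background; boot does not wait on it.
        executor.shutdown();
    }

    public static void main(String[] args) throws Exception {
        // Two threads worked well on one laptop; something derived from
        // Runtime.getRuntime().availableProcessors() may generalise better.
        preload(BootClassPreloader.class.getClassLoader(), 2);
    }
}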
The idea behind this is that we know the classes will be used at some point, and we
generally do not fully utilise all CPUs during boot, so we can use the unused CPU to
pre-load these classes so they are ready when they are actually required.
Using this approach I saw the boot time for standalone.xml drop from ~2.9s to ~2.3s on my
laptop. The (super hacky) code I used to perform this test is at
https://github.com/wildfly/wildfly-core/compare/master...stuartwdouglas:b...
I think these initial results are encouraging, and it is a big enough gain that I think
it is worth investigating further.
Firstly, it would be great if I could get others to try it out and see if they see similar
gains in boot time; it may be that the gain is very system dependent.
Secondly, if we do decide to do this, there are two approaches that I can see:
1) A hard-coded list of class names that we generate before a release (basically what the
hack already does). This is the simplest approach, but it does add a little bit of additional
work to the release process (although if it is missed it would be no big deal, as
ClassNotFoundExceptions would be suppressed, and if a few classes are missing the
performance impact is negligible as long as the majority of the list is correct).
2) Generate the list dynamically on first boot, and store it in the temp directory. This
would require the addition of a hook into JBoss Modules to generate the list, but it is the
approach I would prefer (as first boot is always a bit slower anyway).
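To make option 2 a little more concrete: the hook could be as small as a callback that JBoss
Modules fires when a ModuleClassLoader defines a class, with the server registering a
recorder that appends to a file in the temp dir on first boot. The sketch below is purely
illustrative of the shape of that change; the listener interface and its registration point
do not exist in JBoss Modules today:

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical callback JBoss Modules would invoke after a ModuleClassLoader
// defines a class; nothing like this exists in the library today.
interface ClassLoadListener {
    void classLoaded(String moduleName, String className);
}

// Server-side piece: on first boot, append "module=class" lines to a file in
// the temp dir. On later boots the file already exists, so it is handed to
// the pre-loader and no recorder is registered.
class FirstBootClassRecorder implements ClassLoadListener, AutoCloseable {

    private final BufferedWriter out;

    FirstBootClassRecorder(Path listFile) throws IOException {
        this.out = Files.newBufferedWriter(listFile,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    @Override
    public synchronized void classLoaded(String moduleName, String className) {
        try {
            out.write(moduleName + '=' + className);
            out.newLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}

The missing piece is the registration point in jboss-modules itself (presumably something
the ModuleLoader or ModuleClassLoader exposes), plus an existence check on the file so
recording only happens when the list is not already there.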
Thoughts?
Stuart
--
Brian Stansberry
Manager, Senior Principal Software Engineer
JBoss by Red Hat