[wildfly-dev] Speeding up WildFly boot time

Thu May 18 09:50:24 EDT 2017

On 05/15/2017 05:21 PM, Stuart Douglas wrote:
> On Tue, May 16, 2017 at 1:34 AM, David M. Lloyd <david.lloyd at redhat.com> wrote:
>> Exploding the files out of the JarFile could expose this contention and
>> therefore might be useful as a test - but it would also skew the results
>> a little because you have no decompression overhead, and creating the
>> separate file streams hypothetically might be somewhat more (or less)
>> expensive.  I joked about resurrecting jzipfile (which I killed off
>> because it was something like 20% slower at decompressing entries than
>> Jar/ZipFile) but it might be worth considering having our own JAR
>> extractor at some point with a view towards concurrency gains.  If we go
>> this route, we could go even further and create an optimized module
>> format, which is an idea I think we've looked at a little bit in the
>> past; there are a few avenues of exploration here which could be
>> interesting.
> 
> This could be worth investigating.

Tomaž did a prototype of using the JDK JAR filesystem to back the 
resource loader if it is available; contention did go down but memory 
footprint went up, and overall the additional indexing and allocation 
ended up slowing down boot a little, unfortunately (though large numbers 
of deployments seemed to be faster).  Tomaž can elaborate on his 
findings if he wishes.
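
For anyone who wants to reproduce that kind of experiment, here is a
minimal sketch of reading a resource through the JDK zip filesystem.
It is not Tomaž's prototype; the class name ZipFsResourceLoader and
its methods are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical resource loader backed by the JDK zip/JAR filesystem. */
final class ZipFsResourceLoader implements AutoCloseable {
    private final FileSystem fs;

    ZipFsResourceLoader(Path jar) throws IOException {
        // The cast picks the (Path, ClassLoader) overload unambiguously
        // on newer JDKs that also have a (Path, Map) overload.
        fs = FileSystems.newFileSystem(jar, (ClassLoader) null);
    }

    /** Returns the bytes of the named entry, or null if it is absent. */
    byte[] load(String name) throws IOException {
        Path entry = fs.getPath("/" + name);
        return Files.exists(entry) ? Files.readAllBytes(entry) : null;
    }

    @Override
    public void close() throws IOException {
        fs.close();
    }
}
```

One filesystem instance is opened per JAR and kept for the lifetime of
the loader, which is where the extra indexing and memory cost mentioned
above would show up.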

I had a look at the JAR FS implementation (and its parent class, the ZIP 
FS implementation, which does most of the hard work), and there are a 
few things which add overhead and contention that we don't need, like 
read/write locks to manage access and modifications (we only ever 
read) and (synch-based) indexing structures that might be somewhat 
larger than necessary.  They use NIO channels to access the zip data, 
which is probably OK, but maybe mapped buffers could be better... or 
worse?  They use a synchronized list per JAR file to pool Inflaters; 
pooling is a hard thing to do right so maybe there isn't really any 
better option in this case.

But in any event, I think a custom extractor still might be a reasonable 
thing to experiment with.  We could resurrect jzipfile or try a 
different approach (maybe see how well mapped buffers work?).  Since 
we're read-only, any indexes we use can be immutable and thus 
unsynchronized, and maybe more compact as a result. We can use an 
unordered hash table because we generally don't care about file order 
the way that JarFile historically needs to, thus making indexing faster.  
We could save object allocation overhead by using a specialized 
object->int hash table that just records offsets into the index for each 
entry.
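
To make the index idea concrete, here is a rough sketch of an immutable
name->offset table using open addressing with linear probing.  The
class name EntryIndex, the 50% load factor, and the -1 "not found"
value are my assumptions, not an existing API.

```java
import java.util.Map;

/** Hypothetical immutable, unordered entry-name -> offset index. */
final class EntryIndex {
    private final String[] names;  // null marks an empty slot
    private final int[] offsets;   // offset of the entry's index record
    private final int mask;

    EntryIndex(Map<String, Integer> entries) {
        // Power-of-two capacity, at most half full, so every probe
        // sequence is guaranteed to reach an empty slot.
        int capacity = 2;
        while (capacity < entries.size() * 2) capacity <<= 1;
        names = new String[capacity];
        offsets = new int[capacity];
        mask = capacity - 1;
        for (Map.Entry<String, Integer> e : entries.entrySet()) {
            int slot = slotFor(e.getKey());
            while (names[slot] != null) slot = (slot + 1) & mask;
            names[slot] = e.getKey();
            offsets[slot] = e.getValue();
        }
    }

    private int slotFor(String name) {
        int h = name.hashCode();
        return (h ^ (h >>> 16)) & mask;
    }

    /** Returns the recorded offset, or -1 if the name is not present. */
    int offsetOf(String name) {
        for (int slot = slotFor(name); names[slot] != null; slot = (slot + 1) & mask) {
            if (names[slot].equals(name)) return offsets[slot];
        }
        return -1;
    }
}
```

Nothing is written after construction and all fields are final, so
lookups need no locking, and there is no per-entry node or boxed
Integer as there would be with a HashMap.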

If we try mapped buffers, we could share one buffer concurrently by 
using only methods that accept an offset, and track offsets 
independently.  This would let the OS page cache work for us, especially 
for heavily used JARs.  We would be limited to 2GB JAR files, since a 
single buffer is indexed by int, but I don't think that's likely to be a 
practical problem for us; if it ever is, we can create a specialized 
alternative implementation for huge JARs.
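
A minimal sketch of the shared-buffer idea, assuming one read-only
mapping per JAR.  The class name MappedJar and its methods are
hypothetical; only the absolute (offset-taking) ByteBuffer getters are
used, so the buffer's own position is never touched.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical read-only view of a JAR shared by all reader threads. */
final class MappedJar {
    private final ByteBuffer buf;

    MappedJar(Path jar) throws IOException {
        try (FileChannel ch = FileChannel.open(jar, StandardOpenOption.READ)) {
            // map() rejects sizes above Integer.MAX_VALUE (the 2GB limit).
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            mapped.order(ByteOrder.LITTLE_ENDIAN); // ZIP fields are little-endian
            buf = mapped;
        }
        // The mapping stays valid after the channel is closed.
    }

    int readInt(int offset) {
        return buf.getInt(offset);
    }

    int readUnsignedShort(int offset) {
        return buf.getShort(offset) & 0xFFFF;
    }

    byte[] readBytes(int offset, int length) {
        byte[] out = new byte[length];
        // Absolute single-byte gets; a bulk absolute get only exists on
        // newer JDKs, so this sketch avoids it.
        for (int i = 0; i < length; i++) out[i] = buf.get(offset + i);
        return out;
    }
}
```

Each caller keeps its own offsets, so no locking is needed on the read
path.  Whether this actually beats channel reads would have to be
measured.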

In Java 9, jimages become an option by way of jlink, which will also be 
worth experimenting with (as soon as we're booting on Java 9).

Brainstorm other ideas here!
-- 
- DML