[jboss-dev-forums] [JBoss Microcontainer Development POJO Server] - Re: VFSClassLoaderPolicy backed jboss-cl performance issue
mstruk
do-not-reply at jboss.com
Tue Aug 25 07:01:27 EDT 2009
"jaikiran" wrote : While doing this, internally the each of the file handlers are invoked which ultimately lead to AbstractVirtualFileHandler.structureFindChild(String classResourceName) first splits the resource path name into multiple tokens (through PathTokenizer). This in itself is expensive because of the various String operations involved:
|
Accross the whole VFS impl there is a lot of pathname parsing. One way to alleviate this would be to introduce a Path object that contains an already parsed path. This way all the methods taking a path in the form of a String could be rewritten to taking Path instead. Parsing would then happen only once for each path.
"jaikiran" wrote :
| i) ZipEntryHandler.getChild -> First gets the correct ZipEntryContext and then calls the getChild on the context. Any operations on the ZipEntryContext go through a isModified check which involves a call to System.currentTimeInMillis to get the current time and then comparing it with the earlier time. Do we need to do this modification check?
|
I doubt you can improve performance simply by skipping a few tens of thousands of System.currentTimeMillis().
For example - the following code executes in 50-60ms on my laptop:
| import java.util.ArrayList;
|
| class Test {
|
| public static void main(String [] args) {
|
| long start = System.currentTimeMillis();
| ArrayList<Long> tstamps = new ArrayList<Long>(100000);
|
| for(int i=0; i<100000; i++) {
| long now = System.currentTimeMillis();
| tstamps.add(now);
| if (now - start > 1000)
| throw new RuntimeException("Your comp is slow, man!");
| }
|
| System.out.println("Time elapsed for " + tstamps.size() + " iterations: " + (tstamps.get(tstamps.size()-1) - tstamps.get(0)));
| }
| }
|
isModified() check simply ensures that when you say getChild(name) you get the datastructure reflecting the actual state of the underlying file (someone may have dropped a modified ear in jboss deploy dir). The really expensive stuff only happens if the underlying file actually is changed. In this case reinit of children is triggered.
Actually to make sure the datastructure reflects the current state of the file you would have to hit file's lastModified() method continuously which is pretty expensive. There's optimization in there that doesn't hit the file more often than once a second - see ZipWrapper . hasBeenModified(). In fact this should brobably be stated in the javadoc of this method which currently is not.
This optimization is purposefully done as close to the file communication layer as possible to avoid introducing unexpected behaviour (read: vfs taking unusually long to pick up changes in the file).
"jaikiran" wrote :
|
| ii) Within this flow, there's even an invocation to ZipEntryHandler.isLeaf(). Similar to the isModified call, this internally checks whether the zip context is closed. The isLeaf() implementation itself is a bit expensive since it's not a simple boolean check and instead does a .equals() on AbstractVirtualFileHandler:
|
| ZipEntryContext.isLeaf:
|
| | boolean isLeaf(ZipEntryHandler handler)
| | {
| | if (handler == null)
| | throw new IllegalArgumentException("Null handler");
| |
| | if (getRoot().equals(handler) == false)
| | checkIfModified();
| |
| | EntryInfo ei = entries.get(handler.getLocalPathName());
| | if (ei == null || ei.entry == null)
| | return false;
| |
| | return ei.entry.isDirectory() == false;
| | }
| |
|
You can't just do a boolean check here because ZipEntryContext represents a mounted filesystem. What you're saying here is: 'Hey, you zip filesystem, check this file for me - is it a file, or is it a dir'.
And filesystem goes: "Let's see what he passed to me. Ok .. it's not null. Is it a zip file - the one that I as a filesystem am taking care of? It's not. In that case before doing anything else - to reflect the current state of the file on the disk - let's check if maybe the file was modified and let's reinit ourselves before continuing to reflect the latest.
Now let's see if we have a file entry for the pathname of the passed handler. Ok, we have an entry - let's see is it a file?"
It looks kind of convoluted, but it's because it tries to be lazy on one hand, deferring the really slow stuff until really necessary, and tries to be current on the other, detecting file changes promptly.
I don't see how you could reimplement AbstractVirtualFileHandler.equals() to stay semantically correct and take zero cycles to compute :)
"jaikiran" wrote :
| So overall, in order to find whether a zip (jar) file contains a given class resource (remember we already know it contains the package, we are just checking whether it contains the specific class) we do a lot of expensive operations. I haven't completely looked at the ZipEntryHandler/Context, but would it be possible to have some sort of a mapping which has the entries contained in that jar? I see that we are doing something similar in the initEntries() but not sure whether it can be used.
|
| So if we wanted to check if a given resource is available in the ZipEntryContext, maybe we could just do a entries.containsKey(resourcePath)? Would it be possible to simplify the zip entry handling?
The code was originally written to support any usage scenario. That means that for some specific usage scenarios it could be optimized. If you know from the outside that the underlying file won't change, then all the code in vfszip that checks if the file was changed is redundant for your use-case. So you could for example introduce additional option UNCHANGABLE or something, and skip continous isModified checks when option is enabled.
Cheers,
- marko
View the original post : http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4251421#4251421
Reply to the post : http://www.jboss.org/index.html?module=bb&op=posting&mode=reply&p=4251421
More information about the jboss-dev-forums
mailing list