"jaikiran" wrote : While doing this, internally the each of the file handlers
are invoked which ultimately lead to AbstractVirtualFileHandler.structureFindChild(String
classResourceName) first splits the resource path name into multiple tokens (through
PathTokenizer). This in itself is expensive because of the various String operations
involved:
|
Accross the whole VFS impl there is a lot of pathname parsing. One way to alleviate this
would be to introduce a Path object that contains an already parsed path. This way all the
methods taking a path in the form of a String could be rewritten to taking Path instead.
Parsing would then happen only once for each path.
"jaikiran" wrote :
| i) ZipEntryHandler.getChild -> First gets the correct ZipEntryContext and then
calls the getChild on the context. Any operations on the ZipEntryContext go through a
isModified check which involves a call to System.currentTimeInMillis to get the current
time and then comparing it with the earlier time. Do we need to do this modification
check?
|
I doubt you can improve performance simply by skipping a few tens of thousands of
System.currentTimeMillis().
For example - the following code executes in 50-60ms on my laptop:
| import java.util.ArrayList;
|
| class Test {
|
| public static void main(String [] args) {
|
| long start = System.currentTimeMillis();
| ArrayList<Long> tstamps = new ArrayList<Long>(100000);
|
| for(int i=0; i<100000; i++) {
| long now = System.currentTimeMillis();
| tstamps.add(now);
| if (now - start > 1000)
| throw new RuntimeException("Your comp is slow, man!");
| }
|
| System.out.println("Time elapsed for " + tstamps.size() + "
iterations: " + (tstamps.get(tstamps.size()-1) - tstamps.get(0)));
| }
| }
|
isModified() check simply ensures that when you say getChild(name) you get the
datastructure reflecting the actual state of the underlying file (someone may have dropped
a modified ear in jboss deploy dir). The really expensive stuff only happens if the
underlying file actually is changed. In this case reinit of children is triggered.
Actually to make sure the datastructure reflects the current state of the file you would
have to hit file's lastModified() method continuously which is pretty expensive.
There's optimization in there that doesn't hit the file more often than once a
second - see ZipWrapper . hasBeenModified(). In fact this should brobably be stated in the
javadoc of this method which currently is not.
This optimization is purposefully done as close to the file communication layer as
possible to avoid introducing unexpected behaviour (read: vfs taking unusually long to
pick up changes in the file).
"jaikiran" wrote :
|
| ii) Within this flow, there's even an invocation to ZipEntryHandler.isLeaf().
Similar to the isModified call, this internally checks whether the zip context is closed.
The isLeaf() implementation itself is a bit expensive since it's not a simple boolean
check and instead does a .equals() on AbstractVirtualFileHandler:
|
| ZipEntryContext.isLeaf:
|
| | boolean isLeaf(ZipEntryHandler handler)
| | {
| | if (handler == null)
| | throw new IllegalArgumentException("Null handler");
| |
| | if (getRoot().equals(handler) == false)
| | checkIfModified();
| |
| | EntryInfo ei = entries.get(handler.getLocalPathName());
| | if (ei == null || ei.entry == null)
| | return false;
| |
| | return ei.entry.isDirectory() == false;
| | }
| |
|
You can't just do a boolean check here because ZipEntryContext represents a mounted
filesystem. What you're saying here is: 'Hey, you zip filesystem, check this file
for me - is it a file, or is it a dir'.
And filesystem goes: "Let's see what he passed to me. Ok .. it's not null. Is
it a zip file - the one that I as a filesystem am taking care of? It's not. In that
case before doing anything else - to reflect the current state of the file on the disk -
let's check if maybe the file was modified and let's reinit ourselves before
continuing to reflect the latest.
Now let's see if we have a file entry for the pathname of the passed handler. Ok, we
have an entry - let's see is it a file?"
It looks kind of convoluted, but it's because it tries to be lazy on one hand,
deferring the really slow stuff until really necessary, and tries to be current on the
other, detecting file changes promptly.
I don't see how you could reimplement AbstractVirtualFileHandler.equals() to stay
semantically correct and take zero cycles to compute :)
"jaikiran" wrote :
| So overall, in order to find whether a zip (jar) file contains a given class resource
(remember we already know it contains the package, we are just checking whether it
contains the specific class) we do a lot of expensive operations. I haven't completely
looked at the ZipEntryHandler/Context, but would it be possible to have some sort of a
mapping which has the entries contained in that jar? I see that we are doing something
similar in the initEntries() but not sure whether it can be used.
|
| So if we wanted to check if a given resource is available in the ZipEntryContext,
maybe we could just do a entries.containsKey(resourcePath)? Would it be possible to
simplify the zip entry handling?
The code was originally written to support any usage scenario. That means that for some
specific usage scenarios it could be optimized. If you know from the outside that the
underlying file won't change, then all the code in vfszip that checks if the file was
changed is redundant for your use-case. So you could for example introduce additional
option UNCHANGABLE or something, and skip continous isModified checks when option is
enabled.
Cheers,
- marko
View the original post :
http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4251421#...
Reply to the post :
http://www.jboss.org/index.html?module=bb&op=posting&mode=reply&a...