I looked at this in a bit more detail and i think this can be optimized, although i
don't have the exact changes that we need to do. Here's a more detailed version of
the flow:
1) The BaseClassLoader is invoked to load a class
2) The BaseClassloader delegates this to the appropriate BaseClassLoaderDomain
3) The domain has knowledge of the EXPORTs and also the package capabilities (i.e. which
package is available in which jar) of those EXPORTs.
4) The domain then selects a collection of such EXPORTs to identify the correct
classloader out of those.
5) In order to identify the correct loader, it asks each of those, through a
getResource(String classResourceName) call whether it contains that resource. Actually,
this is more of a isResourceAvailable type of boolean call where it doesn't care of
the returned URL.
6) If #5 indicates that the loader contains the necessary resource, then this loader is
selected and scheduled for loading the class.
The problem starts in #5 when the VFSClassLoaderPolicy is used. Specifically when the
VFSClassLoaderPolicy.getResource(String classResourceName) is called, it starts looking in
various "roots" to see if that resource is available within any of its known
roots. While doing this, internally the each of the file handlers are invoked which
ultimately lead to AbstractVirtualFileHandler.structureFindChild(String classResourceName)
first splits the resource path name into multiple tokens (through PathTokenizer). This in
itself is expensive because of the various String operations involved:
PathTokenizer.getTokens():
| public static List<String> getTokens(String path)
| {
| if (path == null)
| throw new IllegalArgumentException("Null path");
|
| int start = -1, length = path.length(), state = STATE_INITIAL;
| char ch;
| List<String> list = new ArrayList<String>();
|
| for (int index = 0; index < length; index ++) {
| ch = path.charAt(index);
| switch (ch) {
| case '/': {
| switch (state) {
| case STATE_INITIAL: {
| // skip extra leading /
| continue;
| }
| case STATE_MAYBE_CURRENT_PATH: {
| // it's '.'
| list.add(CURRENT_PATH);
| state = STATE_INITIAL;
| continue;
| }
| case STATE_MAYBE_REVERSE_PATH: {
| // it's '..'
| list.add(REVERSE_PATH);
| state = STATE_INITIAL;
| continue;
| }
| case STATE_NORMAL: {
| // it's just a normal path segment
| list.add(path.substring(start, index));
| state = STATE_INITIAL;
| continue;
| }
| }
| continue;
| }
| case '.': {
| switch (state) {
| case STATE_INITIAL: {
| // . is the first char; might be a special token
| state = STATE_MAYBE_CURRENT_PATH;
| start = index;
| continue;
| }
| case STATE_MAYBE_CURRENT_PATH: {
| // the second . in a row...
| state = STATE_MAYBE_REVERSE_PATH;
| continue;
| }
| case STATE_MAYBE_REVERSE_PATH: {
| // the third . in a row, guess it's just a weird path name
| if (errorOnSuspiciousTokens) {
| throw new IllegalArgumentException("Illegal suspicious
token in path: " + path);
| }
| state = STATE_NORMAL;
| continue;
| }
| }
| continue;
| }
| default: {
| switch (state) {
| case STATE_INITIAL: {
| state = STATE_NORMAL;
| start = index;
| continue;
| }
| case STATE_MAYBE_CURRENT_PATH:
| case STATE_MAYBE_REVERSE_PATH: {
| if (errorOnSuspiciousTokens) {
| throw new IllegalArgumentException("Illegal suspicious
token in path: " + path);
| }
| state = STATE_NORMAL;
| }
| }
| }
| }
| }
| // handle the last token
| switch (state) {
| case STATE_INITIAL: {
| // trailing /
| break;
| }
| case STATE_MAYBE_CURRENT_PATH: {
| list.add(CURRENT_PATH);
| break;
| }
| case STATE_MAYBE_REVERSE_PATH: {
| list.add(REVERSE_PATH);
| break;
| }
| case STATE_NORMAL: {
| list.add(path.substring(start));
| break;
| }
| }
|
| return list;
| }
|
For each of this token, the control is passed to the FileHandlers. One such handler is the
ZipEntryHandler. And from what i see, the main reason why the loading of a class shows up
as expensive is a result of our ZipEntryHandler. Here are some of the things it does
almost for all operations. But in the current use case let's just consider the
getChild(String path) method:
i) ZipEntryHandler.getChild -> First gets the correct ZipEntryContext and then calls
the getChild on the context. Any operations on the ZipEntryContext go through a isModified
check which involves a call to System.currentTimeInMillis to get the current time and then
comparing it with the earlier time. Do we need to do this modification check? This check
happens almost on all operations on the context:
| VirtualFileHandler getChild(ZipEntryHandler parent, String name)
| {
| if (parent == null)
| throw new IllegalArgumentException("Null parent");
|
| checkIfModified();
|
| String pathName = parent.getLocalPathName();
| if("".equals(pathName))
| pathName = name;
| else
| pathName = pathName + "/" + name;
|
| EntryInfo ei = entries.get(pathName);
| if(ei != null)
| return ei.handler;
|
| return null;
| }
|
|
| private synchronized void checkIfModified()
| {
| // TODO: if zipSource represents a nested archive we should maybe delegate
lastModified to its parent
| if (initStatus == InitializationStatus.NOT_INITIALIZED)
| {
| ensureEntries();
| }
| else if (initStatus == InitializationStatus.INITIALIZED &&
getZipSource().hasBeenModified())
| {
| EntryInfo rootInfo = entries.get("");
| entries.clear();
| entries.put("", rootInfo);
|
| if (getZipSource().exists())
| {
| try
| {
| initEntries();
| }
| catch(Exception ignored)
| {
| log.warn("IGNORING: Failed to reinitialize context: " +
getRootURI(), ignored);
| }
| }
| }
| }
|
ii) Within this flow, there's even an invocation to ZipEntryHandler.isLeaf(). Similar
to the isModified call, this internally checks whether the zip context is closed. The
isLeaf() implementation itself is a bit expensive since it's not a simple boolean
check and instead does a .equals() on AbstractVirtualFileHandler:
ZipEntryContext.isLeaf:
| boolean isLeaf(ZipEntryHandler handler)
| {
| if (handler == null)
| throw new IllegalArgumentException("Null handler");
|
| if (getRoot().equals(handler) == false)
| checkIfModified();
|
| EntryInfo ei = entries.get(handler.getLocalPathName());
| if (ei == null || ei.entry == null)
| return false;
|
| return ei.entry.isDirectory() == false;
| }
|
AbstractVirtualFileHandler.equals()
| public boolean equals(Object obj)
| {
| if (this == obj)
| return true;
| if (obj == null || obj instanceof VirtualFileHandler == false)
| return false;
| VirtualFileHandler other = (VirtualFileHandler) obj;
| if (getVFSContext().equals(other.getVFSContext()) == false)
| return false;
| if (getPathName().equals(other.getPathName()) == false)
| return false;
| return true;
| }
|
|
So overall, in order to find whether a zip (jar) file contains a given class resource
(remember we already know it contains the package, we are just checking whether it
contains the specific class) we do a lot of expensive operations. I haven't completely
looked at the ZipEntryHandler/Context, but would it be possible to have some sort of a
mapping which has the entries contained in that jar? I see that we are doing something
similar in the initEntries() but not sure whether it can be used.
So if we wanted to check if a given resource is available in the ZipEntryContext, maybe we
could just do a entries.containsKey(resourcePath)? Would it be possible to simplify the
zip entry handling?
View the original post :
http://www.jboss.org/index.html?module=bb&op=viewtopic&p=4251266#...
Reply to the post :
http://www.jboss.org/index.html?module=bb&op=posting&mode=reply&a...