[jboss-dev] Further profling: Where should I focus?

Bill Burke bburke at redhat.com
Tue Jan 5 20:01:08 EST 2010



David M. Lloyd wrote:
> On 01/05/2010 04:39 PM, Bill Burke wrote:
>> David M. Lloyd wrote:
>>> That's fine BUT a general-purpose caching system will just fuck things up
>>> again.
>> With 3.5 years of development you're telling me that our testsuite isn't
>> good enough to pick up errors introduced by a "general-purpose caching
>> system" or other optimizations?
> 
> Yup.  The test suite has historically had a very difficult time locating 
> problems which have cropped up in the VFS2 implementation.  I guarantee 
> that applying a general solution without understanding each use case will 
> result in crappy code and time-bomb style errors.  You can't produce 
> bug-free code just by writing regression tests.
> 
>>> If you know the info won't change, then don't keep asking for it.
>> Easier said than done.
>>
>> Take for instance VirtulFile.visit().  Any visitor that is run calls
>> isDirectory().  For each isDirectory() there is at least one
>> ConcurrentHashMap lookup to find the mount, then to delegate to the
>> FileSystem interface which itself does hash lookups.
> 
> Can't be avoided.  The topology of mounts may change at any time, which 
> means that one call to isDirectory() might give a different answer than the 
> next for the same file.

Come on, when would any VirtualFile that is a directory *EVER* become a 
VirtualFile that is not a directory?



> How can this possibly be cached when things can be 
> mounted and unmounted at will, by threads other than your own?  A 
> VirtualFile is just a symbolic name after all - it does not and can not 
> represent any kind of link into the physical filesystem itself since such a 
> link cannot exist with anything resembling consistent semantics.
> 

This is an app server not an operating system.  Its pretty simple.  You 
invalidate the cache for the remounted/unmounted item.

> It might be worthwhile to explore using some type of COW HashMap setup 
> instead of CHM, since there are relatively few mounts compared to lookups.
> 
> In any case, O(1) + O(1) = O(1) so algorithmically it's sound.  It's just a 
> question of optimization at this stage, if there are any which can be done 
> without breaking the API contract.
> 
>> Then, visit() continues with calling getChildren() if the file is a
>> directory.  getChildren() recreates a totally brand new VirtualFile
>> list, by, yes, doing N more hash lookups for each child entry.
> 
> You're talking about the creation of a list of objects which have three 
> reference fields and one integer field.  It's cheap.  I already benchmarked 
> the piss out of it, and I'm sure it's not the source of any significant 
> performance issue unless you're doing something crazy.
> 

So what was acceptable to you?  My bet is your bar is much higher than 
mine.  I know we can get the default profile booting in 10 seconds or 
less.  ~200ms/5meg-jar for initializing a VirtualFile is just 
unacceptable for VFS.

For example I just optimized VirtualFile.isDirectory(), isFile(), and 
getMount() to cache (and invalidate the cache on a mount).  In addition, 
I modified VirtualFile.getChildren() to have the FileSystem create the 
list of VirtualFiles.  Doing so, allowed me to pre-initialize the 
isDirectory cache of the VirtualFile.  I just got a 12% improvement 
(30ms).  This improvement was entirely within determinePackageNames, but 
still.  30ms * 3 (15 meg of jars) * 2 (Packages + Subdeploy scans) == 
200ms less of App Server boot time.  Not much, but its a start.


> That's rather disingenuous.  VFS failed to get correctness right after 3-4 
> years, hence it has been rewritten with correctness as the primary goal. 

A complete rewrite will have the same consequences as any optimization.

 >
 > If you want the feature, you pay the price.  The requirements are always
 > open for debate, I suppose.
 >

My goal is pretty simple.  For AS not to suck.  Paying drastically for 
these features that VFS is supposed to enable doesn't get us anywhere, 
especially when JBoss AS 4.x and earlier got the job done with FAR LESS 
abstraction (and CPU).

-- 
Bill Burke
JBoss, a division of Red Hat
http://bill.burkecentral.com



More information about the jboss-development mailing list