On 2 Feb 2007, at 18:22, Elias Ross wrote:
--- Manik Surtani <manik(a)jboss.org> wrote:
> Guys,
>
> Playing around with looking at how efficient (or not) it is to store
> nodes as children of other nodes. While this makes logical sense, on
> an implementation level walking the tree sucks - O(n) where n is the
> depth of the tree. This works great for small trees, but as depth
> increases, so does the lookup time.
I spent some time looking at storage issues for maintaining carrier
information on phone numbers. We could only identify the possible
use of
depth 3 (country code, area code, number).
Even classifying every species on the planet only really requires
eight
levels.
The reason you don't need too many depths (usually) is that with 10
children
at each level, the number of nodes you would create with a depth of
10 is
10^10, or 1,000,000,000 leaf nodes, with about half as many
intermediate
nodes.
That is only provided you have as many as 10 children at each level.
And even then, given the application at hand, 1.5 Bn nodes may not be
unfeasible (I'm talking about grid applications for financial
institutions performing risk analyses on stock market data).
Hypothetical for now, I know, but if this does not impact the basic
cases of 3 nodes deep adversely, why not build in the capacity to
deal with more complex and larger structures?
And if you kept a map of FQN -> NODE, you still need to maintain some
structure to enumerate all the children.
Yes, Maps are still maintained per Node, containing child nodes. I'm
not suggesting we do away with this. Just add a more efficient way
of retrieving a Node at an arbitrary depth, without involving
traversing the tree. In terms of memory, this may increase the
footprint somewhat (an extra Map maintaining these references).
One thing that would be nice (and falls under the subject of the
thread)
would be to optimize the size of common cache objects. Being able
to cache
more is very important.
Check out the SizeOf program:
class java.util.concurrent.ConcurrentHashMap: 1271
class java.util.HashMap: 119
class org.jboss.cache.UnversionedNode: 159 (no children)
class org.jboss.cache.UnversionedNode: 1682 (when children() called)
class org.jboss.cache.UnversionedNode: 39 (no "new HashMap()")
ConcurrentHashMap is faster, but uses about 10 times more memory...
I agree. And in quite a few cases may be unnecessary if we assume
proper and robust locking at higher levels. This does need review.
______________________________________________________________________
______________
Want to start your own business?
Learn how on Yahoo! Small Business.
http://smallbusiness.yahoo.com/r-index