[jbosscache-dev] Optimising node storage in memory
Manik Surtani
manik at jboss.org
Sat Feb 3 05:58:15 EST 2007
On 2 Feb 2007, at 18:17, Galder Zamarreno wrote:
> You said replacing _remove() with a map put??
Not replacing at any stage - just including a map.put() in the _put()
method, and map.remove() in _remove(). Sorry for the confusion.
>
> Would you have intermediate empty nodes? So, if you have /a/b/c/d,
> what would the Map look like? Just an entry for /a/b/c/d? or entry
> for /, entry for /a,...etc and finally entry in the map with /a/b/c/d?
The Node objects will still be created and inserted as per what we do
now. So I guess we still have O(n) for put and remove, but
importantly, O(1) for gets.
Actually, with parent locking - as you correctly pointed out - we'd
still have O(n) for gets as well. Hmm. Unless the locks are
obtained and held not on the Node, but on the Fqn. At the moment in
the lock interceptors we traverse the logical tree twice - once going
down the Fqn tree to get node object names, again to retrieve each
node (actually happening simultaneously at each tree level), and then
locking the node. Why not just lock on the final Fqn, and get the
node from the map (keyed on Fqn)?
>
> Speed of removals and additions would certainly depend on this.
> Without intermediate nodes, they would be O(1), but how would you
> handle parental locks? You could have had two maps, one only with
> populated nodes so that you can get O(1) for additions, removals
> and lookups, and another map with current locking information where
> all nodes would be present, both intermediate and populated nodes?
>
> Galder Zamarreño
> Sr. Software Maintenance Engineer
> JBoss, a division of Red Hat
>
>
> -----Original Message-----
> From: jbosscache-dev-bounces at lists.jboss.org [mailto:jbosscache-dev-
> bounces at lists.jboss.org] On Behalf Of Manik Surtani
> Sent: 02 February 2007 18:18
> To: jbosscache-dev at lists.jboss.org
> Subject: [jbosscache-dev] Optimising node storage in memory
>
> Guys,
>
> Playing around with looking at how efficient (or not) it is to store
> nodes as children of other nodes. While this makes logical sense, on
> an implementation level walking the tree sucks - O(n) where n is the
> depth of the tree. This works great for small trees, but as depth
> increases, so does the lookup time.
>
> Since all locking happens on the interceptor level (based on a
> logical Fqn) isn't the tree structure effectively maintained here?
> What if Node objects weren't directly linked, but maintained in a
> simple HashMap, keyed on Fqn? Instant constant-time lookups,
> regardless of tree depth. Sure, it will mean tuning the HashMap for
> the mem footprint/speed tradeoff, maybe even a custom Map impl tuned
> for this purpose.
>
> And in the initial part the fix should be easy - just replace
> CacheImpl.peek() with a map lookup, and _remove() and _put() with a
> map put.
>
> Will think about this some more, and post later, but for now I wrote
> a quick test class that looped through node creation, lookup and
> deletion through arbitrary depths and these are what the numbers look
> like.
>
> Starting test, with 10000 nodes evenly spread out across the tree.
> All times in millis.
> Testing with depth 10
> Node Creation 7924
> Node Retrieval 1141
> Node Deletion 889
> Testing with depth 20
> Node Creation 16372
> Node Retrieval 2072
> Node Deletion 1524
> Testing with depth 30
> Node Creation 31440
> Node Retrieval 3010
> Node Deletion 2303
> Testing with depth 40
> Node Creation 52238
> Node Retrieval 4172
> Node Deletion 3184
> Testing with depth 50
> Node Creation 80781
> Node Retrieval 35788
> Node Deletion 34107
>
> Arbitrary depths may not be very useful with current use cases but
> thinking of scaling to larger and larger grids and dealing with ever
> larger data volumes, this may be a crucial optimisation. Thoughts
> and comments?
>
> Cheers,
> Manik
>
> FYI, the script to run is tests/scripts/loopFlatteningTest.sh in
> HEAD. Usage: loopFlatteningTest.sh <space-separated list of depths
> to test>, e.g., loopFlatteningTest.sh 10 20 30 40 50
>
>
> --
> Manik Surtani
>
> Lead, JBoss Cache
> JBoss, a division of Red Hat
>
> Email: manik at jboss.org
> Telephone: +44 7786 702 706
> MSN: manik at surtani.org
> Yahoo/AIM/Skype: maniksurtani
>
>
>
> _______________________________________________
> jbosscache-dev mailing list
> jbosscache-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/jbosscache-dev
More information about the jbosscache-dev
mailing list