On 5 Aug 2008, at 01:31, Jason T. Greene wrote:
Manik Surtani wrote:
> Guys,
> I have created 3 new collection impls in trunk - ImmutableListCopy,
> ImmutableSetCopy and ImmutableMapCopy. I have also ported the
> first 2 to branch 2.2.x (the reason why I didnt port the third is
> because the MapCopy which it replaces can get serialized and I
> don't want to break binary compat).
I took a look at adjusting these to be more efficient with the
FastCopyHashMap. After doing some analysis and sanity benchmarks, I
have determined that we should switch to unmodifiableXXX(obj.clone()
or copy constructor) in all cases, for the following reasons:
1. There is no performance advantage with lists.
Agreed, this part was just convenience and perhaps removing a little
bit of superfluous object instantiation.
2. While the construction time in the Map and Set variants is
faster, since it doesn't use hashing, it sacrifices way too much
(O(n) lookup times). I did verify that the performance difference
will be the same with just about every map implementation, since
it's mostly allocation that is being measured here.
These were originally optimised for iteration only, but I expected
lookups to be a problem - hence my comment below.
3. It prevents the copy optimization in FastCopyHashMap, so removing
usage of ImmutableMapCopy will improve copy performance of anything
using FastCopyHashMap.
Thats what I suspected, and hence my thoughts on extending
FastCopyHashMap to an ImmutableFastCopyHashMap - so we have fast
construction, decent lookup times, immutability as well as reduce
unnecessary object construction/wrapping.
4. The construction of the immutable wrapper is barely noticeable,
and it's done per call, so the value is likely temporary.
Depends. This does happen in many different places, for different
reasons:
- retrieval of collections from Node or Cache - which means the
reference goes into userland and may not be temporary
- used to pass data values into cache listeners. Again, may not be
temporary.
- Other internal uses are likely to be temporary though, such as
retrieving of invocation or transaction locks. These *may* not be
immutable if code written around it is careful, but for the sake of
correctness and thoroughness I have made these immutable.
Also I noticed that we have some duplicate copying going on. For
example , CacheInvocationDelegate copies a set that is constructed
by the command, per call.
It makes sense for CID to provide a wrapIfNecessary transformation,
which only defensively copies/wraps the collection to be returned if
it has not already been wrapped/copied elsewhere in the chain, e.g.,
in the command.
What I would like to do, provided everyone is ok with this, is
switch this code to static factory methods like so:
Immutables.immutableCopy(Map map);
Immutables.immutableCopy(List list);
Immutables.immutableCopy(Set set);
These will then just call clone and Collections.unmodifiableXXX as
appropriate. We can then add optimizations later if they become
available.
Would these be more generic, to suit any collection implementation?
Remember that not all Maps, for example, implement Cloneable or have a
copy constructor. And wrapping any Map passed in as a FastCopyHashMap
may break correctness, for example, if a SortedMap were passed in.
Personally, my preference would be to:
1. Stick with the ImmutableListCopy - there are no drawbacks here,
although the gain is small as well (just save on object construction).
2. Implement ImmutableFastCopyHashMap and an immutable Set based on
this. Benefits detailed above.
3. Regarding the duplicate copying, this should be addressed
regardless of collection implementation/approach used.
Thoughts?
--
Manik Surtani
Lead, JBoss Cache
manik(a)jboss.org