Apparently this did not go through to the list the first time, sorry...
Completely agree.
I focused on perf in my last comment, but I don't think memory is all that
much different. The declaration of all that state already has to be
accounted for in its current "flattened" parallel-array representation.
The trade-offs here are:
1) X array declarations versus 1
2) overhead of the class definition; again, its actual state field memory
footprint is already accounted for, so we really are just talking about
a small amount of memory here.
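
To make the two layouts concrete, here is a minimal sketch. All names in it
(LayoutSketch, ParallelColumns, ColumnState and their fields) are made up for
illustration and are not taken from the Hibernate code base.

```java
// Hypothetical illustration only; none of these names come from Hibernate.
final class LayoutSketch {

    // Layout 1: "flattened" parallel arrays. Index i ties the arrays together,
    // and nothing stops one array from getting out of step with the others.
    static final class ParallelColumns {
        final String[] names;
        final boolean[] nullable;
        final int[] lengths;

        ParallelColumns(String[] names, boolean[] nullable, int[] lengths) {
            this.names = names;
            this.nullable = nullable;
            this.lengths = lengths;
        }

        // Returns the length recorded for the named column, or -1 if unknown.
        int lengthOf(String name) {
            for (int i = 0; i < names.length; i++) {
                if (names[i].equals(name)) {
                    return lengths[i];
                }
            }
            return -1;
        }
    }

    // Layout 2: one array of small objects that keep related state together.
    static final class ColumnState {
        final String name;
        final boolean nullable;
        final int length;

        ColumnState(String name, boolean nullable, int length) {
            this.name = name;
            this.nullable = nullable;
            this.length = length;
        }
    }

    // Same lookup as ParallelColumns.lengthOf, against the object layout.
    static int lengthOf(ColumnState[] columns, String name) {
        for (ColumnState column : columns) {
            if (column.name.equals(name)) {
                return column.length;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        ParallelColumns flat = new ParallelColumns(
                new String[] {"id", "name"},
                new boolean[] {false, true},
                new int[] {19, 255});
        ColumnState[] grouped = {
            new ColumnState("id", false, 19),
            new ColumnState("name", true, 255)
        };
        System.out.println(flat.lengthOf("name") == lengthOf(grouped, "name"));
    }
}
```

Both layouts hold the same state fields; the second adds one class and one
object per entry, which is the overhead item 2) refers to.
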
Certainly I think it's a great idea to try to actually calculate and
compare the memory diffs here. I am pretty confident the difference is
negligible. But either way, Hardy's point about the higher likelihood of
bugs is the biggest concern. In my experience, a lack of cohesive
encapsulation is just a recipe for situations where hard-to-find
problems creep into the code.
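
As a starting point for such a comparison, here is a back-of-the-envelope
estimate. It is only a sketch: the header, reference and alignment sizes are
assumptions (values commonly seen on a 64-bit HotSpot JVM with compressed
references), the class and method names are invented, and real numbers would
have to come from measuring on the target JVM.

```java
// Rough estimate only. The constants below are ASSUMPTIONS, not measurements.
final class MemoryEstimate {
    static final long OBJECT_HEADER = 12; // assumed object header size in bytes
    static final long ARRAY_HEADER = 16;  // assumed array header size in bytes
    static final long REFERENCE = 4;      // assumed compressed reference size
    static final long ALIGNMENT = 8;      // assumed object alignment

    // Rounds a size up to the next multiple of ALIGNMENT.
    static long align(long bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Estimated size of one array with the given element size and length.
    static long arrayBytes(long elementSize, long length) {
        return align(ARRAY_HEADER + elementSize * length);
    }

    // n entries, each with `ints` int fields and `booleans` boolean fields,
    // stored as one primitive array per field.
    static long parallelArrays(long n, int ints, int booleans) {
        return ints * arrayBytes(4, n) + booleans * arrayBytes(1, n);
    }

    // The same state stored as one array of references to n small objects.
    static long arrayOfObjects(long n, int ints, int booleans) {
        long perObject = align(OBJECT_HEADER + 4L * ints + booleans);
        return arrayBytes(REFERENCE, n) + n * perObject;
    }

    public static void main(String[] args) {
        long n = 100;
        System.out.println("parallel arrays : " + parallelArrays(n, 3, 1) + " bytes");
        System.out.println("array of objects: " + arrayOfObjects(n, 3, 1) + " bytes");
    }
}
```

Whether the gap this prints matters depends on how many entries and fields we
are really talking about, which is exactly what would need checking.
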
On 05/04/2012 05:39 AM, Hardy Ferentschik wrote:
Even at the risk of pouring oil onto the fire, I think a simpler data
structure wins in most cases over parallel arrays. The latter are much
harder to use and easier to get wrong, which leads to more bugs and
higher maintenance costs.
As Sanne is saying, performance questions are tricky. So many things
happen to the code these days before it gets executed on the bare metal
that it is hard to know what performance impact a certain change has.
In the end you just have to measure.
Personally I think we should primarily strive for a better and
easier-to-use API. Opportunities to optimize then often arise naturally.
And now my dear disciples let me close with:
"The First Rule of Program Optimization: Don't do it. The Second Rule of
Program Optimization (for experts only!): Don't do it yet." — Michael A. Jackson
:-)
--Hardy
On May 4, 2012, at 11:58 AM, Sanne Grinovero wrote:
> tricky subject.
> I'm confident that there are many cases in which we should have used
> arrays rather than maps, especially for temporary objects which aren't
> short lived enough (a HashMap living in the scope of a single method
> is going to be cheap). We should have either objects allocated for
> very long (like forever in the scope of the SessionFactory), or very
> short.
>
> In the case of how we keep metadata, I think performance would be
> dominated not that much by the fact it's a slightly bigger object but
> by prefetching and what is going to be available in the cache lines
> you just have filled in: obviously cache is way faster than memory so
> being clever in the sequence you lay out your data structure could
> speed you up by a couple of orders of magnitude.
>
> Using primitives and array matrices makes the data smaller, hence more
> likely to fit in the cache; but if using an array of objects in which
> each object collects the needed fields in one group, that's likely
> going to be faster.. but I'm making assumptions on how this structure
> is going to be read more frequently.
>
> For example when declaring a matrix as an [ ][ ], performance will be
> very different depending on whether you read by columns or rows - forgot which
> one is better now - but in that case if the common use case is using
> the slower path it's usually a good idea to invert the matrix.
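
On the rows-versus-columns question: in Java an `int[][]` is an array of
separate row arrays, and each row is contiguous in memory, so keeping the last
index in the inner loop walks memory sequentially. A minimal sketch of the two
read orders follows (class and method names are invented for illustration;
timing is left out on purpose, since a naive loop timing says little and a
proper harness would be needed to measure the difference).

```java
final class TraversalOrder {

    // Inner loop over the last index: each m[i] is one contiguous int[].
    static long sumRowByRow(int[][] m) {
        long sum = 0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                sum += m[i][j];
            }
        }
        return sum;
    }

    // Inner loop over the first index: every step touches a different row array.
    // Assumes a rectangular matrix (all rows have the same length).
    static long sumColumnByColumn(int[][] m) {
        long sum = 0;
        int columns = m.length == 0 ? 0 : m[0].length;
        for (int j = 0; j < columns; j++) {
            for (int i = 0; i < m.length; i++) {
                sum += m[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] m = new int[1000][1000];
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                m[i][j] = i + j;
            }
        }
        // Both orders visit the same cells, so the sums must agree.
        System.out.println(sumRowByRow(m) == sumColumnByColumn(m));
    }
}
```

If the common access pattern turns out to be the column-wise one, storing the
transposed matrix turns it into the row-wise loop.
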
>
> I'd love it if we could enter this space, or even if it's not suited
> for it, at least be considered "lite":
>
> http://stackoverflow.com/questions/10374735/lucene-and-ormlite-on-android
>
> Sanne
>
> On 4 May 2012 10:07, Emmanuel Bernard <emmanuel(a)hibernate.org> wrote:
>> Performance I don't know, you are probably right. But memory-wise, that
>> could be way different.
>> Even ignoring the overhead of the object + pointer in memory, the
>> alignment of boolean or other small objects would make a significant impact.
>>
>> Of course if we are talking about 20 values, we should not bother. But
>> persisters and the like store more than 20 values and we have more than
>> one persister / loader. It might be inconsequential in the end but that
>> might be worth testing.
>>
>> On a related note it's up for debate whether or not putting data in a
>> hash map for faster lookup later is worth it in all cases:
>>
>> - it takes much more space than raw arrays
>> - array scan might be as fast or faster for a small enough array. As we
>>   have seen in Infinispan and OGM, computing a hash is not a cheap operation.
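
The two lookup styles compared above, as a minimal sketch. The class and
method names are invented for illustration and this is not code from
AnnotationBinder; which variant is faster for a given array size is not
claimed here and would have to be measured.

```java
import java.util.HashMap;
import java.util.Map;

final class SmallLookup {

    // Linear scan over a small array: no hashing, no per-entry objects.
    static int indexByScan(String[] names, String wanted) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(wanted)) {
                return i;
            }
        }
        return -1;
    }

    // Builds a name -> position map once, paying for entries and boxed values.
    static Map<String, Integer> buildIndex(String[] names) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], i);
        }
        return index;
    }

    // Same lookup through the map; each call hashes the key.
    static int indexByMap(Map<String, Integer> index, String wanted) {
        Integer position = index.get(wanted);
        return position == null ? -1 : position;
    }

    public static void main(String[] args) {
        String[] names = {"id", "name", "version"};
        Map<String, Integer> index = buildIndex(names);
        System.out.println(indexByScan(names, "version") == indexByMap(index, "version"));
    }
}
```
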
>>
>> Again this requires testing but I am guilty as charged of using
>> collections in AnnotationBinder when doing some computations that would
>> be better off written as an array + array scan.
>>
>>
>> On 3 May 2012, at 19:32, Steve Ebersole wrote:
>>
>>> I seriously doubt the performance cost of 20 'parallel arrays' versus
>>> 1 array of Objects holding those 20 values is anything but negligible at best.
_______________________________________________
hibernate-dev mailing list
hibernate-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev