[hibernate-dev] Hibernate Developer IRC meeting - 5/03

Hardy Ferentschik hardy at hibernate.org
Fri May 4 06:39:34 EDT 2012


Even at the risk of pouring oil onto the fire, I think a simpler data structure beats parallel arrays
in most cases. The latter are much harder to use and make it easier to introduce mistakes, which leads to more
bugs and higher maintenance costs.
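
To make the maintenance argument concrete, here is a minimal sketch of the two layouts. All class and
field names are hypothetical and not taken from the Hibernate code base; the point is only that with
parallel arrays nothing but a manual check keeps the arrays in step.

```java
// Hypothetical example; these names are assumptions, not Hibernate's actual metadata classes.
final class PropertyMetadataSketch {

    // Parallel arrays: index i in every array describes the same property.
    static final class ParallelArrays {
        final String[] names;
        final boolean[] nullable;
        final boolean[] updatable;

        ParallelArrays(String[] names, boolean[] nullable, boolean[] updatable) {
            // Only this manual check keeps the three arrays consistent.
            if (names.length != nullable.length || names.length != updatable.length) {
                throw new IllegalArgumentException("parallel arrays differ in length");
            }
            this.names = names;
            this.nullable = nullable;
            this.updatable = updatable;
        }
    }

    // One object per property: the fields of a property cannot drift apart.
    static final class Property {
        final String name;
        final boolean nullable;
        final boolean updatable;

        Property(String name, boolean nullable, boolean updatable) {
            this.name = name;
            this.nullable = nullable;
            this.updatable = updatable;
        }
    }

    // Converts the parallel layout into an array of objects.
    static Property[] toObjects(ParallelArrays p) {
        Property[] result = new Property[p.names.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = new Property(p.names[i], p.nullable[i], p.updatable[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        ParallelArrays p = new ParallelArrays(
                new String[] {"id", "name"},
                new boolean[] {false, true},
                new boolean[] {false, true});
        for (Property property : toObjects(p)) {
            System.out.println(property.name + " nullable=" + property.nullable);
        }
    }
}
```

Whether the object layout costs anything measurable in memory or speed is exactly the open question in
this thread; the sketch says nothing about that.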

As Sanne says, performance questions are tricky. So many things happen to the code these days
before it gets executed on the bare metal that it is hard to know what performance impact a certain
change has. In the end you just have to measure.

Personally I think we should primarily strive for a better and easier to use API. Opportunities to optimize
then often arise naturally.

And now my dear disciples let me close with:
 "The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet." — Michael A. Jackson

:-)

--Hardy

On May 4, 2012, at 11:58 AM, Sanne Grinovero wrote:

> tricky subject.
> I'm confident that there are many cases in which we should have used
> arrays rather than maps, especially for temporary objects which aren't
> short lived enough (a HashMap living in the scope of a single method
> is going to be cheap). We should have either objects allocated for
> very long (like forever in the scope of the SessionFactory), or very
> short.
> 
> In the case of how we keep metadata, I think performance would be
> dominated not that much by the fact it's a slightly bigger object but
> by prefetching and what is going to be available in the cache lines
> you just have filled in: obviously cache is way faster than memory so
> being clever in the sequence you lay out your data structure could
> speed you up by a couple of orders of magnitude.
> 
> Using primitives and array matrices makes the data smaller, hence more
> likely to fit in the cache; but if using an array of objects in which
> each object collects the needed fields in one group, that's likely
> going to be faster... but I'm making assumptions on how this structure
> is going to be read most frequently.
> 
> For example when declaring a matrix as an [ ][ ], performance will be
> very different depending on whether you read by columns or rows - forgot which
> one is better now - but in that case if the common use case is using
> the slower path it's usually a good idea to invert the matrix.
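
A small sketch of the two read orders and of "inverting" (transposing) the matrix, with hypothetical
names. In Java an int[][] is an array of separate row arrays, so keeping the second index in the inner
loop reads each row's elements in sequence. That is usually the cache-friendlier order, but only a
measurement can confirm it for a given case.

```java
// Hypothetical example; assumes a rectangular (non-jagged) matrix.
final class MatrixTraversalSketch {

    // Inner loop walks one row array from start to end.
    static long sumRowByRow(int[][] m) {
        long sum = 0;
        for (int r = 0; r < m.length; r++) {
            for (int c = 0; c < m[r].length; c++) {
                sum += m[r][c];
            }
        }
        return sum;
    }

    // Inner loop touches a different row array on every step.
    static long sumColumnByColumn(int[][] m) {
        long sum = 0;
        int cols = m.length == 0 ? 0 : m[0].length;
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < m.length; r++) {
                sum += m[r][c];
            }
        }
        return sum;
    }

    // Swaps rows and columns so that a column read on the original
    // becomes a row read on the result.
    static int[][] transpose(int[][] m) {
        int rows = m.length;
        int cols = rows == 0 ? 0 : m[0].length;
        int[][] t = new int[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                t[c][r] = m[r][c];
            }
        }
        return t;
    }

    public static void main(String[] args) {
        int[][] m = {{1, 2, 3}, {4, 5, 6}};
        System.out.println(sumRowByRow(m));
        System.out.println(sumColumnByColumn(m));
        System.out.println(sumRowByRow(transpose(m)));
    }
}
```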
> 
> I'd love it if we could enter this space, or even if it's not suited
> for it, at least be considered "lite":
> http://stackoverflow.com/questions/10374735/lucene-and-ormlite-on-android
> 
> Sanne
> 
> On 4 May 2012 10:07, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>> Performance I don't know, you are probably right. But memory-wise, that could be way different.
>> Even ignoring the overhead of the object + pointer in memory, the alignment of boolean or other small objects would make a significant impact.
>> 
>> Of course if we are talking about 20 values, we should not bother. But persisters and the like store more than 20 values and we have more than one persister / loader. It might be inconsequential in the end but that might be worth testing.
>> 
>> On a related note it's up for debate whether or not putting data in a hash map for faster lookup later is worth it in all cases:
>> 
>> - it takes much more space than raw arrays
>> - array scan might be as fast or faster for a small enough array. As we have seen in Infinispan and OGM, computing a hash is not a cheap operation.
>> 
>> Again this requires testing, but I am guilty as charged of using collections in AnnotationBinder when doing some computations that would be better off written as an array + array scan.
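
The two lookup styles being compared can be sketched as follows. The names are hypothetical and this is
not the AnnotationBinder code; it only shows that a plain scan needs no extra structure, while the map
has to be built and stored first. Which one is faster for a given array size has to be measured.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example; assumes the names in the array are unique and non-null.
final class LookupSketch {

    // Linear scan: no extra memory, no hash computation.
    static int indexByScan(String[] names, String wanted) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(wanted)) {
                return i;
            }
        }
        return -1;
    }

    // Builds a name -> position index; costs extra memory per entry.
    static Map<String, Integer> buildIndex(String[] names) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], i);
        }
        return index;
    }

    // Map lookup: hashes the key on every call.
    static int indexByMap(Map<String, Integer> index, String wanted) {
        Integer position = index.get(wanted);
        return position == null ? -1 : position;
    }

    public static void main(String[] args) {
        String[] names = {"id", "name", "version"};
        Map<String, Integer> index = buildIndex(names);
        System.out.println(indexByScan(names, "version"));
        System.out.println(indexByMap(index, "version"));
    }
}
```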
>> 
>> 
>> On 3 mai 2012, at 19:32, Steve Ebersole wrote:
>> 
>>> I seriously doubt the performance cost of 20 'parallel arrays' versus 1 array of Objects holding those 20 values is anything but negligible at best.



