[hibernate-dev] Hibernate Developer IRC meeting - 5/03

Steve Ebersole steve at hibernate.org
Fri May 4 08:14:35 EDT 2012


Not this one :(

This library argument is just not relevant to how persisters store 
internal state, which is really all I am talking about.  And in terms of 
the SPI, if we opt to expose this change, is happening in a major rev; 
so again not relevant.

As for perf issues OGM uncovered in ORM... First, none of those involved 
anything close to the discussion here; they dealt mostly with small 
things (like hashCode caching) that became big because of size of scale. 
  Second, thats actually exactly how you should discover and treat perf 
issues.


On 05/04/2012 06:41 AM, Emmanuel Bernard wrote:
> Yes, but we can't blindly go for the nicer approach: we are a library and hence used by many.
> Look at the bottleneck we found on Hibernate ORM due to the change of magnitude Hibernate OGM involved. Look at the perf issues we found in OGM itself just because I used the builder pattern for some objects using in critical paths. Same for hashCode that was not cached in some critical objects.
>
> I'm all fine with the nice approach if it's followed by a perf test before being pushed :)
>
> On 4 mai 2012, at 12:39, Hardy Ferentschik wrote:
>
>> Even taking the risk of pouring oil onto the fire, I think a simpler data structure wins in most cases over
>> the parallel arrays. It is much harder to use the latter and easier to make mistakes which leads to more
>> bugs and higher maintenance costs.
>>
>> As Sanne is saying performance questions are tricky. So many thing are happening with the code our days
>> before they are getting executed on the bare metal that it is hard to know what performance impacts a certain
>> change has. In the end you just have to measure.
>>
>> Personally I think we should primarily strive for a better and easier to use API. Oppertunities to optimizes arise
>> then often naturally.
>>
>> And now my dear disciples let me close with:
>> "The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet." — Michael A. Jackson
>>
>> :-)
>>
>> --Hardy
>>
>> On May 4, 2012, at 11:58 AM, Sanne Grinovero wrote:
>>
>>> tricky subject.
>>> I'm confident that there are many cases in which we should have used
>>> arrays rather than maps, especially for temporary objects which aren't
>>> short lived enough (an HashMap living in the scope of a single method
>>> is going to be cheap). We should have either objects allocated for
>>> very long (like forever in the scope of the SessionFactory), or very
>>> short.
>>>
>>> In the case of how we keep metadata, I think performance would be
>>> dominated not that much by the fact it's a slightly bigger object but
>>> by prefetching and what is going to be available in the cache lines
>>> you just have filled in: obviously cache is way faster than memory so
>>> being clever in the sequence you lay out your data structure could
>>> speed you up by a couple of orders of magnitude.
>>>
>>> Using primitives and array matrixes makes the data smaller, hence more
>>> likely to fit in the cache; but if using an array of objects in which
>>> each object collects the needed fields in one group, that's likely
>>> going to be faster.. but I'm making assumptions on how this structure
>>> is going to be read more frequently.
>>>
>>> For example when declaring a matrix as an [ ][ ], performance will be
>>> very different depending if you read by columns or rows - forgot which
>>> one is better now - but in that case if the common use case is using
>>> the slower path it's usually a good idea to invert the matrix.
>>>
>>> I'd love it if we could enter this space, or even if it's not suited
>>> for it, at least be considered "lite":
>>> http://stackoverflow.com/questions/10374735/lucene-and-ormlite-on-android
>>>
>>> Sanne
>>>
>>> On 4 May 2012 10:07, Emmanuel Bernard<emmanuel at hibernate.org>  wrote:
>>>> Performance I don't know, you are probably right. But memory wise, that could be way different.
>>>> Even ignoring the overhead of the object + pointer in memory, the alignment of boolean or other small objects would make a significant impact.
>>>>
>>>> Of course if we are talking about 20 values, we should not bother. But persisters and the like store more than 20 values and we have more than one persister / loader. It might be inconsequential in the end but that might be worth testing.
>>>>
>>>> On a related note it's up for debate whether or not putting data in a hash map for faster lookup later is worth it in all cases:
>>>>
>>>> - it takes much more space than raw arrays
>>>> - array scan might be as fast or faster for a small enough array. As we have seen in Infinispan and OGM, computing a hash is not a cheap operation.
>>>>
>>>> Again this require testing but I am guilty as charge of using collections in AnnotationBinder when doing some computations that would be better off written as an array + array scan.
>>>>
>>>>
>>>> On 3 mai 2012, at 19:32, Steve Ebersole wrote:
>>>>
>>>>> I seriously doubt the performance cost of 20 'parallel arrays' versus 1 array of Objects holding those 20 values is anything but negligible at best.
>>
>>
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>
>
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev

-- 
steve at hibernate.org
http://hibernate.org


More information about the hibernate-dev mailing list