[hibernate-dev] Memory consumption

Steve Ebersole steve at hibernate.org
Thu May 17 15:24:26 EDT 2012



On 05/17/2012 11:53 AM, Andrej Golovnin wrote:
> Hi Steve,
>
>> Deferring initialization of these batch loaders is one option, but that just really delays the memory consumption.  Personally, if it were my app I'd rather that sizing be done up front rather than over the course of the next few hours or weeks as the system continues to run.
>
> The question is, if you really need all LockModes for an entity. I personally doubt it.

Well the more appropriate question is how Hibernate can possibly know 
that up front.  And I already stated why I think initializing these 
lazily is a bad idea.


>>
>> Its been proposed a number of times to instead generate just a single loader for loading that number of items.  This would mean generating a single SQL statement that has parameter holders for the full batch size and then somehow "filling out" the empty slots when the number of things to load is less that 16.  If this could be made workable across all dialects, I personally think it would be the best approach as it probably gets close to the start up sizings you claimed but would never grow beyond that.  But we would need to discuss this to see if it is a real possibility.  Lets work an example with batch-size = 3.  That would produce SQL like "select a.x, a.y, a.z from a a where a.x = ? or a.x = ? or a.x = ?"
>
> I think you mean "select a.x, a.y, a.z from a a where a.x in (?, ?, ?)".
> At least this is what Hibernate generates now.
> The first rule I learned in SQL: avoid "OR" as much as possible
> in WHERE clause. Query optimizers produce better query execution
> plans for "IN" as for equivalent "OR"-statements.

I am just talking the general approach, not specifics of whether OR or 
IN gets used.

Yes many optimizers are not smart enough to see that each OR branch is 
predicated on the same values...


> The only difference may be in the costs for the query calculated by the optimizer.
> But the query execution time should be the same.
>
> Example from PostgreSQL:
>
> Table:
> CREATE TABLE test (
>    oid integer NOT NULL,
>    ttt character varying(10),
>    CONSTRAINT test_pkey PRIMARY KEY (oid )
> );
>
> Number of rows: 100000
>
> Query: SELECT * FROM test WHERE oid IN (100, 101, 102);
> Explain output:
> "Index Scan using test_pkey on public.test  (cost=0.00..16.91 rows=3 width=9) (actual time=0.025..0.035 rows=3 loops=1)"
> "  Output: oid, ttt"
> "  Index Cond: (test.oid = ANY ('{100,101,102}'::integer[]))"
> "  Buffers: shared hit=7"
>
> Query: SELECT * FROM test WHERE oid IN (100, 101, 102, 100, 100, 100);
> "Index Scan using test_pkey on public.test  (cost=0.00..29.72 rows=6 width=9) (actual time=0.024..0.034 rows=3 loops=1)"
> "  Output: oid, ttt"
> "  Index Cond: (test.oid = ANY ('{100,101,102,100,100,100}'::integer[]))"
> "  Buffers: shared hit=7"
>
> Explain output:
> Query: SELECT * FROM test WHERE oid IN (100, 101, 102, null, null, null);
> "Index Scan using test_pkey on public.test  (cost=0.00..29.69 rows=3 width=9) (actual time=0.023..0.033 rows=3 loops=1)"
> "  Output: oid, ttt"
> "  Index Cond: (test.oid = ANY ('{100,101,102,NULL,NULL,NULL}'::integer[]))"
> "  Buffers: shared hit=7"
>
> As you see the difference is only in the calculated costs.

Well (1) is exactly what Hibernate does today.  The question here is a 
trade off between the in-memory profile of maintaining all those loaders 
versus the anticipated execution costs.  Thats really what it comes down to.


-- 
steve at hibernate.org
http://hibernate.org


More information about the hibernate-dev mailing list