In-clause padding (see hibernate.query.in_clause_parameter_padding in https://docs.jboss.org/hibernate/orm/6.2/userguide/html_single/Hibernate_User_Guide.html#configurations-query) pads the number of bind variables used for an in-clause to the next power of 2 (1, 2, 4, 8, …). This exponential growth of bind variables makes for a good utilisation of the database execution plan cache for databases that have such a cache (such as Oracle-SQL). For databases that also have a limit to the number of in-clause parameters (see Dialect.getInExpressionCountLimit, this is 1000 for OracleDialect) the padding is only applied to the remainder of dividing the number of parameters by the limit. For example an in-clause with 4001 parameters is split into 5 in-clauses with 4*1000 and 1*1 bind variables, an in-clause with 4002 parameters is split into 5 in-clauses with 4*1000 and 1*2 bind variables and an in-clause with 4003 parameters is split into 5 in-clauses with 4*1000 and 1*4 bind variables. This results in 3 different queries, which means the execution plan cache can not be used. For databases that don’t have an in-clause limit all three examples would pad to the same query with 2^12=4096 bind variables and would use the same execution plan from cache. Instead of padding the remainder, a better approach would be to use a normal power of two padding up to the in-clause limit and then continue by padding to a power of two times the in-clause limit. For a limit of 1000 this would mean that 1001 parameters would result in a query with 2^1*1000=2000 bind variables and 4001 parameters would result in a query with 2^3*1000=8000 bind variables. In the current approach the number of bind variables grows linearly with the number of parameters which requires a linear amount of execution plan cache size, whereas in the proposed algorithm the number of bind variables grows exponentially with the number of parameters which requires only a logarithmic amount of execution plan cache size, just like when there no in-clause limit in play. The following table compares the number of bind variables used in the three approaches for an in-clause limit of 1000:
current |
proposal |
next power of two (no limit) |
1 |
1 |
1 |
2 |
2 |
2 |
4 |
4 |
4 |
8 |
8 |
8 |
16 |
16 |
16 |
32 |
32 |
32 |
64 |
64 |
64 |
128 |
128 |
128 |
256 |
256 |
256 |
512 |
512 |
512 |
1000 |
1000 |
1024 |
1001 |
2000 |
2048 |
1002 |
4000 |
4096 |
1004 |
8000 |
8192 |
1008 |
16000 |
16384 |
1016 |
32000 |
32768 |
1032 |
64000 |
65536 |
It is a bad idea to use a normal power of two padding for dialects with an in-clause limit, because this could cause unneccessary and/or-conjunctions for less than 2*limit parameters. For example 900 parameters should not be padded to 1024 bind variables, because the execution plan is likely better for a query with only one in-clause compared to a query with multiple in-clauses. The proposal avoids unnecessary conjunctions as much as possible while still only requiring a logarithmic amount of execution plan cache size. This approach was already implemented in https://github.com/hibernate/hibernate-orm/pull/6555, along with several bug fixes, but was seperated out into this issue as it is a change in behaviour. |