[hibernate-dev] HHH-11042 Count distinct tuples
Christian Beikov
christian.beikov at gmail.com
Sat Sep 3 03:48:05 EDT 2016
Hey Gail,
I guess we have to introduce a configuration parameter then or extend
the dialect with a method for these cases?
Should I implement a POC? If so, which approach should I use and how
would I access the parameter if I should do it like that?
Regards,
Christian
Am 01.09.2016 um 23:13 schrieb Gail Badner:
> Hi Christian,
>
> I've been following your discussion on HHH-11042 and thinking about
> your solution. I am concerned that some dialects do not treat null the
> way you describe. SQL Server has a property that changes how null is
> treated:
> SET ANSI_NULLS { ON | OFF } [1]
> [1] mentions that in a future release it will not be possible to set
> to OFF, but for now, this is a concern. I also found [2] which says:
> "For the DISTINCT keyword, null values are considered to be duplicates
> of each other. When DISTINCT is included in a SELECT statement, only
> one NULL is returned in the results, regardless of how many null
> values are encountered."
> Sybase may be similar. I'm not sure if there are other dialects that
> could be affected.
> I agree that the dialect should be able to override the behavior.
> Regards,
> Gail
> [1] https://msdn.microsoft.com/en-us/library/ms188048.aspx [2]
> https://technet.microsoft.com/en-us/library/ms187831(v=sql.105).aspx
> <https://technet.microsoft.com/en-us/library/ms187831%28v=sql.105%29.aspx>
>
>
> On Thu, Sep 1, 2016 at 3:18 AM, Christian Beikov
> <christian.beikov at gmail.com <mailto:christian.beikov at gmail.com>> wrote:
>
> Hey,
>
> I wanted to start a discussion regarding this issue:
> https://hibernate.atlassian.net/browse/HHH-11042
> <https://hibernate.atlassian.net/browse/HHH-11042>
>
> Although the Dialect class contains the method
> "supportsTupleDistinctCount", it is never used, so when doing a count
> distinct on a tuple, it just renders the tuple instead of doing a
> fallback or throwing an error.
>
> I suggested the OP to override the count function in the dialect to do
> whatever he thinks is best but then I realized that the count function
> is not even used as the logic is hard coded in some locations. The
> problematic location in this case is
> "org.hibernate.hql.internal.ast.tree.IdentNode.resolveAsAlias" which
> does not consider the function at all but renders the SQL directly.
>
> After suggesting him to introduce a custom function instead and some
> discussion on how count distinct could be reliably implemented I
> think I
> found a solution that might work for most databases.
>
> On stackoverflow and other sites it is often suggested to use a
> checksum
> to workaround this limitation which obviously is not a good idea. I
> proposed to do concatenation with a separator that doesn't appear
> in the
> string and apparently the character '\0' is a valid character which
> makes it a good candidate as that should normally not appear in a
> string.
>
> The final solution to the problem looks something like the following
>
> count(distinct case when col1 is null or col2 is null then null else
> col1 || '\0' || col2 end) + count(case when col1 is null or col2
> is null
> then 1 end)
>
> The first count does a count distinct on all columns concatenated with
> '\0' where all values are not null. The second just counts the cases
> where one of the column values was null. Together that emits the
> proper
> count based on the assumption that '\0' does not appear in the
> columns.
>
> What do you think about that solution? I would like to implement
> it that
> way and do a PR.
>
> I would also like to make use of the count function registered in the
> dialect to make this overrideable. Hope that's okay?
>
>
> Regards,
> Christian
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev at lists.jboss.org <mailto:hibernate-dev at lists.jboss.org>
> https://lists.jboss.org/mailman/listinfo/hibernate-dev
> <https://lists.jboss.org/mailman/listinfo/hibernate-dev>
>
>
More information about the hibernate-dev
mailing list