Yoann Rodière (
https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%...
) *updated* an issue
Hibernate Search (
https://hibernate.atlassian.net/browse/HSEARCH?atlOrigin=eyJpIjoiNzljMzU4...
) / Improvement (
https://hibernate.atlassian.net/browse/HSEARCH-3856?atlOrigin=eyJpIjoiNzl...
) HSEARCH-3856 (
https://hibernate.atlassian.net/browse/HSEARCH-3856?atlOrigin=eyJpIjoiNzl...
) Aggregations on multi-valued numeric fields for Lucene (
https://hibernate.atlassian.net/browse/HSEARCH-3856?atlOrigin=eyJpIjoiNzl...
)
Change By: Yoann Rodière (
https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%...
)
See how {{org.hibernate.search.integrationtest.backend.tck.search.aggregation.SingleFieldAggregationBaseIT#multiValued}} is disabled due to {{org.hibernate.search.integrationtest.backend.lucene.testsupport.util.LuceneTckBackendFeatures#aggregationsOnMultiValuedFields}}.
Before HSEARCH-3839, we couldn't even index multiple values for numeric fields in
Lucene. After HSEARCH-3839, we can, but we pick a single value when aggregating, so
aggregations are still incorrect.
Ideally, when counting documents per field value, multi-valued documents should be counted
once per value that appears in the field. So if a single document has values {{1}} and
{{2}} for a single field, it should increment the count for both {{1}} and {{2}}.
At least that's what happens on Elasticsearch.
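To make the difference concrete, here is a minimal, self-contained Java sketch of the two counting strategies. It is not Hibernate Search or Lucene code: the class and method names are hypothetical, documents are modelled as plain lists of integers, and "pick a single value" is assumed to mean "take the first value", which may not be what the Lucene backend actually does. Counting each distinct value at most once per document is also an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Hibernate Search.
public class MultiValuedTermsCount {

    // Expected semantics: a document increments the bucket of every
    // distinct value it holds for the field.
    static Map<Integer, Long> countPerValue(List<List<Integer>> docs) {
        Map<Integer, Long> buckets = new TreeMap<>();
        for (List<Integer> values : docs) {
            values.stream().distinct()
                    .forEach(v -> buckets.merge(v, 1L, Long::sum));
        }
        return buckets;
    }

    // Simplified model of the incorrect behavior: only one value per
    // document is considered (assumed here to be the first one).
    static Map<Integer, Long> countSingleValue(List<List<Integer>> docs) {
        Map<Integer, Long> buckets = new TreeMap<>();
        for (List<Integer> values : docs) {
            if (!values.isEmpty()) {
                buckets.merge(values.get(0), 1L, Long::sum);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Document 1 holds the value 1; document 2 holds the values 1 and 2.
        List<List<Integer>> docs = List.of(List.of(1), List.of(1, 2));
        System.out.println(countPerValue(docs));    // prints {1=2, 2=1}
        System.out.println(countSingleValue(docs)); // prints {1=2}
    }
}
```

With the single-value strategy the bucket for {{2}} is missing entirely, which is the kind of incorrect result this issue is about.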
How to test the behavior on Elasticsearch:
{code}
# Delete any existing test index
curl -XDELETE -H "Content-Type: application/json" localhost:9200/mytest1/ 1>&2 2>/dev/null
# Create the index with a single integer field
curl -XPUT -H "Content-Type: application/json" localhost:9200/mytest1/\?pretty \
  -d'{"mappings":{"properties":{"num":{"type":"integer"}}}}'
# Document 1: a single value
curl -XPUT -H "Content-Type: application/json" localhost:9200/mytest1/_doc/1 \
  -d'{"num":1}'
# Document 2: two values for the same field
curl -XPUT -H "Content-Type: application/json" localhost:9200/mytest1/_doc/2 \
  -d'{"num":[1,2]}'
# Terms aggregation on the multi-valued field
curl -XPOST -H "Content-Type: application/json" localhost:9200/mytest1/_search\?pretty \
  -d'{"aggs":{"foo":{"terms":{"field":"num"}}}}'
{code}
Result:
{noformat}
{
  ...
  "aggregations" : {
    "foo" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1,
          "doc_count" : 2
        },
        {
          "key" : 2,
          "doc_count" : 1
        }
      ]
    }
  }
}
{noformat}
So document 2 was counted twice: once in the bucket for key {{1}} and once in the bucket for key {{2}}.