]
RH Bugzilla Integration updated ISPN-5705:
------------------------------------------
Bugzilla References:
Execution of a Hibrid Query that involves certain specific
'OR' conditions returns incorrect results
----------------------------------------------------------------------------------------------------
Key: ISPN-5705
URL:
https://issues.jboss.org/browse/ISPN-5705
Project: Infinispan
Issue Type: Bug
Components: Embedded Querying
Affects Versions: 8.0.0.Beta3
Environment: Hibrid Query that includes 'OR' condition fails in certain
specific cases
Reporter: Prashanth Reddy
Assignee: Adrian Nistor
Priority: Critical
Fix For: 8.0.0.Final
Attachments: BooleShannonExpansion.diff
The hibrid query, mentioned below produces wrong results, upon execution.
SELECT p.ID, p.NAME FROM com.testapp.Person p WHERE p.IS_ACTIVE=1 AND
(p.CITY='city1' OR p.CITY='city2') AND p.ID>1000
for the Data:
ID | NAME | CITY | IS_ACTIVE
-------------------
2001 person1 city1 1
2002 person1 city1 1
2003 person1 city1 1
2004 person1 city1 0
2005 person1 city2 1
2006 person1 city2 1
2007 person1 city3 0
2008 person1 city3 1
Indexed fields: ID, NAME, CITY
Non-indexed fields: IS_ACTIVE
Query execution returned 0 number of rows, whereas expected result count is 5.
After some analysis root cause for the problem has been found, and the details are as
follows,
As a part of separating the query that depends on indexed fields, using boole shannon
algorithm, it has first extraced out the condition 'IS_ACTIVE=1' (as it deals with
non-indexed fields).
Assuming four boolean conditions as c1, c2, c3, c4, where c1 represents the condition
'IS_ACTIVE=1'
f(c1,c2,c3,c4) = c1.f(1, c2, c3, c4) + c1`.f`(0, c2, c3, c4)
consider
e1=f(1, c2, c3, c4)
e2=f`(0, c2, c3, c4)
which have to be used for constructing the lucene query, for further transformation.
After splitting as per the above logic, it is trying to optimize the resultant
subconditions(e1, e2), so as to reduce the number of conditions to be evaluated. One such
optimization that has been found is that when there is a conjuction operation between two
comparison predicates dealing with same lvalues(here, same fields), but with the different
rvalues(here, constants), it has been optimized to replace it with a
contradiction(constant false boolean expression). As we know, this optimization is
applicable only for conjuction. But, the same is applied even for the disjuction, which is
causing the above mentioned problem.
For example,
condition p.CITY='city1' AND p.CITY='city2'
can be replaced with CONTRADICTION(BooleanConst.FALSE)
But, the same cannot be done, for
condition p.CITY='city1' OR p.CITY='city2'
Following change may correct the problem.
File:
infinispan-8.0.0.Beta3/object-filter/src/main/java/org/infinispan/objectfilter/impl/syntax/BooleShannonExpansion.java:155
diff:
@@ -152,7 +152,7 @@
}
}
}
- PredicateOptimisations.optimizePredicates(newChildren, true);
+ PredicateOptimisations.optimizePredicates(newChildren, false);
if (newChildren.size() == 1) {
return newChildren.get(0);
}