OK, you make some good points, and I have no doubt it's useful.
My only concern is that this could slow us down significantly in
providing other features which might be even more useful or pressing.
You have to pick your battles and be wise about where to spend energy
first.
Considering that it's easier to add methods than to remove them, what
would you think of marking this as experimental for now?
I'd prefer to see the non-indexed query engine delivered first; this
sounds like an obstacle on the critical path, so it might be wise to
have the option to drop the requirement from a first implementation.
You're definitely right that we should then implement "some" COUNT
strategy; I'm just not comfortable committing to this one yet.
Now on a general purpose COUNT: for sure we need one, but it's a
Pandora's box you're opening. Conceptually there is a parallel with
my concerns about the API contract we provide for the clear() method.
To keep it short, as we're changing subject: I don't think we'll ever
be able to provide a solid guarantee of a fully reliable value.
Indexes are not updated transactionally yet, and M/R crosses the
boundaries of nodes and of datacontainer/cachestore without taking a
consistent read snapshot. We should document any such API as
providing a best-effort estimate.
On 10 March 2014 13:16, Adrian Nistor <anistor(a)redhat.com> wrote:
I'd vote for keeping it, and executing it lazily in environments where it is
costly to compute it upfront.
And of course, document this properly so users are aware it can incur a
second execution, with significant performance impact and possibly also a
data visibility/consistency impact. I'd do this because the API is meant to
be first of all user friendly and useful, not just machine friendly and
efficient.
There's another reason for having it. Say we remove it, how will users be
able to know the total number of matching results? Our DSL does not
currently have a 'count' function. Maybe we should add such a thing first,
and then think about removing Query.getResultSize().
But if we implement a proper 'count', getResultSize() could be trivially
implemented as some kind of syntactic sugar on top of it, so I would still
consider it worth having in the API.
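A minimal sketch of the "syntactic sugar" idea, assuming the DSL had a
count operation. CountableQuery and count() are hypothetical names used
only for illustration; they are not part of the Infinispan API:

```java
// Hypothetical interface for illustration only; not the Infinispan API.
interface CountableQuery {

    /** Assumed DSL-level count: number of matching entries, ignoring pagination. */
    long count();

    /** Convenience wrapper that simply delegates to count(). */
    default int getResultSize() {
        // Throws ArithmeticException instead of silently overflowing an int.
        return Math.toIntExact(count());
    }
}
```

With this shape, any execution engine that can answer count() would get
getResultSize() for free.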
And then it all boils down to the question: should the DSL provide a count
function? (+1 from me)
Cheers
On 03/10/2014 02:23 PM, Sanne Grinovero wrote:
Hi all,
we are exposing a nice feature inherited from the Search engine via
the "simple" DSL version, the one which is also available via Hot Rod:
org.infinispan.query.dsl.Query.getResultSize()
To be fair I hadn't noticed we do expose this, I just noticed after a
recent PR review and I found it surprising.
This method returns the size of the full resultset, disregarding
pagination options; you can imagine it fit for situations like:
"found 6 million matches, these are the top 20: "
A peculiarity of Hibernate Search is that the total number of matches
is extremely cheap to figure out as it's generally a side effect of
finding the 20 results. Essentially we're just exposing an int value
which was already computed: very cheap, and happens to be useful in
practice.
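To illustrate why the total comes almost for free in this model, here is
a simplified sketch of a top-N collection pass that counts every match
while keeping only the best N. This is not Hibernate Search or Lucene
code; TopNWithTotal, Page and collect are made-up names, and plain
integers stand in for scored documents:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustration only: a toy stand-in for a search engine's top-N collector.
final class TopNWithTotal {

    /** The requested page plus the number of matches seen while building it. */
    static final class Page {
        final List<Integer> top;   // best scores, highest first
        final int totalMatches;    // counted during the same pass

        Page(List<Integer> top, int totalMatches) {
            this.top = top;
            this.totalMatches = totalMatches;
        }
    }

    /** Visits every matching score once, keeping the n highest and counting all of them. */
    static Page collect(Iterable<Integer> matchingScores, int n) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap: weakest kept score on top
        int total = 0;
        for (int score : matchingScores) {
            total++;                 // the "free" side effect
            heap.offer(score);
            if (heap.size() > n) {
                heap.poll();         // evict the weakest score once more than n are held
            }
        }
        List<Integer> top = new ArrayList<>(heap);
        top.sort(Comparator.reverseOrder());
        return new Page(top, total);
    }
}
```

For example, collecting the scores 5, 1, 9, 3, 7 with n = 2 gives a page
of [9, 7] and a totalMatches of 5, without a second pass over the data.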
With a SQL statement the count is not free: you'd have to craft 2
different SQL statements, often incurring the cost of 2 round trips to
the database. So getResultSize() is not available on the Hibernate ORM
Query, only on our FullTextQuery extension.
Now my doubt is whether it is indeed a wise move to expose this method
on the simplified DSL. Of course some people might find it useful;
still, I'm wondering how much we'll be swearing at having to maintain
this feature, compared to its usefulness, when we implement
alternative execution engines to run queries, not least Map/Reduce
based filtering, and ultimately hybrid strategies.
In the case of Map/Reduce I think we'll need to keep track of possible
de-duplication of results; in the case of a Teiid integration it might
need a second expensive query, so there I'd expect this method to be
lazily evaluated.
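A rough sketch of what lazy evaluation could look like, assuming the
expensive second query can be wrapped in a supplier. LazyResultSize and
countQuery are hypothetical names, not existing Infinispan classes:

```java
import java.util.function.IntSupplier;

// Illustration only: defers the expensive count until someone asks for it.
final class LazyResultSize {

    private final IntSupplier countQuery; // hypothetical: runs the costly second query
    private boolean computed;
    private int value;

    LazyResultSize(IntSupplier countQuery) {
        this.countQuery = countQuery;
    }

    /** Runs the count on the first call only; later calls return the cached value. */
    synchronized int get() {
        if (!computed) {
            value = countQuery.getAsInt();
            computed = true;
        }
        return value;
    }
}
```

Callers who never ask for the size never pay for the second query. The
cached value is computed at a different time than the results
themselves, so it may not agree with them if the data changes in
between.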
Should we rather remove this functionality?
Sanne
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev