Query.getResultSize() to be available on the simplified DSL?
by Sanne Grinovero
Hi all,
we are exposing a nice feature inherited from the Search engine via
the "simple" DSL version, the one which is also available via Hot Rod:
org.infinispan.query.dsl.Query.getResultSize()
To be fair I hadn't noticed we do expose this, I just noticed after a
recent PR review and I found it surprising.
This method returns the size of the full result set, disregarding
pagination options; you can imagine it being a fit for situations like:
"found 6 million matches, these are the top 20: "
A peculiarity of Hibernate Search is that the total number of matches
is extremely cheap to figure out as it's generally a side effect of
finding the 20 results. Essentially we're just exposing an int value
which was already computed: very cheap, and happens to be useful in
practice.
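To illustrate why the count comes almost for free, here is a minimal sketch in plain Java. It is not the Hibernate Search or Lucene implementation; the class `TopHitsSketch`, the integer "scores" and the `threshold` that decides what counts as a match are all made up for the example. The point is only that a single pass which keeps the best N hits can increment a counter on the way.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopHitsSketch {

    /** One page of results plus the total number of matches seen while collecting it. */
    public static final class Page {
        public final List<Integer> topScores;
        public final int totalMatches;

        Page(List<Integer> topScores, int totalMatches) {
            this.topScores = topScores;
            this.totalMatches = totalMatches;
        }
    }

    /**
     * Hypothetical collector: a score counts as a match when it is >= threshold.
     * The total is a side effect of the same loop that selects the page.
     */
    public static Page collect(int[] scores, int threshold, int pageSize) {
        PriorityQueue<Integer> best = new PriorityQueue<>(); // min-heap of the best scores so far
        int total = 0;
        for (int score : scores) {
            if (score < threshold) {
                continue; // not a match, ignore
            }
            total++;               // counting costs one increment per match
            best.add(score);
            if (best.size() > pageSize) {
                best.poll();       // drop the weakest hit, keep only pageSize entries
            }
        }
        List<Integer> top = new ArrayList<>(best);
        top.sort(Comparator.reverseOrder());
        return new Page(top, total);
    }

    public static void main(String[] args) {
        Page page = collect(new int[] {5, 1, 9, 7, 3, 8}, 3, 2);
        System.out.println("found " + page.totalMatches + " matches, top: " + page.topScores);
    }
}
```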
This is not the case with a SQL statement: there you'd have to
craft 2 different SQL statements, often incurring the cost of 2 round
trips to the database. So this getResultSize() is not available on the
Hibernate ORM Query, only on our FullTextQuery extension.
Now my doubt is whether it is indeed a wise move to expose this method on
the simplified DSL. Of course some people might find it useful, still
I'm wondering how much we'll be swearing at needing to maintain this
feature vs its usefulness when we implement alternative execution
engines to run queries, not least on Map/Reduce based filtering, and
ultimately hybrid strategies.
In case of Map/Reduce I think we'll need to keep track of possible
de-duplication of results; in case of a Teiid integration it might
need a second expensive query, so in this case I'd expect this method
to be lazily evaluated.
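A rough sketch of what "lazily evaluated" could look like, in plain Java and independent of any Infinispan class. `LazyResultSizeSketch` and the `IntSupplier` standing in for the second, expensive count query are assumptions for illustration only: the count runs on the first call to `getResultSize()` and is cached afterwards, so callers who never ask for the size never pay for it.

```java
import java.util.function.IntSupplier;

public class LazyResultSizeSketch {

    // Stand-in for whatever expensive operation computes the total
    // (e.g. a second query); hypothetical, not an Infinispan API.
    private final IntSupplier expensiveCount;
    private boolean computed;
    private int cached;

    public LazyResultSizeSketch(IntSupplier expensiveCount) {
        this.expensiveCount = expensiveCount;
    }

    /** Runs the expensive count at most once, on first use. */
    public synchronized int getResultSize() {
        if (!computed) {
            cached = expensiveCount.getAsInt();
            computed = true;
        }
        return cached;
    }

    public static void main(String[] args) {
        LazyResultSizeSketch query = new LazyResultSizeSketch(() -> {
            System.out.println("running the expensive count");
            return 42;
        });
        System.out.println(query.getResultSize()); // triggers the count
        System.out.println(query.getResultSize()); // served from the cached value
    }
}
```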
Should we rather remove this functionality?
Sanne
10 years, 9 months
Update on the testsuite state
by Sanne Grinovero
Results :
Failed tests:
NotifyingFutureTest.testExceptionOtherThread2:51->testExceptionOtherThread:68->testException:151
expected [true] but found [false]
VersionedDistStateTransferTest.testStateTransfer:96->MultipleCacheManagersTest.waitForClusterToForm:232->MultipleCacheManagersTest.waitForClusterToForm:225
» IllegalState
Tests run: 4233, Failures: 2, Errors: 0, Skipped: 0
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Infinispan BOM .................................... SUCCESS [0.135s]
[INFO] Infinispan Common Parent .......................... SUCCESS [1.700s]
[INFO] Infinispan Checkstyle Rules ....................... SUCCESS [2.395s]
[INFO] Infinispan Commons ................................ SUCCESS [5.411s]
[INFO] Infinispan Core ................................... FAILURE [9:51.344s]
Pretty good, but no jackpot yet!
I'll try again next week?
10 years, 9 months
Re: [infinispan-dev] Permission to list you as our contact
by Galder Zamarreño
Hi Rory,
On 07 Mar 2014, at 10:06, Rory O'Donnell Oracle, Dublin Ireland <rory.odonnell(a)oracle.com> wrote:
> Hi Galder,
>
> The Adopt OpenJDK Group are promoting the testing of FOSS projects with OpenJDK builds,
> whether their own, or from someone else. We want to acknowledge projects who are actively
> testing, providing feedback and any issues they have found during their testing etc.
>
> A draft of the page is now available here
>
> Is it ok to add your name as our contact , is there a mailing list I should copy ?
Yeah, not a problem to add my name/contact.
Mailing list wise, you can use infinispan-dev(a)lists.jboss.org, which is the development list of the Infinispan project, on which I work.
Cheers,
>
> Rgds,Rory
> --
> Rgds,Rory O'Donnell
> Quality Engineering Manager
> Oracle EMEA , Dublin, Ireland
>
--
Galder Zamarreño
galder(a)redhat.com
twitter.com/galderz
Project Lead, Escalante
http://escalante.io
Engineer, Infinispan
http://infinispan.org
10 years, 9 months
Problem with HotRod cache updates
by Mark Kowaliszyn
Hi,
I am using the RemoteCacheManager to access a cache on my cluster. Getting an entry and updating it on the client works without problems; on the server, however, the cache receives an entry with a byte array as the cache key, rather than the original string I put.
My server results in the following listener output when the cache put occurs:
DEBUG 0305-16:49:16:789 Cache (thermostatCache) entry modified: [3, 62, 4, 49, 48, 48, 48] (local=true) {foundation.infinispan.listener.CacheLoggingListener.entryModified} [HotRod-HotRodServerServerWorker-19]({})
DEBUG 0305-16:49:16:794 ++++ string: [B@690edaf3, new string: >1000 {foundation.infinispan.listener.CacheLoggingListener.entryModified} [HotRod-HotRodServerServerWorker-19]({})
The cache key in question is "1000". The output above is from a cache listener and the key comes from CacheModifiedEvent.getKey(). I have some additional output which first does a toString() on the key, and then builds a new string by decoding the byte array. There are a few bytes prefixing the byte array which are not part of the cache key. In the cut/paste here, there are 2 characters missing: the "new string" has 2 unprintable characters, one before and one after the ">" character.
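For reference, the bytes from the listener output can be checked by hand with a small standalone Java snippet (the class name `KeyBytesSketch` is made up). The last four bytes are the ASCII codes for "1000", byte 62 is the ">" character, and the third byte, 4, happens to equal the key length. That the three leading bytes are framing added when the client serializes the key is an assumption based on that pattern, not something confirmed here.

```java
import java.nio.charset.StandardCharsets;

public class KeyBytesSketch {

    /** Decodes everything after the first prefixLength bytes as UTF-8 text. */
    public static String decodeTail(byte[] raw, int prefixLength) {
        return new String(raw, prefixLength, raw.length - prefixLength, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Key bytes exactly as printed by the cache listener.
        byte[] raw = {3, 62, 4, 49, 48, 48, 48};

        System.out.println(decodeTail(raw, 3)); // prints "1000"
        System.out.println((char) raw[1]);      // prints ">"
        // Third byte vs. length of the decoded key: both are 4 (assumed to be a length prefix).
        System.out.println(raw[2] + " vs " + decodeTail(raw, 3).length());
    }
}
```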
What are the extra bytes in key? Why is the key inserted as bytes and not a string?
The end effect is that my cluster cache gets a new junk entry in the cache with every client put. I did not see any documentation where it indicated I might need a custom key serializer. I am using strings for cache keys, nothing special.
Updating the cache from the cluster-local cache works perfectly.
Thanks,
Mark
10 years, 9 months
grouping and GridFS
by Ales Justin
Just having a discussion with Bela about this.
I guess having "grouping" on GridFS' content would make sense.
e.g. put all chunks on the same node
Is this doable?
Afaiu, we would need to have some sort of "similarity" function for content's metadata?
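One possible shape for such a function, as a plain Java sketch rather than actual GridFS or Infinispan code: derive a group from the chunk key so that all chunks of one file land in the same group, and let the group (not the full key) pick the owner. The key layout `<path>.#<chunkNumber>`, the class `ChunkGroupingSketch` and the modulo-based node choice are all assumptions for illustration; wiring this into Infinispan's grouping support is not shown.

```java
public class ChunkGroupingSketch {

    // Hypothetical separator between file path and chunk number in a chunk key.
    private static final String CHUNK_SEPARATOR = ".#";

    /** Returns the file path part of a chunk key, used as the group for all its chunks. */
    public static String groupOf(String chunkKey) {
        int idx = chunkKey.lastIndexOf(CHUNK_SEPARATOR);
        return idx < 0 ? chunkKey : chunkKey.substring(0, idx);
    }

    /** Toy owner selection: hash the group instead of the whole key. */
    public static int ownerOf(String chunkKey, int numNodes) {
        return Math.floorMod(groupOf(chunkKey).hashCode(), numNodes);
    }

    public static void main(String[] args) {
        int nodes = 4;
        // Both chunks of the same file map to the same node index.
        System.out.println(ownerOf("/videos/a.mp4.#0", nodes));
        System.out.println(ownerOf("/videos/a.mp4.#17", nodes));
    }
}
```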
-Ales
10 years, 9 months