[infinispan-dev] range query in Infinispan !!

Sanne Grinovero sanne at infinispan.org
Mon May 14 13:01:16 EDT 2012


On 14 May 2012 17:50, Tristan Tarrant <ttarrant at redhat.com> wrote:
> Sanne,
>
> Prabhat and I had a chat the other day and agreed that Infinispan would
> be much easier to apply to a large variety of use cases if we could
> iterate over ordered keys given a starting point. Cassandra does this by
> partitioning (grouping in Infinispan speak) consecutive keys on the same
> node for performance reasons. I guess you could use the Query module and
> distributed keys for this, but I think it is a bit overkill.

I agree, Query would be overkill. I'm the first one to *not* recommend
using Query if you also need good write performance ;)
I would also agree that being unable to iterate the keys is often a
tough problem;

but what's this concept of "key order" you're mentioning ??
The complexity of such a patch would be close to "rewrite Infinispan"
!? No actually that would be simpler since we likely learned a bit
from the first time :D
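
To make the question concrete: on a single JVM, "iterate over ordered
keys given a starting point" looks roughly like the sketch below. It
uses only java.util.concurrent.ConcurrentSkipListMap and is not an
Infinispan API; the class name OrderedTweets, the choice of a
timestamp as key and the method names are assumptions made up for
illustration. Doing the same across a distributed cache is the hard
part, and this sketch says nothing about that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical single-node helper (not part of Infinispan):
// keeps entries sorted by key so that range scans are cheap.
class OrderedTweets {
    // Key: timestamp in milliseconds (assumed unique here; a real key
    // would need a tie-breaker such as a tweet id).
    private final ConcurrentNavigableMap<Long, String> byTime =
            new ConcurrentSkipListMap<>();

    void put(long timestampMillis, String tweet) {
        byTime.put(timestampMillis, tweet);
    }

    // Tweets with from <= timestamp < to, oldest first.
    List<String> range(long from, long to) {
        return new ArrayList<>(byTime.subMap(from, true, to, false).values());
    }

    // Up to `limit` most recent tweets, newest first.
    List<String> mostRecent(int limit) {
        List<String> out = new ArrayList<>();
        for (String tweet : byTime.descendingMap().values()) {
            if (out.size() >= limit) {
                break;
            }
            out.add(tweet);
        }
        return out;
    }
}
```

The sorted map is what gives keys an order in the first place; a hash
based container has no cheap equivalent of subMap or descendingMap.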

I had drafted some ideas about a similar design while in St Louis with
Emmanuel and Hardy about a truly partitioned index, with index
segments and values located in the same containers; if someone is
serious about it I could share the design but I'd estimate some ~6
months full time work on it, including an ad-hoc B-tree in memory to
replace traditional Lucene indexes. And we had some Whiskey, so maybe
it won't work.

Cheers,
Sanne

>
> Tristan
>
> On 05/14/2012 06:40 PM, Prabhat Jha wrote:
>> I have not used Infinispan's Query or Map/Reduce functionalities yet
>> because of them not being in JDG yet. Yes, we can use those to get what
>> I have mentioned. Query should be more straightforward and simpler than
>> M/R I think. But Query has dependency on Lucene and I have experienced
>> great pain in the past when using Lucene and FileSystem for shared storage.
>>
>> My perspective is a bit different. I am arguing for a "simpler" solution
>> for a problem that I find to be very common. Similar to how in
>> Cassandra, you can easily query based on a time range and the order you
>> want.
>>
>> On 05/14/2012 11:05 AM, Sanne Grinovero wrote:
>>> why "not using Query" ?
>>>
>>> Such features are available in core using Map/Reduce; I don't think
>>> that different approaches should be provided by core otherwise, there
>>> is enough complexity in there...
>>>
>>> Cheers,
>>> Sanne
>>>
>>> On 14 May 2012 16:58, Prabhat Jha <pjha at redhat.com> wrote:
>>>> Hi,
>>>>
>>>> In QuickTweet we needed a way to get most recent x tweets for a user or
>>>> on a topic.  Currently we are implementing it by  keeping entries in the
>>>> cache and updating a bounded FIFO queue in parallel. However, to get
>>>> most recent data or data for a given time range is a very common use
>>>> case, especially in social media applications. It would be good to see
>>>> this range feature available out of the box (not using Query) in upcoming
>>>> Infinispan releases.  Thoughts?
>>>>
>>>> I can get it started by creating a Jira unless I hear otherwise.
>>>>
>>>> Regards,
>>>> Prabhat
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev


