[infinispan-dev] Design change in Infinispan Query

Sanne Grinovero sanne at infinispan.org
Fri Mar 7 10:51:30 EST 2014


On 7 March 2014 15:27, Mircea Markus <mmarkus at redhat.com> wrote:
>
> On Mar 7, 2014, at 3:21 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>
>> On 7 March 2014 14:54, Mircea Markus <mmarkus at redhat.com> wrote:
>>>
>>> On Mar 6, 2014, at 9:21 AM, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>>>
>>>> On Wed 2014-03-05 17:16, Mircea Markus wrote:
>>>>> Sanne came with a good follow up to this email, just some small clarifications:
>>>>>
>>>>> On Mar 4, 2014, at 6:02 PM, Emmanuel Bernard <emmanuel at hibernate.org> wrote:
>>>>>
>>>>>>>> If you have to do a map/reduce for tasks as simple as age > 18, I think your system had better be prepared to run gazillions of M/R jobs.
>>>>>>>
>>>>>>> I want to run a simple M/R job in the evening to determine who turns 18 tomorrow, to congratulate them. Once a day, not gazillions of times, and I don't need to index the age field just for that. Also when it comes to Map/Reduce, the drawback of holding all the data in a single cache is twofold:
>>>>>>> - performance: you iterate over the data that is not related to your query.
>>>>>>
>>>>>> If the data are never related (query wise), then we are in the database split category. Which is fine. But if some of your queries are related, what do you do? Deny the user the ability to do them?
>>>>>
>>>>> Here's where cross-site query would have been used. As Sanne suggested (next post) these limitations outweigh the advantages.
>>>>
>>>> No. Cross-cache query if implemented will not support (efficiently
>>>> enough) that kind of query. Cf my wiki page.
>>>
>>> yes, non-indexed joins would be exponential on the number of caches involved.
>>
>> Technically non-indexed joins would be exponential on the number of
>> caches (joins) involved *and* on the amount of entries you have
>> stored: I know you weren't suggesting doing it, but to confirm it's
>> even worse than a horrible idea ;-)
>> And that's not even considering the subtle design catch of "load it
>> all from all cachestores".. combined with "multiple times per join"..
>
> I wasn't suggesting doing it, not only for performance but also for the limitations you mentioned in the previous emails.
>
>>
>>> Is it possible to use an index for x-cache joins with linear index update time and query?
>>
>> Index update cost is not linear but LogN: approximates to a constant
>> cost.
>
> you're counting RPCs here or index seeks?

RPCs are constant, and independent of both the query type and the
data size. For a local (or distributed) index there are zero RPCs; for
DIST it depends on a factor of total index size, chunking and merging
options, numowners, etc., but these are fixed once defined ->
constant number of RPCs.

The count of index seeks depends on the query type only, not on the
data size at all.

I'm referring to the approximate computation cost of each index seek.
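
To illustrate why a LogN seek cost approximates to a constant, here is a
minimal sketch. It is only an assumption-laden stand-in: a plain sorted
sequence of integer keys plays the role of the index, and the function name
seek_probes is made up for this example. A real Lucene index is structured
differently, so take the numbers as an order-of-magnitude hint only.

```python
def seek_probes(n: int, target: int) -> int:
    """Binary-search a sorted index holding keys 0..n-1 and
    return how many keys had to be probed to locate `target`."""
    lo, hi, probes = 0, n, 0
    while lo < hi:
        mid = (lo + hi) // 2   # the key stored at position mid is mid itself
        probes += 1
        if mid < target:
            lo = mid + 1       # target is in the upper half
        else:
            hi = mid           # target is in the lower half (or at mid)
    return probes


if __name__ == "__main__":
    # Growing the index a million-fold adds only a few dozen probes per seek.
    for n in (10**3, 10**6, 10**9):
        print(n, seek_probes(n, n - 1))
```

The probe count grows with the logarithm of the index size, so going from a
thousand to a billion entries costs a few tens of extra probes per seek
rather than a million-fold increase.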


>
>> And we could cut this constant by 4 orders of magnitude if only
>> I could safely differentiate between a put of a new entry vs. an
>> update -> something which we'll need to brainstorm about.
>>
>> Query time is also significantly sub-linear in practice, but specifics
>> will vary on the query type.
>>
>> Yes you could use indexes to improve x-cache joins, but you'll need an
>> additional engine to coordinate that correctly, not least to manage
>> data size buffers; essentially I think you'd need Teiid.
>>
>> Sanne
>>
>>
>>>
>>> Cheers,
>>> --
>>> Mircea Markus
>>> Infinispan lead (www.infinispan.org)
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)


