Re: [infinispan-dev] Design change in Infinispan Query

Monday, 17 February 2014

On 30 Jan 2014, at 20:51, Mircea Markus <mmarkus(a)redhat.com> wrote:

...

 On Jan 30, 2014, at 9:42 AM, Galder Zamarreño <galder(a)redhat.com> wrote:

> 
> On Jan 21, 2014, at 11:52 PM, Mircea Markus <mmarkus(a)redhat.com> wrote:
> 
>> 
>> On Jan 15, 2014, at 1:42 PM, Emmanuel Bernard <emmanuel(a)hibernate.org> wrote:
>> 
>>> By the way, people looking for that feature are also asking for a unified
>>> Cache API accessing these several caches, right? Otherwise I do not fully
>>> understand why they ask for a unified query.
>>> Have you written detailed use cases somewhere for me to better understand
>>> what is really requested?
>> 
>> IMO, from a user perspective, being able to run queries spanning several
>> caches simplifies the programming model: each cache corresponds to a
>> single entity type, with potentially different configuration.
> 
> Not sure if it simplifies things TBH if the configuration is the same. IMO,
> it just adds clutter.

 Not sure I follow: having a cache that contains both Cars and Persons sounds more
 cluttered to me. I think it's cumbersome to write any kind of query against a
 heterogeneous cache, e.g. Map/Reduce tasks that need to count all the green Cars
 would need to be aware of Persons and ignore them. Not only is it harder to write,
 it also discourages code reuse and makes the code hard to maintain (if you add Pets
 to the same cache in the future, you need to update the M/R code as well). And of
 course there are also different per-cache configuration options that are not
 immediately obvious (at design time) but will be in the future (there are more
 Persons than Cars, they live longer, have different expiry, etc.): mixing
 everything together in the same cache from the beginning is a design decision that
 might bite you in the future.
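A minimal sketch of the point about counting green Cars, using plain Java collections as stand-ins for cache contents. The `Car` and `Person` classes and both method names are hypothetical, and this is not Infinispan's Map/Reduce API: it only shows that the mixed variant needs a type check the single-type variant does not.

```java
import java.util.Collection;

class GreenCarCount {
    // Hypothetical entity types, for illustration only.
    static final class Car {
        final String colour;
        Car(String colour) { this.colour = colour; }
    }

    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    // Mixed contents: the task has to recognise Cars and skip everything else.
    static long countGreenMixed(Collection<Object> values) {
        long count = 0;
        for (Object value : values) {
            if (value instanceof Car && "green".equals(((Car) value).colour)) {
                count++;
            }
        }
        return count;
    }

    // One entity type per cache: no knowledge of other types is needed.
    static long countGreenTyped(Collection<Car> cars) {
        long count = 0;
        for (Car car : cars) {
            if ("green".equals(car.colour)) {
                count++;
            }
        }
        return count;
    }
}
```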

 The way I see it - and I'm very curious to hear your opinion on this - following a
 database analogy, the CacheManager corresponds to a Database and the Cache to a
 Table. Hence my thought that queries spanning multiple caches are both useful and
 needed (same as queries spanning multiple tables).

My opinion is that seeing it this way is limiting. A key/value store is schemaless. Your
view is forcing a particular schema on how to structure things. 

I don’t expect everyone to store everything in a single cache, and of course there will
be situations where it’s not ideal or the best solution, such as the cases you mention
above. But if you want to do it, for any of the reasons Paul or I mentioned in [1], it’d
be nice to be able to do so.

Cheers,

[1] https://issues.jboss.org/browse/ISPN-3640

...

> 
> Just yesterday I discovered this gem in Scala's Shapeless extensions [1]. This is
> experimental stuff, but essentially it lets you define which key/value type pairs a
> map may contain, and it does the type checking at compile time. I almost wet my pants
> when I saw that ;) :p. In the example, it defines a map as containing Int -> String
> and String -> Int key/value pairs. If you try to add an Int -> Int, it fails
> compilation.

 Agreed, the compile-time check is pretty awesome :-) Still, mixing and matching types
 in a Map doesn't look great to me for ISPN.

> 
> Java's type checking is not powerful enough to do this, and its compilation logic is
> not extensible in the way Scala's is through macros, but I think the fact that other
> languages are looking into this validates Paul's suggestion in [2], on top of all the
> benefits listed there.
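For comparison, a loose Java analogue of a map restricted to Int -> String and String -> Int pairs. It is only a sketch: `TypedPairsMap` and `Rel` are hypothetical names, and this is not how Shapeless works. Shapeless resolves the allowed pairs implicitly at compile time, whereas here the caller has to pass an explicit witness object on every call, which is the kind of extra ceremony the comparison with Scala is about.

```java
import java.util.HashMap;
import java.util.Map;

class TypedPairsMap {
    /** Witness that keys of type K may be mapped to values of type V. */
    static final class Rel<K, V> {
        // The only two permitted pairings; the private constructor prevents others.
        static final Rel<Integer, String> INT_TO_STRING = new Rel<>(String.class);
        static final Rel<String, Integer> STRING_TO_INT = new Rel<>(Integer.class);

        private final Class<V> valueType;

        private Rel(Class<V> valueType) {
            this.valueType = valueType;
        }
    }

    private final Map<Object, Object> entries = new HashMap<>();

    /** Compiles only if (key, value) matches the pairing described by the witness. */
    <K, V> void put(Rel<K, V> rel, K key, V value) {
        entries.put(key, value);
    }

    /** Returns the value for the key, or null if it is absent or of another type. */
    <K, V> V get(Rel<K, V> rel, K key) {
        Object value = entries.get(key);
        return rel.valueType.isInstance(value) ? rel.valueType.cast(value) : null;
    }
}
```

With these definitions, `put(Rel.INT_TO_STRING, 1, 2)` is rejected by the compiler, because no witness for an Integer -> Integer pairing exists.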
> 
> Cheers,
> 
> [1] https://github.com/milessabin/shapeless/wiki/Feature-overview:-shapeless-...
> [2] https://issues.jboss.org/browse/ISPN-3640
> 
>> Besides the query API, which would need to be extended to support accessing
>> multiple caches, I'm not sure what other APIs would need to be extended to take
>> advantage of this?
>> 
>>> 
>>> Emmanuel
>>> 
>>> On 14 Jan 2014, at 12:59, Sanne Grinovero <sanne(a)infinispan.org> wrote:
>>> 
>>>> Up this: it was proposed again today at a face-to-face meeting.
>>>> Apparently multiple parties have been asking to be able to run
>>>> cross-cache queries.
>>>> 
>>>> Sanne
>>>> 
>>>> On 11 April 2012 12:47, Emmanuel Bernard <emmanuel(a)hibernate.org> wrote:
>>>>> 
>>>>> On 10 avr. 2012, at 19:10, Sanne Grinovero wrote:
>>>>> 
>>>>>> Hello all,
>>>>>> currently Infinispan Query is an interceptor registering on the
>>>>>> specific Cache instance which has indexing enabled; one such
>>>>>> interceptor does everything it needs to do in the sole scope of the
>>>>>> cache it was registered in.
>>>>>>
>>>>>> If you enable indexing - for example - on 3 different caches, there
>>>>>> will be 3 different Hibernate Search engines started in the background,
>>>>>> and they are all unaware of each other.
>>>>>>
>>>>>> After some design discussions with Ales for CapeDwarf, but also
>>>>>> calling attention to something that has bothered me for some time, I'd
>>>>>> like to evaluate the option of having a single Hibernate Search Engine
>>>>>> registered in the CacheManager, and having it shared across indexed
>>>>>> caches.
>>>>>> 
>>>>>> Current design limitations:
>>>>>> 
>>>>>> A- If they are all configured to use the same base directory to
>>>>>> store indexes, and happen to have same-named indexes, they'll share
>>>>>> the index without being aware of each other. This is going to break
>>>>>> unless the user configures some tricky parameters, and even so
>>>>>> performance won't be great: instances will lock each other out, or at
>>>>>> best write in alternate turns.
>>>>>> B- The search engine isn't particularly "heavy"; still, it would be
>>>>>> nice to share some components and internal services.
>>>>>> C- Configuration details which need some care - like injecting a
>>>>>> JGroups channel for clustering - need to be done right, isolating each
>>>>>> instance (so large parts of the configuration would be quite similar
>>>>>> but not totally equal).
>>>>>> D- Incoming messages into a JGroups Receiver need to be routed not
>>>>>> only among indexes, but also among Engine instances. This prevents
>>>>>> Query from reusing code from Hibernate Search.
>>>>>> 
>>>>>> Problems with a unified Hibernate Search Engine:
>>>>>> 
>>>>>> 1#- Isolation of types / indexes. If the same indexed class is
>>>>>> stored in different (indexed) caches, they'll share the same index. Is
>>>>>> it a problem? I'm tempted to consider this a good thing, but wonder if
>>>>>> it would surprise some users. Would you expect that?
>>>>> 
>>>>> I would not expect that. Uniqueness in Hibernate Search is not defined
>>>>> per identity but per class + provided id.
>>>>> I can see people reusing the same class as a partial DTO and wanting to
>>>>> index that. I can even see people using the Hibernate Search
>>>>> programmatic API to index the "DTO" stored in cache 2 differently than
>>>>> the domain class stored in cache 1.
>>>>> I concede that I am pushing the use case a bit towards bad-ish design
>>>>> approaches.
>>>>> 
>>>>>> 2#- Configuration format overhaul: indexing options won't be set on
>>>>>> the cache section but in the global section. I'm looking forward to
>>>>>> using the schema extensions anyway to provide a better configuration
>>>>>> experience than the current <properties />.
>>>>>> 3#- Assuming 1# is fine, when a search hit is found I'd need to be
>>>>>> able to figure out from which cache the value should be loaded.
>>>>>> 3#A  we could have the cache name encoded in the index, as part
>>>>>> of the identifier: {PK,cacheName}
>>>>>> 3#B  we actually shard the index, keeping a physically separate
>>>>>> index per cache. This would mean searching on the joint index view but
>>>>>> extracting hits from specific indexes to keep track of "which index".
>>>>>> I think we can do that but it's definitely tricky.
>>>>>> 
>>>>>> It's likely easier to keep indexed values from different caches in
>>>>>> different indexes. That would mean rejecting 1# and messing with the
>>>>>> user-defined index name, to add for example the cache name to the
>>>>>> user-defined string.
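To make option 3#A and the index-naming alternative concrete, here is a rough sketch. It assumes nothing about the actual Infinispan Query or Hibernate Search internals: `HitId`, `indexNameFor`, `load` and the "." separator are all hypothetical, and plain `Map`s stand in for the named caches.

```java
import java.util.Map;
import java.util.Objects;

class CacheAwareHits {
    /** Composite identifier {PK, cacheName} along the lines of option 3#A. */
    static final class HitId {
        final String primaryKey;
        final String cacheName;

        HitId(String primaryKey, String cacheName) {
            this.primaryKey = Objects.requireNonNull(primaryKey);
            this.cacheName = Objects.requireNonNull(cacheName);
        }
    }

    /** Alternative: qualify the user-defined index name with the cache name. */
    static String indexNameFor(String cacheName, String userIndexName) {
        return cacheName + "." + userIndexName;
    }

    /** Loads the value for a search hit from the cache named in its identifier. */
    static Object load(HitId hit, Map<String, Map<String, Object>> cachesByName) {
        Map<String, Object> cache = cachesByName.get(hit.cacheName);
        return cache == null ? null : cache.get(hit.primaryKey);
    }
}
```

In the first approach the index is shared and each hit carries enough information to find its cache; in the second, two caches storing the same class end up with distinct index names and never share an index.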
>>>>>> 
>>>>>> Any comment?
>>>>>> 
>>>>>> Cheers,
>>>>>> Sanne
>>>>>> _______________________________________________
>>>>>> infinispan-dev mailing list
>>>>>> infinispan-dev(a)lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>> 
>>>>> 
>>> 
>>> 
>> 
>> Cheers,
>> -- 
>> Mircea Markus
>> Infinispan lead (www.infinispan.org)
>> 
>> 
>> 
>> 
>> 
> 
> 
> --
> Galder Zamarreño
> galder(a)redhat.com
> twitter.com/galderz
> 
> Project Lead, Escalante
> http://escalante.io
> 
> Engineer, Infinispan
> http://infinispan.org
> 
> 

 Cheers,
 -- 
 Mircea Markus
 Infinispan lead (www.infinispan.org)


--
Galder Zamarreño
galder(a)redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org
