[infinispan-dev] Infinispan Query entity discovery (Was: Re: new Infinispan Query API - ISPN-194)

Wed Apr 27 04:10:42 EDT 2011

On 27 Apr 2011, at 08:57, Sanne Grinovero wrote:

> 2011/4/27 Emmanuel Bernard <emmanuel at hibernate.org>:
>> Users can put indexed or non-indexed superclasses in the query target type. That would not work for you, as you can't discover known subtypes without scanning or having a closure of types somewhere.
> 
> Sure, they can with Hibernate Search. But should they be able to with
> Infinispan Query?
> If the answer is yes, then we still need to find an alternative.

Well it's an OO query and thus subtype polymorphism should apply.
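
To illustrate why polymorphic targets need a closure of known types, here is a minimal sketch in plain Java. KnownTypeRegistry and the Animal/Dog/Cat classes are hypothetical names made up for this example; none of them belong to the Infinispan Query or Hibernate Search API. A supertype target can only expand to the subtypes that were registered beforehand:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical registry, for illustration only (not a real Infinispan class).
class KnownTypeRegistry {
    private final Set<Class<?>> knownTypes = new CopyOnWriteArraySet<>();

    // Called whenever a type is first seen, e.g. on a cache put.
    void register(Class<?> type) {
        knownTypes.add(type);
    }

    // Expands a query target to every known type assignable to it.
    List<Class<?>> resolve(Class<?> target) {
        register(target); // the targeted type itself always becomes known
        List<Class<?>> result = new ArrayList<>();
        for (Class<?> candidate : knownTypes) {
            if (target.isAssignableFrom(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }
}

// Hypothetical entity hierarchy used only for the example.
class Animal {}
class Dog extends Animal {}
class Cat extends Animal {}
```

If only Dog has been registered, resolving Animal yields Animal and Dog; Cat is silently left out until something registers it, which is exactly the gap discussed in this thread.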

> 
> 
>> On 26 Apr 2011, at 23:32, Sanne Grinovero <sanne.grinovero at gmail.com> wrote:
>> 
>>> Hello,
>>> I'm forking off this thread, as we never resolved how to cope with the
>>> main issue:
>>> 
>>> how is Infinispan Query going to be aware of which entities are to be
>>> considered as default targets for a Query?
>>> 
>>> the realistic ideas so far:
>>> A) class scanning: seems nobody liked this idea, but I'll still
>>> mention it as the other options aren't looking great either.
>>> B) scan known indexes (need to define what the "known indexes" are as
>>> we usually infer that from the classes)
>>>   -- could enforce a single index
>>> C) have to list all fully qualified class names in the configuration
>>> D) don't care: consider it a good practice to specify all targeted
>>> types when performing a Query.
>>> E) please suggest :)
>>> 
>>> The currently implemented solution is D, as it requires no coding at all :)
>>> Considering its simplicity, I like it more the more I think
>>> about it; I could even polish the approach by adding a single line to
>>> log a warning when the user doesn't respect the best practice, or to
>>> mandate it.
>>> 
>>> Considering that when a Query is invoked specifying the target types
>>> there is no doubt we know the classes, I could add a warning in case
>>> the Query is performed without specifying the type: in that case it
>>> usually implies the query targets all known types, which is always
>>> fine when using Hibernate Search, but could be inconsistent with
>>> Infinispan Query as it might not have discovered all types yet [1], so
>>> a very simple solution is to mandate the type parameter.
>>> 
>>> [1] - when the Cache interceptor hits an event adding a new type, the
>>> Search engine is reconfigured.
>>> 
>>> thoughts?
>>> 
>>> Sanne
>>> 
>>> 
>>> 2011/4/5 Emmanuel Bernard <emmanuel at hibernate.org>:
>>>> 
>>>> On 5 Apr 2011, at 13:38, Sanne Grinovero wrote:
>>>> 
>>>>> 2011/4/5 Emmanuel Bernard <emmanuel at hibernate.org>:
>>>>>> 
>>>>>> On 5 Apr 2011, at 12:20, Galder Zamarreño wrote:
>>>>>> 
>>>>>>> 
>>>>>>> On Apr 4, 2011, at 6:23 PM, Sanne Grinovero wrote:
>>>>>>> 
>>>>>>>> </snip>
>>>>>>>> 
>>>>>>>> there's one catch:
>>>>>>>> when searching for a class type, it will only include results from
>>>>>>>> known subtypes. The targeted type is automatically added to the known
>>>>>>>> classes, but subtypes that may already exist are not discovered.
>>>>>>>> 
>>>>>>>> Bringing this issue to an extreme, if the query is not targeting any
>>>>>>>> type, and no indexed types were added to the grid (even if some exist
>>>>>>>> already as they might have been inserted by other JVMs or previous
>>>>>>>> runs), all queries will return no results.
>>>>>>>> How to solve this?
>>>>>>>> - class scanning?
>>>>>>> 
>>>>>>> Nope, too expensive.
>>>>>>> 
>>>>>>>> - explicitly list indexed entities in Infinispan configuration?
>>>>>>> 
>>>>>>> No
>>>>>>> 
>>>>>>>> - a metadata cache maintaining a distributed&stored copy of known types
>>>>>>> 
>>>>>>> That sounds more appealing. It could be a good middle ground until Search can search for types.
>>>>>> 
>>>>>> Do you have any specific idea in mind?
>>>>>> 
>>>>>> To magically find types:
>>>>>>  - we scan every file system, databases, caches available to the app and look for Lucene metadata => unrealistic
>>>>>>  - there is some kind of convention on where the indexes are and we do index scanning at startup => index scanning is very likely to be slower than class scanning (potential remote access, bigger dataset, etc.)
>>>>>>  - we enforce one or a fixed number of Lucene indexes for all data in Infinispan => not sure that's a good idea but this can be explored
>>>>>>  - we somehow ask the framework using HSearch to fill up classes
>>>>>> 
>>>>>> other approaches?
>>>>> 
>>>>> Why was class scanning discarded in the first answer? As H. Search can
>>>>> auto-discover classes by working on top of JPA entity autodiscovery, I
>>>>> guess that each application node could look into its own known
>>>>> classpath.
>>>>> After all, if some type is not visible to a node because it was added
>>>>> by another node from a different app, that node won't be able to return
>>>>> instances of it either.
>>>>> We could face the opposite problem of building metadata for classes
>>>>> people don't mean to index in this cache.
>>>> 
>>>> Right. Scanning (class or index) will be a bit aggressive and could build unneeded metadata (or even worse, return unexpected classes).
>>
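
For reference, option D from the quoted list (mandating the target types on every query) could look roughly like the sketch below. TargetTypeCheck and requireTargets are hypothetical names chosen for this example, not part of the Infinispan Query API, and throwing an exception is only one possible policy; logging a warning instead would match the softer variant mentioned in the thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only (not a real Infinispan class).
class TargetTypeCheck {
    // Returns the explicit target types, rejecting a query that names none.
    static List<Class<?>> requireTargets(Class<?>... targets) {
        if (targets == null || targets.length == 0) {
            // Without explicit targets the result could depend on which
            // indexed types this node happens to have discovered so far.
            throw new IllegalArgumentException(
                "Query must name at least one target type");
        }
        return Arrays.asList(targets);
    }
}
```

A call such as TargetTypeCheck.requireTargets(String.class) passes, while a call with no arguments is rejected before any search is attempted.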