hardy: did you follow the metadata API changes?
[2:17pm] hardy: most is ready now. just needs to be merged in from my latest pull request
[2:18pm] hardy: https://github.com/hibernate/hibernate-search/pull/444
[2:18pm] jbossbot: git pull req [hibernate-search] (open) hferentschik HSEARCH-436 Part II - the public metadata api https://github.com/hibernate/hibernate-search/pull/444
[2:18pm] jbossbot: jira [HSEARCH-436] Redirected to: https://hibernate.atlassian.net/si/jira.issueviews:issue-xml/HSEARCH-436/HSEARCH-436.xml
[2:18pm] hardy: I would like to follow up with HSEARCH-904
[2:18pm] jbossbot: jira [HSEARCH-904] Redirected to: https://hibernate.atlassian.net/si/jira.issueviews:issue-xml/HSEARCH-904/HSEARCH-904.xml
[2:18pm] emmanuel: hardy: about the template, I was about to point you to http://design.jboss.org
[2:18pm] emmanuel: but there does not seem to be any generic template deck
[2:19pm] emmanuel: hardy: but I can send you the community deck I use
[2:19pm] hardy: ok
[2:19pm] emmanuel: hardy: about the metadata API I have not follow
[2:19pm] emmanuel: ed
[2:19pm] hardy: no problem
[2:19pm] hardy: the questions I have are mainly unrelated
[2:19pm] gmorling_ joined the chat room.
[2:20pm] hardy: just one class might be relevant
[2:20pm] hardy: one sec
[2:20pm] hardy: https://github.com/hferentschik/hibernate-search/blob/3aa3bdf166d13bf3c9933b3b70c88f79af87bc72/engine/src/main/java/org/hibernate/search/metadata/FieldDescriptor.java
[2:21pm] jbossbot: git [hibernate-search] 3aa3bdf.. Hardy Ferentschik HSEARCH-1355 Renaming EntityIndexBinder into EntityIndexBinding...
[2:21pm] jbossbot: jira [HSEARCH-1355] Redirected to: https://hibernate.atlassian.net/si/jira.issueviews:issue-xml/HSEARCH-1355/HSEARCH-1355.xml
[2:21pm] hardy: this is the public metadata interface for a single field
[2:21pm] hardy: thanks for the template
[2:22pm] hardy: the FieldDescriptor is accessed via IndexedTypeDescriptor
[2:22pm] hardy: https://github.com/hferentschik/hibernate-search/tree/3aa3bdf166d13bf3c9933b3b70c88f79af87bc72/engine/src/main/java/org/hibernate/search/metadata
[2:22pm] hardy: so far, so good
[2:22pm] hardy: all is implemented
[2:22pm] gmorling left the chat room. (Ping timeout: 256 seconds)
[2:22pm] gmorling_ is now known as gmorling.
[2:22pm] hardy: what's missing is HSEARCH-904
[2:22pm] jbossbot: jira [HSEARCH-904] Redirected to: https://hibernate.atlassian.net/si/jira.issueviews:issue-xml/HSEARCH-904/HSEARCH-904.xml
[2:22pm] hardy: but I am not so sure how much this is needed still
[2:23pm] hardy: btw, just interrupt me if I "talk" too fast
[2:23pm] hardy: a lot of context
[2:23pm] emmanuel: IndexedTypeDescriptor is for what, a property?
[2:23pm] emmanuel: or a Java type?
[2:23pm] hardy: #904 is about the possibility of Bridges to report which fields they are adding
[2:24pm] hardy: for the whole type
[2:24pm] hardy: mind you these classes are the public interfaces
[2:24pm] hardy: there is a set of internal metadata classes
[2:25pm] hardy: which contain the runtime configured metadata
[2:25pm] hardy: basically the parallel array thing refactored
[2:26pm] hardy: https://github.com/hibernate/hibernate-search/tree/master/engine/src/main/java/org/hibernate/search/engine/metadata/impl
[2:27pm] emmanuel: ok
[2:27pm] hardy: right now the public API reflects what we can know
[2:28pm] hardy: obviously there could be e.g. class bridges which add other fields we don't know about
[2:28pm] hardy: that's where it ties into HSEARCH-904
[2:28pm] jbossbot: jira [HSEARCH-904] Redirected to: https://hibernate.atlassian.net/si/jira.issueviews:issue-xml/HSEARCH-904/HSEARCH-904.xml
[2:28pm] hardy: if the bridges would report which fields they are adding, it could also be exposed in the metadata api
[2:29pm] hardy: and then there is the optimisation point of view for this issue
[2:29pm] hardy: but tbh I am not so sure how useful this information really would be for us in terms of further optimizations
[2:29pm] emmanuel: BTW for me to understand why do you need a FieldDescriptor.isId
[2:30pm] emmanuel: it's not a notion present in Lucene AFAIR
[2:30pm] hardy: good point
[2:30pm] hardy: it is our document id
[2:30pm] hardy: I had a todo item on whether all fields should be returned
[2:30pm] hardy: including the id field
[2:31pm] hardy: sanne thought it would be good to return all
[2:31pm] emmanuel: I remember that todo / question
[2:31pm] hardy: right
[2:31pm] emmanuel: yes
[2:31pm] hardy: and then the isId is a way to determine whether it is the document id
[2:31pm] hardy: but I see why this is confusing
[2:31pm] emmanuel: something surprises me a bit
[2:31pm] hardy: maybe it is not needed!?
[2:31pm] emmanuel: there is no notion of object and property then in this metamodel
[2:32pm] hardy: in the public api no
[2:32pm] emmanuel: ie you say I want index A
[2:32pm] emmanuel: and then you navigate the Lucene "schema"
[2:32pm] hardy: there was a todo for that as well
[2:32pm] hardy: I have the information
[2:32pm] emmanuel: but you don't publicly link this schema to the object model
[2:32pm] emmanuel: ok
[2:32pm] hardy: the question is do we want to expose it
[2:33pm] hardy: the internal metamodel is "keyed" against properties
[2:33pm] emmanuel: is that useful still in this flat structure approach
[2:33pm] hardy: I guess it depends where we see the use cases for this public API
[2:33pm] emmanuel: I imagine it helps to write pure Lucene queries
[2:34pm] emmanuel: right
[2:34pm] hardy: for pure Lucene queries you are interested in actual field names
[2:34pm] emmanuel: it could very well be that we need both pure and the "keyed one" as you say
[2:34pm] hardy: it would also allow you to create some smart query parser which maybe suggests field names
[2:34pm] emmanuel: I know gmorling probably needs the keyed one
[2:34pm] emmanuel: when it does JP-QL to Lucene
[2:35pm] hardy: I guess each FieldDescriptor could have some sort of source
[2:35pm] hardy: or maybe SourceDescriptor
[2:35pm] emmanuel: hum not sure
[2:36pm] emmanuel: I mean you navigate the other way around
[2:36pm] hardy: do you?
[2:36pm] emmanuel: no from field to property when you build a query
[2:36pm] emmanuel: s/no/not/
[2:36pm] emmanuel: you want to target property User.name
[2:36pm] emmanuel: and from there you want to know what's available to you
[2:36pm] emmanuel: name, name_facet etc
[2:37pm] emmanuel: (as field)
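The two navigation directions discussed here (property to fields when building a query, field to "source" for a document-centric view) can be illustrated with a minimal sketch. Everything in it is hypothetical: `PropertyKeyedFieldIndex`, `register`, `fieldsFor` and `sourceOf` are illustration names, not part of the Hibernate Search metadata API, and properties and fields are reduced to plain strings.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: a tiny metamodel "keyed" against properties.
// None of these names exist in Hibernate Search.
final class PropertyKeyedFieldIndex {

    // property name -> names of the index fields generated from it, in registration order
    private final Map<String, Set<String>> fieldsByProperty = new LinkedHashMap<>();

    void register(String propertyName, String fieldName) {
        fieldsByProperty
                .computeIfAbsent(propertyName, key -> new LinkedHashSet<>())
                .add(fieldName);
    }

    // Property -> fields: the direction needed when a query targets e.g. User.name.
    Set<String> fieldsFor(String propertyName) {
        Set<String> fields = fieldsByProperty.get(propertyName);
        return fields == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(fields);
    }

    // Field -> property: the "source" direction, useful for a document-centric view.
    Optional<String> sourceOf(String fieldName) {
        for (Map.Entry<String, Set<String>> entry : fieldsByProperty.entrySet()) {
            if (entry.getValue().contains(fieldName)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }
}
```

Registering `name` and `name_facet` under the property `name` lets a caller ask either "which fields does this property produce?" or "where does this field come from?". Which of the two the public API should favour is exactly the open question in the discussion.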
[2:37pm] hardy: maybe
[2:37pm] hardy: this is definitely still open for discussion
[2:37pm] emmanuel: right
[2:38pm] emmanuel: it looks like as soon as sanne is back we should do an IRC meeting and discuss that
[2:38pm] emmanuel: mon or tues
[2:38pm] hardy: and how would we "reference" the properties? as java.lang.reflect.Member instances?
[2:38pm] hardy: +1
[2:38pm] emmanuel: hardy: ahh well that's the big question
[2:39pm] hardy: as you can see the internal metadata is basically structured the way you suggest
[2:39pm] emmanuel: I had in mind something like BV but I am biased
[2:39pm] hardy: but I had a more Document centric approach for the public api
[2:39pm] hardy: also in terms of say a Solr integration
[2:39pm] hardy: really interesting in this case are the actual fields
[2:40pm] emmanuel: depends what you call Solr integration
[2:40pm] emmanuel: ah well yes in this case form field to "source" makes sense
[2:40pm] emmanuel: s/form/from/
[2:41pm] emmanuel: back to the original question, there are a few situations
[2:41pm] emmanuel: one where you generate a static set of fields but each one can have analyzer / store etc
[2:42pm] hardy: which is the "original" question?
[2:42pm] emmanuel: one where you generate the fields dynamically depending on the value or even the weather
[2:42pm] emmanuel: in this case listing them in the metamodel is doomed
[2:42pm] hardy: sure, if the fields are that dynamic
[2:42pm] emmanuel: a map where keys represent the field name is the example I have in mind
[2:42pm] hardy: but, often you know the field names
[2:43pm] emmanuel: yes in many cases it is the proposed field name indeed
[2:43pm] hardy: even if you add new ones
[2:43pm] hardy: you know it is for example
[2:43pm] emmanuel: a 1-1 binding between the fieldbridge and the Lucene field
[2:43pm] hardy: text_en, text_de, etc
[2:44pm] hardy: you cannot generate completely random names
[2:44pm] emmanuel: assuming you will never ever accept french, that's true
[2:44pm] hardy: at some stage you need to target these fields in a search
[2:44pm] emmanuel: that's another category where you have a pattern
[2:44pm] emmanuel: text_*
[2:44pm] hardy: some sort of fixed list of patterns you need
[2:45pm] emmanuel: not sure we want to model this info though
[2:45pm] hardy: no, not me either
[2:45pm] emmanuel: anyways, it looks orthogonal enough to start the metamodel without it
[2:45pm] hardy: and yes, this would be a best effort approach to cover some of the cases which escape the current api
[2:45pm] hardy: +1
[2:46pm] hardy: there are a few more questions around this though
[2:46pm] hardy: e.g., the issue suggests something like
[2:46pm] hardy: public interface FieldNameReportingBridge {
[2:46pm] hardy: Iterable<String> getGeneratedFieldNames(String baseFieldName);
[2:46pm] hardy: }
[2:46pm] hardy: it only returns field names
[2:47pm] hardy: this is sub-optimal for us
[2:47pm] emmanuel: yes I see that now
[2:47pm] emmanuel: you lose the type of field etc
[2:47pm] hardy: to properly report the metadata we would also need the Lucene options
[2:47pm] hardy: right
[2:47pm] hardy: first I thought I can use LuceneOptions
[2:47pm] hardy: but this is not possible
[2:47pm] emmanuel: you should update the issue with your input
[2:48pm] hardy: in fact LuceneOptions is a thorn in my side anyways
[2:48pm] emmanuel: It's more that we did not think much about it
[2:48pm] hardy: will do
[2:48pm] hardy: it used to contain just options
[2:49pm] hardy: but it became then an actual interface with methods to implement, which is confusing given its name
[2:49pm] hardy: it would be nice to get rid of LuceneOptions in the bridge api
[2:49pm] hardy: that is probably only possible for Search 5 though
[2:50pm] hardy: it might also solve Sanne's issues that we create LuceneOptions for each field
[2:50pm] hardy: including a new Lucene Field instance
[2:50pm] hardy: according to Sanne we should reuse the Fieldable instance
[2:50pm] hardy: again, that's on a tangent
[2:50pm] emmanuel: but LuceneOptions is used
[2:51pm] emmanuel: you would replace it with what?
[2:51pm] emmanuel: the raw Lucene calls are insane
[2:51pm] emmanuel: ah
[2:51pm] emmanuel: right the object reuse mandated by Lucene next
[2:51pm] hardy: well, first we could do the initialise bridge approach
[2:51pm] hardy: where we pass in the actual options (might be LuceneOptions)
[2:52pm] hardy: making FieldBridges in fact stateful
[2:52pm] hardy: and then we offer a helper class to handle the Lucene calls
[2:52pm] emmanuel: but my list of fields might be dynamic
[2:52pm] emmanuel: so passing the Fieldable won't be enough for example
[2:52pm] hardy: you are adding fields via this helper
[2:52pm] emmanuel: I need to be able to create new ones in *some* cases
[2:53pm] hardy: or we provide an IndexContext
[2:53pm] hardy: sure
[2:53pm] hardy: I get that
[2:53pm] hardy: my problem is with the name of the class and that it combines two distinct things
[2:53pm] hardy: also if you look into the implementation, we could have provided the same functionality in a static helper class
[2:53pm] hardy: leaving LuceneOptions as it was
[2:54pm] emmanuel: yes it was for the sake of not breaking things
[2:54pm] hardy: right
[2:54pm] hardy: and it was sub-optimal from the beginning
[2:54pm] hardy: I think it is time to rectify that
[2:55pm] hardy: and as said, using stateful bridges might help with this Lucene rubber stamp approach
[2:55pm] hardy: and whether we offer a static helper class or an IndexContext which is passed to the set method is a different issue
[2:55pm] hardy: I think we need to also discuss this with Sanne
[2:55pm] emmanuel: it is very tied with the Lucene 4 migration
[2:56pm] hardy: he also has the best idea on where this is going in relation to Lucene 4
[2:56pm] hardy: +1
[2:56pm] emmanuel: and if we have to break things anyways, yes that looks like a reasonable option
[2:56pm] hardy: so, back to the issue
[2:56pm] emmanuel: as long as we offer some migration tips
[2:56pm] hardy: I cannot just return strings
[2:56pm] hardy: LuceneOptions is not viable either
[2:57pm] hardy: then I was wondering whether I could return FieldDescriptors
[2:57pm] hardy: you would do something like
[2:57pm] hardy: Set<FieldDescriptor> getGeneratedFields(FieldDescriptor baseFieldDescriptor);
[2:58pm] hardy: you pass the FieldDescriptor as you generate it just from annotations (with the base name and the appropriate options)
[2:59pm] hardy: for our built-in bridges and for many custom bridges you would then just stick this descriptor into a set and return it
[2:59pm] hardy: but you can also create new ones depending on what your bridge does
[2:59pm] emmanuel: hardy: you know what is sad. With a high enough API (say the action methods of LuceneOptions), we know what the bridge creates (name, type, whether it is stored, etc)
[2:59pm] hardy: to generate new ones we could offer a FieldDescriptorFactory
[3:00pm] emmanuel: and I always found it sad that we had to ask this data again statically with getGeneratedFields
[3:00pm] emmanuel: but we do need this data outside the actual field creation
[3:01pm] emmanuel: so the extra method looks necessary
[3:01pm] hardy: right
[3:01pm] hardy: but what do you think about the FieldDescriptor thingy
[3:01pm] emmanuel: your proposal makes sense
[3:02pm] hardy: what feels strange to me is to use a metadata class here
[3:02pm] emmanuel: returning an empty set might mean dynamic
[3:02pm] hardy: right
[3:02pm] emmanuel: or maybe null whatever
[3:02pm] emmanuel: and the default impl would return a Set with baseFieldDescriptor
[3:02pm] hardy: right
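The proposal in this part of the discussion (return descriptors instead of bare names, default to the base descriptor, empty set for fully dynamic bridges) could look roughly like the sketch below. It is only an illustration under stated assumptions: `SketchFieldDescriptor` is a simplified stand-in for the real `FieldDescriptor`, `FieldReportingBridge` and the three bridge classes are hypothetical names, and the default method relies on Java 8, which is an assumption of this sketch rather than something the discussion settles.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for FieldDescriptor: a name plus one Lucene option.
final class SketchFieldDescriptor {
    private final String name;
    private final boolean stored;

    SketchFieldDescriptor(String name, boolean stored) {
        this.name = name;
        this.stored = stored;
    }

    String getName() { return name; }

    boolean isStored() { return stored; }

    // Copies the options of this descriptor onto a new field name.
    SketchFieldDescriptor withName(String newName) {
        return new SketchFieldDescriptor(newName, stored);
    }
}

// Hypothetical reporting contract modelled on the getGeneratedFields idea.
interface FieldReportingBridge {
    // Default: a 1-1 binding, so the bridge reports exactly the base descriptor.
    default Set<SketchFieldDescriptor> getGeneratedFields(SketchFieldDescriptor base) {
        return Collections.singleton(base);
    }
}

// A bridge that writes a single field under the proposed name keeps the default.
final class SimpleBridge implements FieldReportingBridge {
}

// A bridge that writes one field per language reports each of them with the base options.
final class PerLanguageBridge implements FieldReportingBridge {
    private static final String[] LANGUAGES = {"en", "de"};

    @Override
    public Set<SketchFieldDescriptor> getGeneratedFields(SketchFieldDescriptor base) {
        Set<SketchFieldDescriptor> generated = new LinkedHashSet<>();
        for (String language : LANGUAGES) {
            generated.add(base.withName(base.getName() + "_" + language));
        }
        return generated;
    }
}

// A bridge whose field names depend on runtime values signals "dynamic" with an empty set.
final class DynamicMapBridge implements FieldReportingBridge {
    @Override
    public Set<SketchFieldDescriptor> getGeneratedFields(SketchFieldDescriptor base) {
        return Collections.emptySet();
    }
}
```

Unlike a method returning only field names, each reported descriptor carries its options along, which is the information the metadata API would need to expose.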
[3:03pm] hardy: there is one thing which I am not so happy about though
[3:03pm] hardy: right now, the FieldDescriptor contains getFieldBridge and getAnalyzer
[3:04pm] hardy: the rest of the information is based on the name and the lucene options
[3:04pm] hardy: not sure what to do with the other two
[3:04pm] hardy: and whether we really need them in the public API
[3:04pm] hardy: I guess getFieldBridge could return "this"
[3:05pm] hardy: if you really create new FieldDescriptors in your custom bridge, the bridge doing so is of course known
[3:05pm] emmanuel: today can FB use a custom analyzer?
[3:05pm] hardy: right analysers are the problem
[3:05pm] emmanuel: in a way you have two notions
[3:05pm] hardy: there is no link between them and the bridge
[3:05pm] emmanuel: FieldInfoDescriptor
[3:05pm] emmanuel: and FieldDescriptor
[3:06pm] hardy: even though we had feature requests to access analysers in field bridges
[3:06pm] emmanuel: the former does not have analyzer and fieldBridge
[3:06pm] hardy: right, that was a thought of mine as well
[3:06pm] hardy: one could split up FieldDescriptor
[3:06pm] hardy: into pure Lucene Document field info and the rest
[3:06pm] emmanuel: split or superclassing
[3:07pm] hardy: the field bridge method would then return a set of FieldInfoDescriptors
[3:07pm] hardy: sure
[3:07pm] hardy: actually the more I think about it, the more I like it
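The "split or superclassing" idea can be sketched as two interfaces, with the pure Lucene field information as the supertype. This is only an illustration: `getFieldBridge` and `getAnalyzer` are the accessors mentioned in the discussion, while `getName`, `isStored` and both `Basic*` classes are assumptions, and the bridge and analyzer are typed as `Object` so the sketch needs no Hibernate Search or Lucene classes.

```java
// Hypothetical: pure Lucene Document field info, the part a bridge could report.
interface FieldInfoDescriptor {
    String getName();      // assumed accessor

    boolean isStored();    // assumed accessor standing in for the Lucene options
}

// Hypothetical: the full descriptor adds what only the engine knows.
interface FieldDescriptor extends FieldInfoDescriptor {
    Object getFieldBridge();   // typed as Object to keep the sketch dependency-free

    Object getAnalyzer();      // typed as Object to keep the sketch dependency-free
}

// Minimal immutable implementation of the info part.
final class BasicFieldInfo implements FieldInfoDescriptor {
    private final String name;
    private final boolean stored;

    BasicFieldInfo(String name, boolean stored) {
        this.name = name;
        this.stored = stored;
    }

    @Override
    public String getName() { return name; }

    @Override
    public boolean isStored() { return stored; }
}

// The engine could wrap a reported info object and attach bridge and analyzer itself.
final class BasicFieldDescriptor implements FieldDescriptor {
    private final FieldInfoDescriptor info;
    private final Object fieldBridge;
    private final Object analyzer;

    BasicFieldDescriptor(FieldInfoDescriptor info, Object fieldBridge, Object analyzer) {
        this.info = info;
        this.fieldBridge = fieldBridge;
        this.analyzer = analyzer;
    }

    @Override
    public String getName() { return info.getName(); }

    @Override
    public boolean isStored() { return info.isStored(); }

    @Override
    public Object getFieldBridge() { return fieldBridge; }

    @Override
    public Object getAnalyzer() { return analyzer; }
}
```

With this shape a bridge only ever has to produce `FieldInfoDescriptor` instances, which sidesteps the problem that a bridge has no link to the analyzer.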
[3:08pm] hardy: hmm, I got some more ideas now
[3:08pm] hardy: to sum things up a little
[3:09pm] hardy: #1 we need to discuss whether the public metadata api should expose property data (aka which property creates the field)
[3:10pm] hardy: #2 if so, we need to decide how to add this information to the API. Use a PropertyDescriptor (having a name and access type I guess)? Where to add the info (as part of the FieldDescriptor or more type centric where you navigate type -> property -> field)
[3:11pm] hardy: #3 Regarding #904, returning just a set of field names is not sufficient. We really also need the Lucene specific options, aka a set of FieldDescriptors
[3:12pm] hardy: #4 FieldDescriptor should potentially be split up into FieldInfoDescriptor and "rest"
[3:13pm] emmanuel: back (got caught by the mkt)
[3:13pm] hardy: #5 LuceneOptions is sub-optimal and we might consider removing it for Search 5. Maybe making bridges stateful!? Need to discuss with Sanne regarding the new Lucene 4 way of creating fields
[3:14pm] emmanuel: good sum up
[3:14pm] emmanuel: about 5
[3:14pm] emmanuel: what's suboptimal really is the non action methods right?
[3:14pm] emmanuel: ie the state like Compress and co
[3:14pm] hardy: right, the mixing of the two
[3:15pm] emmanuel: then we agree
[3:15pm] hardy: if you remove these it is the name which bugs me
[3:15pm] emmanuel: and yes 5b. consider making bridges stateful to reuse instances à la Lucene 4
[3:15pm] hardy: then we have two methods addFieldToDocument and addNumericFieldToDocument
[3:15pm] hardy: but the class is called LuceneOptions
[3:15pm] emmanuel: ok
[3:15pm] hardy: in this case we need at least a rename to IndexContext
[3:16pm] hardy: indexContext#addFieldToDocument makes so much more sense
[3:16pm] hardy: luceneOptions#addFieldToDocument just keeps you wondering
[3:16pm] hardy: how can options do anything
[3:17pm] hardy: anyways, thanks for the discussion
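The naming point from the end of the discussion, separating the state (options) from the action methods and calling the latter `IndexContext`, might be sketched as follows. All of it is an assumption made for illustration: `FieldOptions` and `MapBackedIndexContext` are hypothetical, the two method names are taken from the chat but their parameter lists are simplified, and a plain `Map` stands in for the Lucene `Document` so the sketch runs without Lucene on the classpath.

```java
import java.util.Map;

// Hypothetical: the pure "options" half, read-only state with no behaviour.
final class FieldOptions {
    private final boolean stored;
    private final float boost;

    FieldOptions(boolean stored, float boost) {
        this.stored = stored;
        this.boost = boost;
    }

    boolean isStored() { return stored; }

    float getBoost() { return boost; }
}

// Hypothetical: the "action" half. A Map stands in for the Lucene Document.
interface IndexContext {
    FieldOptions getOptions();

    void addFieldToDocument(String fieldName, String value, Map<String, Object> document);

    void addNumericFieldToDocument(String fieldName, Number value, Map<String, Object> document);
}

// Simplified implementation: null values are skipped, everything else is written as-is.
final class MapBackedIndexContext implements IndexContext {
    private final FieldOptions options;

    MapBackedIndexContext(FieldOptions options) {
        this.options = options;
    }

    @Override
    public FieldOptions getOptions() { return options; }

    @Override
    public void addFieldToDocument(String fieldName, String value, Map<String, Object> document) {
        if (value != null) {
            document.put(fieldName, value);
        }
    }

    @Override
    public void addNumericFieldToDocument(String fieldName, Number value, Map<String, Object> document) {
        if (value != null) {
            document.put(fieldName, value);
        }
    }
}
```

A call such as `indexContext.addFieldToDocument("name", value, document)` then reads as an action on a context, while the options stay a plain value object.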
Discussion between Emmanuel Bernard and Hardy Ferentschik on IRC, as logged above.