> With these new types, backwards compatibility is a non-issue. So unless
> someone makes a strong case for needing these as String in the index,
> what about we drop some complexity?

Elasticsearch uses Strings for transferring dates in JSON structures (see
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-d...).
So for that backend we'll need String-mapping field bridges (and we'd
even have to ignore, override, or flag as an error the user's setting for
numeric mapping).
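
To make that concrete, here is a minimal sketch of the kind of String conversion such a bridge would have to perform. It assumes the target field accepts ISO-8601 strings, which I believe is what Elasticsearch expects by default; the class and method names are made up for illustration and are not an existing Hibernate Search or Elasticsearch API.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Hibernate Search or Elasticsearch.
final class EsDateStrings {

    // Renders a point in time as an ISO-8601 string in UTC.
    static String fromInstant(Instant instant) {
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }

    // Renders a calendar date as yyyy-MM-dd, with no time zone involved.
    static String fromLocalDate(LocalDate date) {
        return DateTimeFormatter.ISO_LOCAL_DATE.format(date);
    }

    public static void main(String[] args) {
        System.out.println(fromInstant(Instant.ofEpochMilli(0L)));
        System.out.println(fromLocalDate(LocalDate.of(2015, 8, 10)));
    }
}
```

Whatever the user configured for the numeric encoding would have no effect on this conversion, which is why that setting would need to be ignored or rejected for this backend.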
2015-08-10 12:37 GMT+02:00 Sanne Grinovero <sanne(a)hibernate.org>:
> On 10 August 2015 at 11:04, Hardy Ferentschik <hardy(a)hibernate.org> wrote:
>> Hi,
>>
>> Sorry, I am late to the game, but here are some more thoughts on this.
>>
>> I think the consensus so far is that
>>
>> # Date/time types which represent an instant in time are treated as usual.
>> They can be string encoded (per default yyyyMMddHHmmssSSS) or numerically
>> in which case the numeric long value equals the epoch time of the represented
>> date.
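
For reference, the two encodings of an instant described above can be sketched with java.time as follows. This is only an illustration: the existing bridges rely on Lucene's DateTools rather than on this code, the class name is invented, and UTC is assumed for the string form.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative only: not the actual Hibernate Search bridge code.
final class InstantEncodings {

    // String form at millisecond resolution, rendered in UTC (assumption).
    private static final DateTimeFormatter STRING_FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS").withZone(ZoneOffset.UTC);

    // String encoding: the digits sort lexicographically in time order.
    static String asString(Instant instant) {
        return STRING_FORMAT.format(instant);
    }

    // Numeric encoding: milliseconds elapsed since the epoch.
    static long asNumber(Instant instant) {
        return instant.toEpochMilli();
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2015-08-10T10:37:00.123Z");
        System.out.println(asString(instant));
        System.out.println(asNumber(instant));
    }
}
```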
>
> Correct, that's the consensus so far. I'd like to challenge one more
> detail though:
> does it still make sense to allow string encoding?
>
> I think not, we did allow it primarily because a long time ago that
> was the only way, then it became one of the options - but still the
> default - and more recently it became the non-default way.
>
> With these new types, backwards compatibility is a non-issue. So unless
> someone makes a strong case for needing these as String in the index,
> what about we drop some complexity?
>
> Remember:
> - Hibernate Search is not an Objects/index mapper so we're not aiming
> at creating any index schema possible, we're aiming at taking
> advantage of the index for practical purposes ("I want it to be a
> string in the index" is not a valid argument - use your own
> fieldbridge in case)
> - With Projections we have to re-transform things back into their
> Java original type, so how we encode things in the index is irrelevant
> from a semantics point of view; I think the only valid challenge would
> need to come from a performance or storage space perspective, in both
> cases I'm pretty sure the numeric encoding would win.
>
>> # Date/time types which do not represent an instant in time can also be
>> encoded as string or number, but in the latter case the numeric representation
>> is given by interpreting the string representation as number.
>>
>> So far so good. There are a couple more things to think about.
>>
>> # Query time gets interesting and I think we need to improve the DSL in unison
>> with adding support for these new types. Check out this example from DSLTest [1]
>>
>> query = monthQb
>>     .range()
>>         .onField( "estimatedCreation" )
>>             .ignoreFieldBridge()
>>         .andField( "justfortest" )
>>             .ignoreFieldBridge().ignoreAnalyzer()
>>         .from( DateTools.round( from, DateTools.Resolution.MINUTE ) )
>>         .to( DateTools.round( to, DateTools.Resolution.MINUTE ) )
>>         .excludeLimit()
>>         .createQuery();
>>
>> If a date is numerically encoded you need to specify numbers for the from
>> and to values. ATM, we recommend using the Lucene-specific DateTools to get
>> the numeric representation. With the support of the new date types things
>> will get confusing for the user. How does one "create" the numeric
>> representation of a LocalDate (and how does one know what it looks like in
>> the first place and how it differs from the epoch time)?
>
> Great point, we should accept the user's domain type exclusively and
> take the conversion burden from the user; especially since we know the
> correct conversion strategy.
>
>> We have been discussing before whether Hibernate Search needs to offer its
>> own version of DateTools. I think it would be time to do so and include
>> helpers for the new date/time types. This also reduces the exposure to
>> Lucene specific types.
>
> +1 to encapsulate it, but I don't expect people to need it at all in
> the above case? But good for other more advanced needs.
>
>>
>> Even better though would be if we were able to directly support the use of
>> date types in the from and to clauses. It would be the responsibility of the
>> DSL to round the specified types to the appropriate level based on the field's
>> configuration/metadata. Even in this scenario though a Search specific
>> DateTools might be necessary for the cases where the date specified in
>> to/from needs to be rounded differently than the field itself.
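
As a rough idea of what such a Search-specific helper could offer for the new types, here is a sketch that rounds a LocalDateTime down to a given resolution. Everything in it is hypothetical: the class name is invented, and the nested Resolution enum is only a local stand-in that mirrors, as far as I recall, the values of org.hibernate.search.annotations.Resolution.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical java.time counterpart of DateTools.round(); not an existing API.
final class JavaTimeRounding {

    // Local stand-in for org.hibernate.search.annotations.Resolution.
    enum Resolution { YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND }

    // Rounds the value down to the start of the requested unit.
    static LocalDateTime round(LocalDateTime value, Resolution resolution) {
        switch (resolution) {
            case YEAR:
                return value.truncatedTo(ChronoUnit.DAYS).withDayOfYear(1);
            case MONTH:
                return value.truncatedTo(ChronoUnit.DAYS).withDayOfMonth(1);
            case DAY:
                return value.truncatedTo(ChronoUnit.DAYS);
            case HOUR:
                return value.truncatedTo(ChronoUnit.HOURS);
            case MINUTE:
                return value.truncatedTo(ChronoUnit.MINUTES);
            case SECOND:
                return value.truncatedTo(ChronoUnit.SECONDS);
            case MILLISECOND:
                return value.truncatedTo(ChronoUnit.MILLIS);
            default:
                throw new IllegalArgumentException("Unsupported resolution: " + resolution);
        }
    }

    public static void main(String[] args) {
        LocalDateTime value = LocalDateTime.of(2015, 8, 10, 12, 37, 45, 123_000_000);
        System.out.println(round(value, Resolution.MINUTE));
        System.out.println(round(value, Resolution.MONTH));
    }
}
```

If the DSL applied this itself based on the field's metadata, the user would only pass the plain LocalDateTime as the from/to bound.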
>
> +1
>
>> Last but not least, the documentation needs to be updated. At the moment, the
>> docs are silent about all the complexity around dates. With the support of the
>> new types, the docs need to be more explicit and describe the subtleties at play.
>
> +1 created HSEARCH-1958
>
> Thanks,
> Sanne
>
>
>>
>> --Hardy
>>
>>
>> On Wed, Aug 05, 2015 at 05:40:16PM +0100, Sanne Grinovero wrote:
>>> On 5 August 2015 at 17:22, Davide D'Alto <davide(a)hibernate.org> wrote:
>>> >> Proposal: use numeric but still - rather than taking the milliseconds
>>> >> from epoch, take the resulting number from YYYYMMDD ?
>>> >
>>> > I don't think I understand what you mean with "the resulting number from
>>> > YYYYMMDD".
>>> > Wouldn't it be similar to getting the number of days from epoch?
>>>
>>> No because epoch is a specific moment *with a timezone*. If you take a
>>> calendar date "here", and take the moment in time which represents
>>> your beginning of the calendar date, the distance from epoch is not a
>>> whole number and you'd have to apply rounding which is timezone
>>> specific.
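
A small sketch of that effect, using fixed UTC offsets so it does not depend on any time zone database. The class and method names are invented for the example.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Illustration only: the same calendar date starts at a different instant per zone.
final class EpochDistanceDemo {

    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Distance from epoch, in milliseconds, of the start of the date in the given zone.
    static long startOfDayMillis(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2015, 8, 5);
        long utc = startOfDayMillis(date, ZoneOffset.UTC);
        long plusTwo = startOfDayMillis(date, ZoneOffset.ofHours(2));
        // Only the UTC value is a whole number of days from epoch.
        System.out.println(utc % MILLIS_PER_DAY);
        System.out.println(plusTwo % MILLIS_PER_DAY);
    }
}
```

At UTC+2 the day starts two hours earlier on the time line, so the distance from epoch is no longer a multiple of a day and any "days from epoch" figure depends on which zone was used for rounding.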
>>>
>>> By simply encoding the number in the above format, you'd encode today
>>> as the number "20150805".
>>> That's a whole number which avoids the timezone relativity and can be
>>> efficiently encoded in numeric form, and provides the expected sorting
>>> properties.
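
A minimal sketch of that YYYYMMDD numeric encoding, assuming years in the range 0 to 9999 (anything outside that would need extra care). The class name is hypothetical and this is not an existing bridge.

```java
import java.time.LocalDate;

// Hypothetical encoder for the YYYYMMDD proposal; assumes 0 <= year <= 9999.
final class LocalDateNumericEncoding {

    // 2015-08-05 becomes the whole number 20150805; no time zone is involved.
    static int encode(LocalDate date) {
        return date.getYear() * 10_000 + date.getMonthValue() * 100 + date.getDayOfMonth();
    }

    // Reverses encode(), e.g. when converting a projection back to the Java type.
    static LocalDate decode(int value) {
        return LocalDate.of(value / 10_000, (value / 100) % 100, value % 100);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2015, 8, 5);
        int encoded = encode(date);
        System.out.println(encoded);
        System.out.println(decode(encoded));
    }
}
```

Numeric order matches calendar order because the year occupies the most significant digits, followed by month and day.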
>>>
>>> >
>>> > But basically, you are saying that I can use different numeric encodings
>>> > for different types. Is that right?
>>>
>>> Yes, you definitely need different encodings depending on the type and
>>> the used options.
>>>
>>> > So, for example:
>>> >
>>> > java.util.Date, java.util.Calendar, java.time.Instant and
>>> > java.time.LocalDateTime will use the number of milliseconds from epoch
>>> > java.time.LocalDate: number of days from epoch
>>>
>>> Except this one ^ I agree with the others.
>>>
>>> > java.time.LocalTime: number of nanos in a day
>>>
>>> Conceptually, yes.. but we don't have "nanoseconds" as an option of
>>> org.hibernate.search.annotations.Resolution. Should we add it?
>>> We would not be able to apply that Resolution on old fashioned
>>> Date/Calendar, so that would need a warning or even an exception when
>>> applied to old style value types.
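
For the LocalTime case, a sketch of the nanos-of-day encoding and of what truncating to a coarser resolution would look like. The class and method names are invented; only the java.time calls are real.

```java
import java.time.LocalTime;
import java.time.temporal.ChronoUnit;

// Hypothetical encoder for LocalTime as "nanoseconds since midnight".
final class LocalTimeNanoEncoding {

    // Full precision: a value between 0 and 86_399_999_999_999.
    static long encode(LocalTime time) {
        return time.toNanoOfDay();
    }

    static LocalTime decode(long nanoOfDay) {
        return LocalTime.ofNanoOfDay(nanoOfDay);
    }

    // What a MILLISECOND resolution would keep: sub-millisecond digits are dropped.
    static long encodeAtMillis(LocalTime time) {
        return time.truncatedTo(ChronoUnit.MILLIS).toNanoOfDay();
    }

    public static void main(String[] args) {
        LocalTime time = LocalTime.of(12, 37, 0, 1_999_999);
        System.out.println(encode(time));
        System.out.println(encodeAtMillis(time));
    }
}
```

java.util.Date and Calendar only carry milliseconds, which is why a nanosecond resolution could not be honoured for them.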
>>>
>>> >> Ok that works but why write all those zeros in the index, when you can
>>> >> just write the date. I realize storage is cheap, but still we need to
>>> >> be careful as the index size affects performance ;-)
>>> >
>>> > I don't think we need to store the 0s.
>>> > If I know the type of the field I already know that the time is 0.
>>>
>>> Exactly
>>>
>>> > Am I missing something?
>>>
>>> I probably just misunderstood your proposal, since previously you
>>> mentioned: "I would just consider a LocalDate the same as a
>>> LocalDateTime with time 00:00:000 (UTC time zone)".
>>> If you have to write the days only you don't need to convert to a time
>>> first.
>>> This misunderstanding might be related with the fact that you were
>>> planning to encode as distance from epoch.. see my first comment on
>>> this same email.
>>> Since you don't want to look at distance from epoch for this case, the
>>> time component really is irrelevant and LocalDate has all the
>>> information you need.. simpler ;)
>>>
>>> Sanne
>>>
>>>
>>> >
>>> >
>>> > On Wed, Aug 5, 2015 at 5:00 PM, Sanne Grinovero <sanne(a)hibernate.org> wrote:
>>> >
>>> >> On 5 August 2015 at 16:27, Gunnar Morling <gunnar(a)hibernate.org> wrote:
>>> >> >> as I'd like us to consider not
>>> >> >> applying DateBridge on the new types as it doesn't seem to add much
>>> >> >> practical value.
>>> >> >
>>> >> > Ok, that may make sense for types such as LocalDate. But there are types
>>> >> > in the new API which - unlike LocalDate - do describe an exact instant
>>> >> > on the time line (e.g. ZonedDateTime, Instant). For those IMO it makes
>>> >> > sense for sure to support both encodings, NUMERIC and STRING (similar to
>>> >> > Date/Calendar so far) and thus apply @DateBridge.
>>> >>
>>> >> +1
>>> >>
>>> >> > Question is whether/how to index/persist TZ information, for Calendar it
>>> >> > seems not to have been persisted in the index so far?
>>> >>
>>> >> It's encoding the Calendar's time as distance from epoch, which is a
>>> >> neutral encoding so you don't need the TZ.
>>> >>
>>> >> For the old style Date/Calendar types we always assumed the value was
>>> >> a point-in-time, unless explicitly opting in for an alternative
>>> >> encoding.
>>> >> For example for the "birthday use case" a reasonable setting would
>>> >> have been String encoding with resolution=DAY, although passing in a
>>> >> Date instance having the right value (as in right timezone) would have
>>> >> been the user's responsibility.. we simply take the long it's storing and
>>> >> index that with the requested resolution.
>>> >>
>>> >> Sanne
>>> >>
>>> >> >
>>> >> >
>>> >> > 2015-08-05 17:10 GMT+02:00 Sanne Grinovero <sanne(a)hibernate.org>:
>>> >> >>
>>> >> >> Inline:
>>> >> >>
>>> >> >> On 5 August 2015 at 15:42, Davide D'Alto <davide(a)hibernate.org> wrote:
>>> >> >> > If a user selects a resolution that does not make much sense we can
>>> >> >> > log a warning.
>>> >> >>
>>> >> >> +1 And update the javadoc to mention that some resolution values don't
>>> >> >> apply
>>> >> >>
>>> >> >> > But I think this might make sense:
>>> >> >> >
>>> >> >> > @DateBridge(resolution=MONTH)
>>> >> >> > LocalDate birthday;
>>> >> >>
>>> >> >> Ok but how often do you think that will be used?
>>> >> >> Sorry playing devil's advocate here, as I'd like us to consider not
>>> >> >> applying DateBridge on the new types as it doesn't seem to add much
>>> >> >> practical value.
>>> >> >>
>>> >> >> I agree it's worth a shot, but while going ahead keep in mind that
>>> >> >> maybe simplifying that is the more elegant solution.
>>> >> >>
>>> >> >> > On Wed, Aug 5, 2015 at 3:37 PM, Davide D'Alto <davide(a)hibernate.org>
>>> >> >> > wrote:
>>> >> >> >
>>> >> >> >> > What would you do though in case of the following:
>>> >> >> >> >
>>> >> >> >> > @DateBridge
>>> >> >> >> > LocalDate myDate;
>>> >> >> >> >
>>> >> >> >> > encoding() defaults to NUMERIC, so would you a) raise an error, or
>>> >> >> >> > b) ignore encoding() for LocalDate and friends? Both seem not right
>>> >> >> >> > to me. I think there is nothing wrong with using NUMERIC encoding
>>> >> >> >> > per-se for these types. We may recommend STRING but if NUMERIC
>>> >> >> >> > really is what a user wants I would let them do so.
>>> >> >>
>>> >> >> I'm all for letting the users have the last word, but this is one of
>>> >> >> those cases in which you don't know if they explicitly want that or
>>> >> >> simply went with the defaults.
>>> >> >>
>>> >> >> Not a big problem as of course the important thing of defaults is that
>>> >> >> "they work" but I'd really prefer the default to try to be the most
>>> >> >> appropriate encoding, which is not numeric in this case.
>>> >> >>
>>> >> >> Proposal: use numeric but still - rather than taking the milliseconds
>>> >> >> from epoch, take the resulting number from YYYYMMDD ? It might even be
>>> >> >> the most efficient encoding, as you don't have the drawback of
>>> >> >> clustering which we would have with a numeric encoding working on the
>>> >> >> individual fields, and it doesn't have the bloat of string encoding.
>>> >> >>
>>> >> >> >>
>>> >> >> >> +1
>>> >> >> >>
>>> >> >> >> > What do you suggest we do if a user maps the following?
>>> >> >> >>
>>> >> >> >> > @DateBridge(resolution=MILLISECOND)
>>> >> >> >> > LocalDate birthday;
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> Nothing really,
>>> >> >> >> I would just consider a LocalDate the same as a LocalDateTime with
>>> >> >> >> time 00:00:000 (UTC time zone)
>>> >> >>
>>> >> >> Ok that works but why write all those zeros in the index, when you can
>>> >> >> just write the date. I realize storage is cheap, but still we need to
>>> >> >> be careful as the index size affects performance ;-)
>>> >> >>
>>> >> >> Sanne
>>> >> >>
>>> >> >> >>
>>> >> >> >> It is equivalent to:
>>> >> >> >> ZonedDateTime dateTime = date.atStartOfDay( ZoneOffset.UTC );
>>> >> >> >>
>>> >> >> >> On Wed, Aug 5, 2015 at 3:24 PM, Gunnar Morling <gunnar(a)hibernate.org>
>>> >> >> >> wrote:
>>> >> >> >>
>>> >> >> >>>
>>> >> >> >>> 2015-08-05 12:41 GMT+02:00 Sanne Grinovero <sanne(a)hibernate.org>:
>>> >> >> >>>
>>> >> >> >>>> Our current implementation converts Date into the long "distance
>>> >> >> >>>> from epoch" to allow correct range-queries treating each Date as an
>>> >> >> >>>> instant in time - allowing a universal sorting strategy. But a
>>> >> >> >>>> LocalDate is not an instant-in-time.
>>> >> >> >>>>
>>> >> >> >>>> A LocalDate is intentionally oblivious of the timezone; as the
>>> >> >> >>>> javadoc states, it's useful for birthdays, i.e. symbolic occurrences
>>> >> >> >>>> and potentially legal matters which don't fit into a universal
>>> >> >> >>>> sorting model but rather with the local political scene - we would
>>> >> >> >>>> need the combo {LocalDate, ZoneId} provided to be able to allow
>>> >> >> >>>> sorting across different LocalDate - or simply assume that they are
>>> >> >> >>>> all referring to the same Zone.
>>> >> >> >>>>
>>> >> >> >>>
>>> >> >> >>> Right, I had the latter in mind and would use UTC for that purpose.
>>> >> >> >>>
>>> >> >> >>>>
>>> >> >> >>>> I think that if the user is using a LocalDate type, he's implicitly
>>> >> >> >>>> hinting that the timezone is not relevant for the practical use
>>> >> >> >>>> (possibly even wrong); the most faithful representation would be the
>>> >> >> >>>> string form in ISO standard format or to encode the day, month, year
>>> >> >> >>>> as independent fields? This last detail depends on how it would be
>>> >> >> >>>> more efficient to store & query; probably the String format YYYYMMDD
>>> >> >> >>>> would be the most efficient internal representation to allow also
>>> >> >> >>>> correct sorting.
>>> >> >> >>>>
>>> >> >> >>>> I wouldn't use NumericField(s) in this case, as they are more
>>> >> >> >>>> effective only with larger ranges, while MM and DD are very short;
>>> >> >> >>>> not sure if it's worth splitting the year as a NumericField either,
>>> >> >> >>>> as the values will likely be strongly clustered in the same range of
>>> >> >> >>>> "recent years" - although that might depend on the application but
>>> >> >> >>>> it doesn't seem worth the complexity, so I'd index & store as a
>>> >> >> >>>> String YYYYMMDD.
>>> >> >> >>>>
>>> >> >> >>>
>>> >> >> >>> Agreed that this makes most sense, given the "symbolic" nature of
>>> >> >> >>> LocalDate.
>>> >> >> >>>
>>> >> >> >>> What would you do though in case of the following:
>>> >> >> >>>
>>> >> >> >>> @DateBridge
>>> >> >> >>> LocalDate myDate;
>>> >> >> >>>
>>> >> >> >>> encoding() defaults to NUMERIC, so would you a) raise an error, or
>>> >> >> >>> b) ignore encoding() for LocalDate and friends? Both seem not right
>>> >> >> >>> to me. I think there is nothing wrong with using NUMERIC encoding
>>> >> >> >>> per-se for these types. We may recommend STRING but if NUMERIC
>>> >> >> >>> really is what a user wants I would let them do so.
>>> >> >> >>>
>>> >> >> >>>>
>>> >> >> >>>> -- Sanne
>>> >> >> >>>>
>>> >> >> >>>>
>>> >> >> >>>> On 5 August 2015 at 11:10, Gunnar Morling <gunnar(a)hibernate.org>
>>> >> >> >>>> wrote:
>>> >> >> >>>> > Hi,
>>> >> >> >>>> >
>>> >> >> >>>> > What's the motivation for using a different representation in that
>>> >> >> >>>> > case?
>>> >> >> >>>> >
>>> >> >> >>>> > For the sake of consistency, I'd use milliseconds since 1970-01-01
>>> >> >> >>>> > across the board. Otherwise it'll be more difficult to compare
>>> >> >> >>>> > fields created from properties of different date types.
>>> >> >> >>>> >
>>> >> >> >>>> > --Gunnar
>>> >> >> >>>> >
>>> >> >> >>>> >
>>> >> >> >>>> > 2015-08-04 19:49 GMT+02:00 Davide D'Alto <davide(a)hibernate.org>:
>>> >> >> >>>> >
>>> >> >> >>>> >> Hi,
>>> >> >> >>>> >> I started to work on the creation of the bridges for the classes
>>> >> >> >>>> >> in the java.time package.
>>> >> >> >>>> >>
>>> >> >> >>>> >> I was wondering if we want to convert the values to long using the
>>> >> >> >>>> >> existing approach we have now for java.util.Date.
>>> >> >> >>>> >>
>>> >> >> >>>> >> In Hibernate Search a java.util.Date is converted into a long that
>>> >> >> >>>> >> represents the number of milliseconds since January 1, 1970,
>>> >> >> >>>> >> 00:00:00 GMT using getTime().
>>> >> >> >>>> >>
>>> >> >> >>>> >> The same value can be obtained from a java.time.LocalDate via:
>>> >> >> >>>> >>
>>> >> >> >>>> >> long epochMilli = date.atStartOfDay( ZoneOffset.UTC ).toInstant().toEpochMilli();
>>> >> >> >>>> >>
>>> >> >> >>>> >> LocalDate has a method that returns the same value expressed in
>>> >> >> >>>> >> number of days:
>>> >> >> >>>> >>
>>> >> >> >>>> >> long epochDay = date.toEpochDay();
>>> >> >> >>>> >>
>>> >> >> >>>> >>
>>> >> >> >>>> >> I would use the second approach
>>> >> >> >>>> >>
>>> >> >> >>>> >> Davide
>>> >> >> >>>> >>
>>> >> >> >>>> >> _______________________________________________
>>> >> >> >>>> >> hibernate-dev mailing list
>>> >> >> >>>> >> hibernate-dev(a)lists.jboss.org
>>> >> >> >>>> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>> >> >> >>>> >>