[hibernate-dev] HSEARCH Java 8 Date Time

Davide D'Alto daltodavide at gmail.com
Mon Aug 10 07:34:33 EDT 2015


I'm not sure there is an easy way to convert Instant and DateTime to a
numeric value.
The problem is that the resolution for temporal types is nanoseconds, and
the following date-time is valid:

year: -999.999.999
month: 12
day: 31
hour: 23
minute: 59
second: 59
nanos: 999.999.999

It gets more complicated when we need to store Offset or time zone.
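
To make the concern concrete, here is a minimal sketch using only java.time. The class and method names are made up for illustration and are not part of Hibernate Search. It checks whether the extreme value above can be expressed as epoch milliseconds in a long, and shows that a round trip through milliseconds drops the sub-millisecond digits:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper class, for illustration only.
final class TemporalRangeSketch {

    // The extreme (but valid) value listed in the message.
    static LocalDateTime extremeValue() {
        return LocalDateTime.of(-999_999_999, 12, 31, 23, 59, 59, 999_999_999);
    }

    // True if the value cannot be expressed as epoch milliseconds in a long.
    static boolean overflowsEpochMilli(LocalDateTime value) {
        try {
            value.toInstant(ZoneOffset.UTC).toEpochMilli();
            return false;
        } catch (ArithmeticException e) {
            // Instant.toEpochMilli() is documented to throw on numeric overflow.
            return true;
        }
    }

    // Round-trips an Instant through epoch milliseconds.
    static Instant roundTripThroughMillis(Instant instant) {
        return Instant.ofEpochMilli(instant.toEpochMilli());
    }

    public static void main(String[] args) {
        System.out.println(overflowsEpochMilli(extremeValue()));
        Instant withNanos = Instant.ofEpochSecond(0, 999_999_999);
        System.out.println(roundTripThroughMillis(withNanos).getNano());
    }
}
```

So a single long holding epoch milliseconds covers neither the full range nor the full precision of these types.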

> Great point, we should accept the user's domain type exclusively and
> take the conversion burden from the user; especially since we know the
> correct conversion strategy.

This is already supported (for certain types at least): you don't need to
round the dates to execute the queries.
I think the documentation is not up to date.

Davide

On Mon, Aug 10, 2015 at 11:37 AM, Sanne Grinovero <sanne at hibernate.org>
wrote:

> On 10 August 2015 at 11:04, Hardy Ferentschik <hardy at hibernate.org> wrote:
> > Hi,
> >
> > sorry, I am late to the game, but here are some more thoughts on this.
> >
> > I think the consensus so far is that
> >
> > # Date/time types which represent an instant in time are treated as usual.
> >   They can be string encoded (per default yyyyMMddHHmmssSSS) or numerically
> >   in which case the numeric long value equals the epoch time of the
> >   represented date.
>
> Correct that's the consensus so far. I'd like to challenge one more
> detail though:
> does it still make sense to allow string-encoded?
>
> I think not. We allowed it primarily because a long time ago that
> was the only way; then it became one of the options (but still the
> default), and more recently it became the non-default way.
>
> With these new types, backwards compatibility is a non-issue. So unless
> someone makes a strong case for needing these as String in the index,
> how about we drop some complexity?
>
> Remember:
>  - Hibernate Search is not an Objects/index mapper so we're not aiming
> at creating any index schema possible, we're aiming at taking
> advantage of the index for practical purposes ("I want it to be a
> string in the index" is not a valid argument - use your own
> fieldbridge in case)
>  - With Projections we have to re-transform things back into their
> Java original type, so how we encode things in the index is irrelevant
> from a semantics point of view; I think the only valid challenge would
> need to come from a performance or storage space perspective, in both
> cases I'm pretty sure the numeric encoding would win.
>
> > # Date/time types which do not represent an instant in time can also be
> >   encoded as string or number, but in the latter case the numeric
> >   representation is given by interpreting the string representation as number.
> >
> > So far so good. There are a couple of more things to think about.
> >
> > # Query time gets interesting and I think we need to improve the DSL in unison
> >   with adding support for these new types. Check out this example from DSLTest [1]
> >
> >     query = monthQb
> >             .range()
> >                 .onField( "estimatedCreation" )
> >                     .ignoreFieldBridge()
> >                 .andField( "justfortest" )
> >                     .ignoreFieldBridge().ignoreAnalyzer()
> >                 .from( DateTools.round( from, DateTools.Resolution.MINUTE ) )
> >                 .to( DateTools.round( to, DateTools.Resolution.MINUTE ) )
> >                     .excludeLimit()
> >                 .createQuery();
> >
> > If a date is numerically encoded you need to specify numbers for the from and to values. ATM,
> > we recommend using the Lucene-specific DateTools to get the numeric representation. With the support
> > of the new date types things will get confusing for the user. How does one "create" the numeric representation
> > of a LocalDate (and how does one know what it looks like in the first place and how it differs from the epoch time)?
>
> Great point, we should accept the user's domain type exclusively and
> take the conversion burden from the user; especially since we know the
> correct conversion strategy.
>
> > We have been discussing before whether Hibernate Search needs to offer its own version of DateTools.
> > I think it would be time to do so and include helpers for the new date/time types. This also reduces the exposure
> > to Lucene specific types.
>
> +1 to encapsulate it, but I don't expect people to need it at all in
> the above case? But good for other more advanced needs.
>
> >
> > Even better though would be, if we would be able to support directly the use of date types in the from and to clauses.
> > It would be the responsibility of the DSL to round the specified types to the appropriate level based on the field's
> > configuration/metadata. Even in this scenario though a Search specific DateTools might be necessary for the cases
> > where the date specified in to/from needs to be rounded differently than the field itself.
>
> +1
>
> > Last but not least, the documentation needs to be updated. At the moment, the docs are silent about all the complexity
> > around dates. With the support of the new types, the docs need to be more explicit and describe the subtleties at play.
>
> +1 created HSEARCH-1958
>
> Thanks,
> Sanne
>
>
> >
> > --Hardy
> >
> >
> > On Wed, Aug 05, 2015 at 05:40:16PM +0100, Sanne Grinovero wrote:
> >> On 5 August 2015 at 17:22, Davide D'Alto <davide at hibernate.org> wrote:
> >> >> Proposal: use numeric but still - rather than taking the milliseconds
> >> >> from epoch, take the resulting number from YYYYMMDD ?
> >> >
> >> > I don't think I understand what you mean by "the resulting number from
> >> > YYYYMMDD".
> >> > Wouldn't it be similar to getting the number of days from epoch?
> >>
> >> No because epoch is a specific moment *with a timezone*. If you take a
> >> calendar date "here", and take the moment in time which represents
> >> your beginning of the calendar date, the distance from epoch is not a
> >> whole number and you'd have to apply rounding which is timezone
> >> specific.
> >>
> >> By simply encoding the number in the above format, you'd encode today
> >> as the number "20150805".
> >> That's a whole number which avoids the timezone relativity and can be
> >> efficiently encoded in numeric form, and provides the expected sorting
> >> properties.
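
A minimal sketch of the two points made above, using only java.time. The class and method names are hypothetical, and the packing assumes years in the range 0..9999. It packs a LocalDate as the number YYYYMMDD and, for contrast, computes the epoch milliseconds of the start of the same calendar date in a given zone:

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper class, for illustration only.
final class LocalDateEncodingSketch {

    // Packs a date as the number YYYYMMDD, e.g. 2015-08-05 -> 20150805.
    // Assumption: the year is in 0..9999.
    static int encode(LocalDate date) {
        return date.getYear() * 10_000 + date.getMonthValue() * 100 + date.getDayOfMonth();
    }

    // Reverses encode().
    static LocalDate decode(int packed) {
        return LocalDate.of(packed / 10_000, (packed / 100) % 100, packed % 100);
    }

    // Epoch milliseconds of the start of the calendar date in the given zone.
    static long startOfDayEpochMilli(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2015, 8, 5);
        System.out.println(encode(today));
        // The epoch-based value depends on the zone chosen for "start of day".
        System.out.println(startOfDayEpochMilli(today, ZoneOffset.UTC));
        System.out.println(startOfDayEpochMilli(today, ZoneOffset.ofHours(2)));
    }
}
```

The packed number sorts in calendar order and is the same whatever zone the application runs in, whereas the epoch-based value shifts with the zone and is only a whole number of days for UTC.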
> >>
> >> >
> >> > But basically, you are saying that I can use a different numeric encoding for
> >> > different types, aren't you?
> >>
> >> Yes, you definitely need different encodings depending on the type and
> >> the used options.
> >>
> >> > So, for example:
> >> >
> >> > java.util.Date, java.util.Calendar  and java.time.Instant,
> >> > java.time.LocalDateTime will use number of milliseconds from epoch
> >> > java.time.LocalDate: number of days from epoch
> >>
> >> Except this one ^ I agree with the others.
> >>
> >> > java.time.LocalTime: number of nanos in a day
> >>
> >> Conceptually, yes.. but we don't have "nanoseconds" as an option of
> >> org.hibernate.search.annotations.Resolution. Should we add it?
> >> We would not be able to apply that Resolution on old fashioned
> >> Date/Calendar, so that would need a warning or even an exception when
> >> applied to old style value types.
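
For the LocalTime case, a minimal sketch of the "number of nanos in a day" idea, using only java.time. The class and method names are hypothetical, and how this would interact with Resolution is left open, as in the discussion above:

```java
import java.time.LocalTime;

// Hypothetical helper class, for illustration only.
final class LocalTimeEncodingSketch {

    // Encodes a time of day as nanoseconds since midnight.
    static long encode(LocalTime time) {
        return time.toNanoOfDay();
    }

    // Reverses encode().
    static LocalTime decode(long nanoOfDay) {
        return LocalTime.ofNanoOfDay(nanoOfDay);
    }

    public static void main(String[] args) {
        // Largest possible value: 23:59:59.999999999
        System.out.println(encode(LocalTime.MAX));
        System.out.println(decode(encode(LocalTime.of(23, 59, 59, 999_999_999))));
    }
}
```

The largest value is 86,399,999,999,999, so the encoding fits comfortably in a long and keeps full nanosecond precision.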
> >>
> >> >> Ok that works but why write all those zeros in the index, when you can
> >> >> just write the date. I realize storage is cheap, but still we need to
> >> >> be careful as the index size affects performance ;-)
> >> >
> >> > I don't think we need to store the 0s.
> >> > If I know the type of the field I already know that the time is 0.
> >>
> >> Exactly
> >>
> >> > Am I missing something?
> >>
> >> I probably just misunderstood your proposal, since previously you
> >> mentioned: "I would just consider a LocalDate the same as a
> >> LocalDateTime with time 00:00:000 (UTC time zone)".
> >> If you have to write the days only you don't need to convert to a time first.
> >> This misunderstanding might be related with the fact that you were
> >> planning to encode as distance from epoch.. see my first comment on
> >> this same email.
> >> Since you don't want to look at distance from epoch for this case, the
> >> time component really is irrelevant and LocalDate has all the
> >> information you need.. simpler ;)
> >>
> >> Sanne
> >>
> >>
> >> >
> >> >
> >> > On Wed, Aug 5, 2015 at 5:00 PM, Sanne Grinovero <sanne at hibernate.org>
> wrote:
> >> >
> >> >> On 5 August 2015 at 16:27, Gunnar Morling <gunnar at hibernate.org>
> wrote:
> >> >> >> as I'd like us to consider not
> >> >> >> applying DateBridge on the new types as it doesn't seem to add much
> >> >> >> practical value.
> >> >> >
> >> >> > Ok, that may make sense for types such as LocalDate. But there are types in
> >> >> > the new API which - unlike LocalDate - do describe an exact instant on the
> >> >> > time line (e.g. ZonedDateTime, Instant). For those IMO it makes sense for
> >> >> > sure to support both encodings, NUMERIC and STRING (similar to Date/Calendar
> >> >> > so far) and thus apply @DateBridge.
> >> >>
> >> >> +1
> >> >>
> >> >> > Question is whether/how to index/persist TZ information; for Calendar it
> >> >> > seems not to have been persisted in the index so far?
> >> >>
> >> >> It's encoding the Calendar's time as distance from epoch, which is a
> >> >> neutral encoding so you don't need the TZ.
> >> >>
> >> >> For the old style Date/Calendar types we always assumed the value was
> >> >> a point-in-time, unless explicitly opting in for an alternative
> >> >> encoding.
> >> >> For example for the "birthday use case" a reasonable setting would
> >> >> have been String encoding with resolution=DAY, although passing in a
> >> >> Date instance having the right value (as in right timezone) would have
> >> >> been user's responsibility.. we simply take the long it's storing and
> >> >> index that with the requested resolution.
> >> >>
> >> >> Sanne
> >> >>
> >> >> >
> >> >> >
> >> >> > 2015-08-05 17:10 GMT+02:00 Sanne Grinovero <sanne at hibernate.org>:
> >> >> >>
> >> >> >> Inline:
> >> >> >>
> >> >> >> On 5 August 2015 at 15:42, Davide D'Alto <davide at hibernate.org> wrote:
> >> >> >> > If a user selects a resolution that does not make much sense we can log a
> >> >> >> > warning.
> >> >> >>
> >> >> >> +1 And update the javadoc to mention that some resolution values don't
> >> >> >> apply
> >> >> >>
> >> >> >> > But I think this might make sense:
> >> >> >> >
> >> >> >> >    @DateBridge(resolution=MONTH)
> >> >> >> >    LocalDate birthday;
> >> >> >>
> >> >> >> Ok but how often do you think that will be used?
> >> >> >> Sorry playing devil's advocate here, as I'd like us to consider
> not
> >> >> >> applying DateBridge on the new types as it doesn't seem to add
> much
> >> >> >> practical value.
> >> >> >>
> >> >> >> I agree it's worth a shot, but while going ahead keep in mind that
> >> >> >> maybe simplifying that is the more elegant solution.
> >> >> >>
> >> >> >> > On Wed, Aug 5, 2015 at 3:37 PM, Davide D'Alto <
> davide at hibernate.org>
> >> >> >> > wrote:
> >> >> >> >
> >> >> >> >> > What would you do though in case of the following:
> >> >> >> >> >
> >> >> >> >> >   @DateBridge
> >> >> >> >> >    LocalDate myDate;
> >> >> >> >> >
> >> >> >> >> > encoding() defaults to NUMERIC, so would you a) raise an error, or b)
> >> >> >> >> > ignore encoding() for LocalDate and friends? Both seem not right to me. I
> >> >> >> >> > think there is nothing wrong with using NUMERIC encoding per-se for these
> >> >> >> >> > types. We may recommend STRING but if NUMERIC really is what a user wants I
> >> >> >> >> > would let them do so.
> >> >> >>
> >> >> >> I'm all for letting the users have the last word, but this is one
> of
> >> >> >> those cases in which you don't know if they explicitly want that
> or
> >> >> >> simply went with the defaults.
> >> >> >>
> >> >> >> Not a big problem as of course the important thing of defaults is
> that
> >> >> >> "they work" but I'd really prefer the default to try be the most
> >> >> >> appropriate encoding, which is not numeric in this case.
> >> >> >>
> >> >> >> Proposal: use numeric but still - rather than taking the
> milliseconds
> >> >> >> from epoch, take the resulting number from YYYYMMDD ? It might
> even be
> >> >> >> the most efficient encoding, as you don't have the drawback of
> >> >> >> clustering which we would have with a numeric encoding working on
> the
> >> >> >> individual fields, and doesn't have the bloat of string encoding.
> >> >> >>
> >> >> >> >>
> >> >> >> >> +1
> >> >> >> >>
> >> >> >> >> > What do you suggest we do if a user maps the following?
> >> >> >> >>
> >> >> >> >> >   @DateBridge(resolution=MILLISECOND)
> >> >> >> >> >   LocalDate birthday;
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> Nothing really,
> >> >> >> >> I would just consider a LocalDate the same as a LocalDateTime
> with
> >> >> time
> >> >> >> >> 00:00:000 (UTC time zone)
> >> >> >>
> >> >> >> Ok that works but why write all those zeros in the index, when
> you can
> >> >> >> just write the date. I realize storage is cheap, but still we
> need to
> >> >> >> be careful as the index size affects performance ;-)
> >> >> >>
> >> >> >> Sanne
> >> >> >>
> >> >> >> >>
> >> >> >> >> It is equivalent to:
> >> >> >> >> ZonedDateTime dateTime = date.atStartOfDay( ZoneOffset.UTC );
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On Wed, Aug 5, 2015 at 3:24 PM, Gunnar Morling <
> gunnar at hibernate.org
> >> >> >
> >> >> >> >> wrote:
> >> >> >> >>
> >> >> >> >>>
> >> >> >> >>>
> >> >> >> >>> 2015-08-05 12:41 GMT+02:00 Sanne Grinovero <
> sanne at hibernate.org>:
> >> >> >> >>>
> >> >> >> >>>> Our current implementation converts Date in the long
> "distance from
> >> >> >> >>>> epoch" to allow correct range-queries treating each Date as
> an
> >> >> >> >>>> instant
> >> >> >> >>>> in time - allowing a universal sorting strategy. But a
> LocalDate is
> >> >> >> >>>> not an instant-in-time.
> >> >> >> >>>>
> >> >> >> >>>> A LocalDate is intentionally oblivious of the timezone; as
> the
> >> >> >> >>>> javadoc
> >> >> >> >>>> states, it's useful for birthdays, i.e. symbolic occurrences
> and
> >> >> >> >>>> potentially legal matters which don't fit into a universal
> sorting
> >> >> >> >>>> model but rather with the local political scene - we would
> need the
> >> >> >> >>>> combo {LocalDate, ZoneId} provided to be able to allow
> sorting
> >> >> across
> >> >> >> >>>> different LocalDate - or simply assume that they are all
> referring
> >> >> to
> >> >> >> >>>> the same Zone.
> >> >> >> >>>>
> >> >> >> >>>
> >> >> >> >>> Right, I had the latter in mind and would use UTC for that
> purpose.
> >> >> >> >>>
> >> >> >> >>>>
> >> >> >> >>>> I think that if the user is using a LocalDate type, he's
> implicitly
> >> >> >> >>>> hinting that the timezone is not relevant for the practical
> use
> >> >> >> >>>> (possibly even wrong); the most faithful representation
> would be
> >> >> the
> >> >> >> >>>> string form in ISO standard format or to encode the
> day,month,year
> >> >> as
> >> >> >> >>>> independent fields? This last detail depends on how it would
> be
> >> >> more
> >> >> >> >>>> efficient to store & query; probably the String format
> YYYYMMDD
> >> >> would
> >> >> >> >>>> be the most efficient internal representation to allow also
> correct
> >> >> >> >>>> sorting.
> >> >> >> >>>>
> >> >> >> >>>> I wouldn't use NumericField(s) in this case, as they are more
> >> >> >> >>>> effective only with larger ranges, while MM and DD are very
> short;
> >> >> >> >>>> not
> >> >> >> >>>> sure if it's worth splitting the year as a NumericField
> either, as
> >> >> >> >>>> the
> >> >> >> >>>> values will likely be strongly clustered in the same range of
> >> >> "recent
> >> >> >> >>>> years" - although that might depend on the application but it
> >> >> doesn't
> >> >> >> >>>> seem worth the complexity, so I'd index & store as a String
> >> >> YYYYMMDD.
> >> >> >> >>>>
> >> >> >> >>>
> >> >> >> >>> Agreed that this makes most sense, given the "symbolic"
> nature of
> >> >> >> >>> LocalDate.
> >> >> >> >>>
> >> >> >> >>> What would you do though in case of the following:
> >> >> >> >>>
> >> >> >> >>>     @DateBridge
> >> >> >> >>>     LocalDate myDate;
> >> >> >> >>>
> >> >> >> >>> encoding() defaults to NUMERIC, so would you a) raise an
> error, or
> >> >> b)
> >> >> >> >>> ignore encoding() for LocalDate and friends? Both seem not
> right to
> >> >> >> >>> me. I
> >> >> >> >>> think there is nothing wrong with using NUMERIC encoding
> per-se for
> >> >> >> >>> these
> >> >> >> >>> types. We may recommend STRING but if NUMERIC really is what
> a user
> >> >> >> >>> wants I
> >> >> >> >>> would let them do so.
> >> >> >> >>>
> >> >> >> >>>>
> >> >> >> >>>> -- Sanne
> >> >> >> >>>>
> >> >> >> >>>>
> >> >> >> >>>> On 5 August 2015 at 11:10, Gunnar Morling <
> gunnar at hibernate.org>
> >> >> >> >>>> wrote:
> >> >> >> >>>> > Hi,
> >> >> >> >>>> >
> >> >> >> >>>> > What's the motivation for using a different representation
> in
> >> >> that
> >> >> >> >>>> case?
> >> >> >> >>>> >
> >> >> >> >>>> > For the sake of consistency, I'd use milliseconds since
> >> >> 1970-01-01
> >> >> >> >>>> across
> >> >> >> >>>> > the board. Otherwise it'll be more difficult to compare
> fields
> >> >> >> >>>> > created
> >> >> >> >>>> from
> >> >> >> >>>> > properties of different date types.
> >> >> >> >>>> >
> >> >> >> >>>> > --Gunnar
> >> >> >> >>>> >
> >> >> >> >>>> >
> >> >> >> >>>> > 2015-08-04 19:49 GMT+02:00 Davide D'Alto <
> davide at hibernate.org>:
> >> >> >> >>>> >
> >> >> >> >>>> >> Hi,
> >> >> >> >>>> >> I started to work on the creation of the bridges for the
> classes
> >> >> >> >>>> >> in
> >> >> >> >>>> the
> >> >> >> >>>> >> java.time package.
> >> >> >> >>>> >>
> >> >> >> >>>> >> I was wondering if we want to convert the values to long
> using
> >> >> the
> >> >> >> >>>> existing
> >> >> >> >>>> >> approach we have now for java.util.Date.
> >> >> >> >>>> >>
> >> >> >> >>>> >> In Hibernate Search a java.util.Date is converted into a
> long
> >> >> that
> >> >> >> >>>> >> represents the number of milliseconds since January 1,
> 1970,
> >> >> >> >>>> >> 00:00:00
> >> >> >> >>>> GMT
> >> >> >> >>>> >> using getTime().
> >> >> >> >>>> >>
> >> >> >> >>>> >> The same value can be obtained from a java.time.LocalDate via:
> >> >> >> >>>> >>
> >> >> >> >>>> >>         long epochMilli = date.atStartOfDay( ZoneOffset.UTC ).toInstant().toEpochMilli();
> >> >> >> >>>> >>
> >> >> >> >>>> >> LocalDate has a method that returns the same value
> expressed in
> >> >> >> >>>> number of
> >> >> >> >>>> >> days:
> >> >> >> >>>> >>
> >> >> >> >>>> >>         long epochDay = date.toEpochDay();
> >> >> >> >>>> >>
> >> >> >> >>>> >>
> >> >> >> >>>> >> I would use the second approach
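
A minimal sketch comparing the two conversions quoted above, using only java.time. The class and method names are hypothetical. When UTC is used for the start of the day, the two values differ only by a constant factor of 86,400,000 milliseconds per day:

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper class, for illustration only.
final class EpochDaySketch {

    static final long MILLIS_PER_DAY = 86_400_000L;

    // First approach: milliseconds since 1970-01-01T00:00:00Z at UTC midnight.
    static long epochMilliAtUtcMidnight(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    // Second approach: days since 1970-01-01.
    static long epochDay(LocalDate date) {
        return date.toEpochDay();
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2015, 8, 4);
        System.out.println(epochDay(date) * MILLIS_PER_DAY == epochMilliAtUtcMidnight(date));
    }
}
```

The day count is the smaller number and needs no zone argument, which is one way to read the preference for the second approach.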
> >> >> >> >>>> >>
> >> >> >> >>>> >> Davide
> >> >> >> >>>> >> _______________________________________________
> >> >> >> >>>> >> hibernate-dev mailing list
> >> >> >> >>>> >> hibernate-dev at lists.jboss.org
> >> >> >> >>>> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
> >> >> >> >>>> >>
> >> >> >> >>>>
> >> >> >> >>>
> >> >> >> >>>
> >> >> >> >>
> >> >> >
> >> >> >
> >> >>
>