[hibernate-dev] HSEARCH Java 8 Date Time

Sanne Grinovero sanne at hibernate.org
Wed Aug 5 12:40:16 EDT 2015


On 5 August 2015 at 17:22, Davide D'Alto <davide at hibernate.org> wrote:
>> Proposal: use numeric but still - rather than taking the milliseconds
>> from epoch, take the resulting number from YYYYMMDD ?
>
> I don't think I understand what you mean with "the resulting number from
> YYYYMMDD".
> Wouldn't it be similar to getting the number of days from epoch?

No, because the epoch is a specific moment *with a timezone* (UTC). If you
take a calendar date "here", and take the moment in time which represents
the beginning of that calendar date in your timezone, its distance from
epoch is not a whole number of days and you'd have to apply rounding,
which is timezone specific.
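
To make that concrete, here is a minimal sketch (the class and method
names are made up for this email, they are not Hibernate Search API). It
computes how far the start of a calendar date, in a given zone, is from a
whole number of days since epoch:

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class EpochDayRounding {

    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Milliseconds by which the start of 'date' in 'zone' exceeds
    // the last whole day boundary counted from the epoch.
    static long remainderOfDay(LocalDate date, ZoneId zone) {
        long epochMilli = date.atStartOfDay(zone).toInstant().toEpochMilli();
        return Math.floorMod(epochMilli, MILLIS_PER_DAY);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2015, 8, 5);
        // Start of day in UTC falls exactly on a day boundary.
        System.out.println(remainderOfDay(today, ZoneOffset.UTC));
        // Start of day at a fixed +02:00 offset does not.
        System.out.println(remainderOfDay(today, ZoneOffset.ofHours(2)));
    }
}
```

Only the UTC case lands on a whole day; for any other offset you would
have to round, and the rounding depends on the zone.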

By simply encoding the number in the above format, you'd encode today
as the number "20150805".
That's a whole number which avoids the timezone relativity and can be
efficiently encoded in numeric form, and provides the expected sorting
properties.
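
A rough sketch of what I mean, with hypothetical names and assuming
years in the range 0..9999 (anything else would need a different
scheme):

```java
import java.time.LocalDate;

public class LocalDateNumericEncoding {

    // Encodes a LocalDate as the int YYYYMMDD, e.g. 2015-08-05 -> 20150805.
    // Assumption: the year is between 0 and 9999.
    static int encode(LocalDate date) {
        return date.getYear() * 10_000
                + date.getMonthValue() * 100
                + date.getDayOfMonth();
    }

    // Reverses encode(): splits the number back into year, month, day.
    static LocalDate decode(int encoded) {
        int year = encoded / 10_000;
        int month = (encoded / 100) % 100;
        int day = encoded % 100;
        return LocalDate.of(year, month, day);
    }

    public static void main(String[] args) {
        int today = encode(LocalDate.of(2015, 8, 5));
        System.out.println(today);
        System.out.println(decode(today));
    }
}
```

No timezone is involved at any point, and comparing two encoded values
gives the same result as comparing the two dates.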

>
> But basically, you are saying that I can use different numeric encodings for
> different types. Is that right?

Yes, you definitely need different encodings depending on the type and
the used options.

> So, for example:
>
> java.util.Date, java.util.Calendar  and java.time.Instant,
> java.time.LocalDateTime will use number of milliseconds from epoch
> java.time.LocalDate: number of days from epoch

Except this one ^ I agree with the others.

> java.time.LocalTime: number of nanos in a day

Conceptually, yes.. but we don't have "nanoseconds" as an option of
org.hibernate.search.annotations.Resolution. Should we add it?
We would not be able to apply that Resolution on old fashioned
Date/Calendar, so that would need a warning or even an exception when
applied to old style value types.
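
For illustration only, a sketch of a nanos-in-a-day encoding using plain
java.time. It uses ChronoUnit as a stand-in for the resolution, which is
an assumption of this example and not how
org.hibernate.search.annotations.Resolution works today; the class and
method names are hypothetical too:

```java
import java.time.LocalTime;
import java.time.temporal.ChronoUnit;

public class LocalTimeNumericEncoding {

    // Truncates the time to the requested unit (NANOS up to DAYS),
    // then encodes it as the number of nanoseconds since midnight.
    static long encode(LocalTime time, ChronoUnit resolution) {
        return time.truncatedTo(resolution).toNanoOfDay();
    }

    // Reverses encode(): rebuilds the LocalTime from nanoseconds since midnight.
    static LocalTime decode(long nanoOfDay) {
        return LocalTime.ofNanoOfDay(nanoOfDay);
    }

    public static void main(String[] args) {
        LocalTime time = LocalTime.of(12, 40, 16, 123_456_789);
        System.out.println(encode(time, ChronoUnit.NANOS));
        System.out.println(encode(time, ChronoUnit.SECONDS));
        System.out.println(decode(encode(time, ChronoUnit.SECONDS)));
    }
}
```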

>> Ok that works but why write all those zeros in the index, when you can
>> just write the date. I realize storage is cheap, but still we need to
>> be careful as the index size affects performance ;-)
>
> I don't think we need to store the 0s.
> If I know the type of the field I already know the time is 0.

Exactly

> Am I missing something?

I probably just misunderstood your proposal, since previously you
mentioned: "I would just consider a LocalDate the same as a
LocalDateTime with time 00:00:000 (UTC time zone)".
If you only have to write the days you don't need to convert to a time first.
This misunderstanding might be related to the fact that you were
planning to encode as distance from epoch.. see my first comment in
this same email.
Since you don't want to look at distance from epoch for this case, the
time component really is irrelevant and LocalDate has all the
information you need.. simpler ;)

Sanne


>
>
> On Wed, Aug 5, 2015 at 5:00 PM, Sanne Grinovero <sanne at hibernate.org> wrote:
>
>> On 5 August 2015 at 16:27, Gunnar Morling <gunnar at hibernate.org> wrote:
>> >> as I'd like us to consider not
>> >> applying DateBridge on the new types as it doesn't seem to add much
>> >> practical value.
>> >
>> > Ok, that may make sense for types such as LocalDate. But there are
>> > types in the new API which - unlike LocalDate - do describe an exact
>> > instant on the time line (e.g. ZonedDateTime, Instant). For those IMO
>> > it makes sense for sure to support both encodings, NUMERIC and STRING
>> > (similar to Date/Calendar so far) and thus apply @DateBridge.
>>
>> +1
>>
>> > Question is whether/how to index/persist TZ information; for Calendar it
>> > seems not to have been persisted in the index so far?
>>
>> It's encoding the Calendar's time as distance from epoch, which is a
>> neutral encoding so you don't need the TZ.
>>
>> For the old style Date/Calendar types we always assumed the value was
>> a point-in-time, unless explicitly opting in for an alternative
>> encoding.
>> For example for the "birthday use case" a reasonable setting would
>> have been String encoding with resolution=DAY, although passing in a
>> Date instance having the right value (as in right timezone) would have
>> been user's responsibility.. we simply take the long it's storing and
>> index that with the requested resolution.
>>
>> Sanne
>>
>> >
>> >
>> > 2015-08-05 17:10 GMT+02:00 Sanne Grinovero <sanne at hibernate.org>:
>> >>
>> >> Inline:
>> >>
>> >> On 5 August 2015 at 15:42, Davide D'Alto <davide at hibernate.org> wrote:
>> >> > If a user selects a resolution that does not make much sense we can
>> >> > log a warning.
>> >>
>> >> +1 And update the javadoc to mention that some resolution values don't
>> >> apply
>> >>
>> >> > But I think this might make sense:
>> >> >
>> >> >    @DateBridge(resolution=MONTH)
>> >> >    LocalDate birthday;
>> >>
>> >> Ok but how often do you think that will be used?
>> >> Sorry playing devil's advocate here, as I'd like us to consider not
>> >> applying DateBridge on the new types as it doesn't seem to add much
>> >> practical value.
>> >>
>> >> I agree it's worth a shot, but while going ahead keep in mind that
>> >> maybe simplifying that is the more elegant solution.
>> >>
>> >> > On Wed, Aug 5, 2015 at 3:37 PM, Davide D'Alto <davide at hibernate.org>
>> >> > wrote:
>> >> >
>> >> >> > What would you do though in case of the following:
>> >> >> >
>> >> >> >   @DateBridge
>> >> >> >    LocalDate myDate;
>> >> >> >
>> >> >> > encoding() defaults to NUMERIC, so would you a) raise an error, or
>> >> >> > b) ignore encoding() for LocalDate and friends? Both seem not right
>> >> >> > to me. I think there is nothing wrong with using NUMERIC encoding
>> >> >> > per-se for these types. We may recommend STRING but if NUMERIC
>> >> >> > really is what a user wants I would let them do so.
>> >>
>> >> I'm all for letting the users have the last word, but this is one of
>> >> those cases in which you don't know if they explicitly want that or
>> >> simply went with the defaults.
>> >>
>> >> Not a big problem as of course the important thing of defaults is that
>> >> "they work" but I'd really prefer the default to try to be the most
>> >> appropriate encoding, which is not numeric in this case.
>> >>
>> >> Proposal: use numeric but still - rather than taking the milliseconds
>> >> from epoch, take the resulting number from YYYYMMDD ? It might even be
>> >> the most efficient encoding, as you don't have the drawback of
>> >> clustering which we would have with a numeric encoding working on the
>> >> individual fields, and doesn't have the bloat of string encoding.
>> >>
>> >> >>
>> >> >> +1
>> >> >>
>> >> >> > What do you suggest we do if a user maps the following?
>> >> >>
>> >> >> >   @DateBridge(resolution=MILLISECOND)
>> >> >> >   LocalDate birthday;
>> >> >>
>> >> >>
>> >> >> Nothing really,
>> >> >> I would just consider a LocalDate the same as a LocalDateTime with
>> >> >> time 00:00:000 (UTC time zone)
>> >>
>> >> Ok that works but why write all those zeros in the index, when you can
>> >> just write the date. I realize storage is cheap, but still we need to
>> >> be careful as the index size affects performance ;-)
>> >>
>> >> Sanne
>> >>
>> >> >>
>> >> >> It is equivalent to:
>> >> >> ZonedDateTime dateTime = date.atStartOfDay( ZoneOffset.UTC );
>> >> >>
>> >> >> On Wed, Aug 5, 2015 at 3:24 PM, Gunnar Morling
>> >> >> <gunnar at hibernate.org> wrote:
>> >> >>
>> >> >>>
>> >> >>>
>> >> >>> 2015-08-05 12:41 GMT+02:00 Sanne Grinovero <sanne at hibernate.org>:
>> >> >>>
>> >> >>>> Our current implementation converts Date into the long "distance
>> >> >>>> from epoch" to allow correct range-queries treating each Date as an
>> >> >>>> instant in time - allowing a universal sorting strategy. But a
>> >> >>>> LocalDate is not an instant-in-time.
>> >> >>>>
>> >> >>>> A LocalDate is intentionally oblivious of the timezone; as the
>> >> >>>> javadoc states, it's useful for birthdays, i.e. symbolic occurrences
>> >> >>>> and potentially legal matters which don't fit into a universal
>> >> >>>> sorting model but rather with the local political scene - we would
>> >> >>>> need the combo {LocalDate, ZoneId} provided to be able to allow
>> >> >>>> sorting across different LocalDate - or simply assume that they are
>> >> >>>> all referring to the same Zone.
>> >> >>>>
>> >> >>>
>> >> >>> Right, I had the latter in mind and would use UTC for that purpose.
>> >> >>>
>> >> >>>>
>> >> >>>> I think that if the user is using a LocalDate type, he's implicitly
>> >> >>>> hinting that the timezone is not relevant for the practical use
>> >> >>>> (possibly even wrong); the most faithful representation would be
>> >> >>>> the string form in ISO standard format or to encode the
>> >> >>>> day,month,year as independent fields? This last detail depends on
>> >> >>>> how it would be more efficient to store & query; probably the
>> >> >>>> String format YYYYMMDD would be the most efficient internal
>> >> >>>> representation to allow also correct sorting.
>> >> >>>>
>> >> >>>> I wouldn't use NumericField(s) in this case, as they are more
>> >> >>>> effective only with larger ranges, while MM and DD are very short;
>> >> >>>> not sure if it's worth splitting the year as a NumericField either,
>> >> >>>> as the values will likely be strongly clustered in the same range of
>> >> >>>> "recent years" - although that might depend on the application but
>> >> >>>> it doesn't seem worth the complexity, so I'd index & store as a
>> >> >>>> String YYYYMMDD.
>> >> >>>>
>> >> >>>
>> >> >>> Agreed that this makes most sense, given the "symbolic" nature of
>> >> >>> LocalDate.
>> >> >>>
>> >> >>> What would you do though in case of the following:
>> >> >>>
>> >> >>>     @DateBridge
>> >> >>>     LocalDate myDate;
>> >> >>>
>> >> >>> encoding() defaults to NUMERIC, so would you a) raise an error, or
>> >> >>> b) ignore encoding() for LocalDate and friends? Both seem not right
>> >> >>> to me. I think there is nothing wrong with using NUMERIC encoding
>> >> >>> per-se for these types. We may recommend STRING but if NUMERIC
>> >> >>> really is what a user wants I would let them do so.
>> >> >>>
>> >> >>>>
>> >> >>>> -- Sanne
>> >> >>>>
>> >> >>>>
>> >> >>>> On 5 August 2015 at 11:10, Gunnar Morling <gunnar at hibernate.org>
>> >> >>>> wrote:
>> >> >>>> > Hi,
>> >> >>>> >
>> >> >>>> > What's the motivation for using a different representation in
>> >> >>>> > that case?
>> >> >>>> >
>> >> >>>> > For the sake of consistency, I'd use milliseconds since 1970-01-01
>> >> >>>> > across the board. Otherwise it'll be more difficult to compare
>> >> >>>> > fields created from properties of different date types.
>> >> >>>> >
>> >> >>>> > --Gunnar
>> >> >>>> >
>> >> >>>> >
>> >> >>>> > 2015-08-04 19:49 GMT+02:00 Davide D'Alto <davide at hibernate.org>:
>> >> >>>> >
>> >> >>>> >> Hi,
>> >> >>>> >> I started to work on the creation of the bridges for the classes
>> >> >>>> >> in the java.time package.
>> >> >>>> >>
>> >> >>>> >> I was wondering if we want to convert the values to long using
>> >> >>>> >> the existing approach we have now for java.util.Date.
>> >> >>>> >>
>> >> >>>> >> In Hibernate Search a java.util.Date is converted into a long
>> >> >>>> >> that represents the number of milliseconds since January 1, 1970,
>> >> >>>> >> 00:00:00 GMT using getTime().
>> >> >>>> >>
>> >> >>>> >> The same value can be obtained from a java.time.LocalDate via:
>> >> >>>> >>
>> >> >>>> >>         long epochMilli = date.atStartOfDay( ZoneOffset.UTC ).toInstant().toEpochMilli();
>> >> >>>> >>
>> >> >>>> >> LocalDate has a method that returns the same value expressed in
>> >> >>>> >> number of days:
>> >> >>>> >>
>> >> >>>> >>         long epochDay = date.toEpochDay();
>> >> >>>> >>
>> >> >>>> >>
>> >> >>>> >> I would use the second approach
>> >> >>>> >>
>> >> >>>> >> Davide
>> >> >>>> >> _______________________________________________
>> >> >>>> >> hibernate-dev mailing list
>> >> >>>> >> hibernate-dev at lists.jboss.org
>> >> >>>> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>> >> >>>> >>
>> >> >>>>
>> >> >>>
>> >> >>>
>> >> >>
>> >
>> >
>>

