[hibernate-dev] Coordinates storage in Lucene index for spatial functionality

Nicolas Helleringer nicolas.helleringer at gmail.com
Thu May 3 05:13:53 EDT 2012


2012/5/3 Sanne Grinovero <sanne at hibernate.org>

> The reason for my comment is that the code is doing a conversion to
> radians in the DistanceFilter, which needs to be extremely efficient
> as it's not only applied on the resultset but potentially on the whole
> corpus of all Documents in the index.
> So even if it's true that conversion would be needed on the final
> results, we always expect people to retrieve only a limited amount of
> entities (like with pagination), while the index might need to perform
> this computation millions of times per query.
>
You're right.
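
To put numbers on the cardinality argument, here is a minimal Java sketch that counts degree-to-radian conversions under both storage schemes. It is not the DistanceFilter code: the class and method names are hypothetical, documents are modelled as plain {lat, lon} arrays, and the conversion of the query centre (once per query in both schemes) is left out.

```java
final class ConversionCountSketch {
    private long conversions = 0;

    // Wraps Math.toRadians so that every conversion is counted.
    private double toRadiansCounted(double degrees) {
        conversions++;
        return Math.toRadians(degrees);
    }

    /** Degrees stored in the index: every query converts every document. */
    long queryOverDegrees(double[][] storedDegrees, int queries) {
        conversions = 0;
        double checksum = 0;
        for (int q = 0; q < queries; q++) {
            for (double[] doc : storedDegrees) {
                // Stand-in for the per-document distance computation.
                checksum += toRadiansCounted(doc[0]) + toRadiansCounted(doc[1]);
            }
        }
        return conversions;
    }

    /** Radians stored in the index: each document is converted once, at indexing time. */
    long indexThenQueryOverRadians(double[][] inputDegrees, int queries) {
        conversions = 0;
        double[][] storedRadians = new double[inputDegrees.length][2];
        for (int i = 0; i < inputDegrees.length; i++) {
            storedRadians[i][0] = toRadiansCounted(inputDegrees[i][0]);
            storedRadians[i][1] = toRadiansCounted(inputDegrees[i][1]);
        }
        double checksum = 0;
        for (int q = 0; q < queries; q++) {
            for (double[] doc : storedRadians) {
                // No conversion needed at query time.
                checksum += doc[0] + doc[1];
            }
        }
        return conversions;
    }
}
```

With D documents and Q queries, the first method performs 2 * D * Q conversions and the second 2 * D, whatever Q is.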


> If I look at the complexity of Point.getDistanceTo(double, double), I
> get a feeling that that method will hardly provide speedy queries
> because of the complex computations in it - this is just speculation
> at this point of course, to be sure we'd need to compare them with a
> large enough dataset, but it seems quite obvious that storing
> normalized radians should be more efficient as it would avoid a good
> deal of math to be executed on each Document in the index.
>
Storing radians saves 5 double multiplications in a method that also
performs 2 sines, 2 powers, 2 cosines, 2 square roots and 1 atan2.
IMO it hardly makes a dent, but yes, I do agree it is a gain.
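
For reference, a haversine distance has exactly that operation profile. The Java sketch below is an illustration only, not the actual Point.getDistanceTo code: the class name, the method names and the Earth radius constant are assumptions made for this example, and the conversions it shows need not match the five multiplications counted above.

```java
final class HaversineSketch {
    // Mean Earth radius in kilometres (assumed value for this sketch).
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance when all coordinates are already in radians. */
    static double distanceFromRadians(double lat1, double lon1, double lat2, double lon2) {
        double dLat = lat2 - lat1;
        double dLon = lon2 - lon1;
        // 2 sines, 2 powers, 2 cosines
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(lat1) * Math.cos(lat2) * Math.pow(Math.sin(dLon / 2), 2);
        // 2 square roots, 1 atan2
        double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
        return EARTH_RADIUS_KM * c;
    }

    /** Same distance when coordinates are stored in decimal degrees. */
    static double distanceFromDegrees(double lat1Deg, double lon1Deg, double lat2Deg, double lon2Deg) {
        // The only extra work compared with radians storage: the conversions.
        return distanceFromRadians(
                Math.toRadians(lat1Deg), Math.toRadians(lon1Deg),
                Math.toRadians(lat2Deg), Math.toRadians(lon2Deg));
    }
}
```

The trigonometric calls are identical in both variants; only the conversions on entry differ, which is why the saving is small per call and only matters once multiplied by the number of Documents.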


> Also if we assume people might want to use radians in their user data
> (I know some who definitely would never touch decimals for such a use
> case), there would be no need at all to convert the end result.
>
Right


>
> Some more thoughts inline:
>
> On 3 May 2012 09:12, Nicolas Helleringer <nicolas.helleringer at gmail.com>
> wrote:
> > Hi all,
> >
> > Sanne and I have been wondering about the way the spatial
> > branch/module/functionality for Hibernate Search shall store its
> > coordinates in the Lucene index.
> >
> > Today it is implemented with decimal degrees for:
> > - easy debugging/readability
> > - ease of conversion on storage, as we mainly want to accept decimal
> > degrees from users' data
>
> Valid points, but consider that "storage" is going to be way slower
> anyway, and typically you'll process a Document to evaluate it for a
> hit many many orders of magnitude more frequently than the times you
> store it.

Right


> >
> > Sanne pointed out that when the search is done there are quite a few
> > conversions to radians for distance calculation, and suggested that we
> > may store coordinates directly in their radians form.
> >
> > I have tried a patch to implement this, and as I was coding it I felt
> > that the code was less readable, mainly in the coordinates normalisation,
> > and that there were as many conversions as before.
> > Conversions had moved from search to import/export of coordinates in
> > and out of the spatial module scope to user scope.
>
> I'm sure the amount of points in the code in which they are converted
> won't change. I'm concerned about the cardinality of the collections
> on which it's applied ;)
> "Less readable" isn't nice, but we can work on that I guess?
>
Right


>
> >
> > What the docs do not tell (yet) is that we are expecting WGS 84 (this
> > is a coordinate system) decimal degree coordinates as input, as these
> > are quite a de facto standard (GPS devices output this way).
>
> How does it affect this?
>
Not at all. It should have been noted in my todo list, not in the mail =)


>
> >
> > Today, handling projections is not the purpose of the Hibernate Search
> > spatial initiative. There are open source libs that handle that very
> > well on the user side (Proj4j).
> >
> > So, the question is: shall we store as radians or decimal degrees?
> >
> > Niko
> >
> > P.S.: Hope it is clear. If not, ask for more.
>
> Thanks!
> Sanne
>

I'll do a full patch ASAP.

Niko

