The reason for my comment is that the code converts to radians in the
DistanceFilter, which needs to be extremely efficient because it is
applied not only to the result set but potentially to the whole corpus
of Documents in the index.
So even if it's true that a conversion would be needed on the final
results, we always expect people to retrieve only a limited number of
entities (as with pagination), while the index might need to perform
this computation millions of times per query.
Looking at the complexity of Point.getDistanceTo(double, double), I
suspect that method will hardly provide speedy queries because of the
complex computations in it. This is just speculation at this point, of
course; to be sure we'd need to benchmark both approaches on a large
enough dataset. Still, it seems quite obvious that storing normalized
radians should be more efficient, as it would avoid a good deal of
math being executed for each Document in the index.
Also, if we assume people might want to use radians in their own data
(I know some who would definitely never touch decimal degrees for such
a use case), there would be no need at all to convert the end result.
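
To make the per-Document cost concrete, here is a minimal sketch of a
haversine distance computed from radians versus from decimal degrees.
The class and method names are hypothetical and this is not the actual
Point.getDistanceTo or DistanceFilter code; the Earth radius constant
is also an assumption. The query centre can be converted once per
query in either design, so what storing radians saves is the
conversion of each stored coordinate, while the trigonometry itself
remains. Whether that saving is measurable would still need a
benchmark.

```java
// Hypothetical sketch, not the Hibernate Search implementation.
final class DistanceSketch {

    // Assumed mean Earth radius in kilometres.
    static final double EARTH_MEAN_RADIUS_KM = 6371.0;

    /** Haversine distance in km; all four arguments are in radians. */
    static double distanceRadians(double lat1, double lon1,
                                  double lat2, double lon2) {
        double sinHalfDLat = Math.sin((lat2 - lat1) / 2);
        double sinHalfDLon = Math.sin((lon2 - lon1) / 2);
        double a = sinHalfDLat * sinHalfDLat
                + Math.cos(lat1) * Math.cos(lat2) * sinHalfDLon * sinHalfDLon;
        // Clamp to 1.0 to guard against rounding just above the asin domain.
        return 2 * EARTH_MEAN_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** Same distance from decimal degrees: converts every argument on each call. */
    static double distanceDegrees(double lat1, double lon1,
                                  double lat2, double lon2) {
        return distanceRadians(Math.toRadians(lat1), Math.toRadians(lon1),
                               Math.toRadians(lat2), Math.toRadians(lon2));
    }
}
```

With coordinates indexed in radians, a filter would call
distanceRadians directly for each candidate Document; with decimal
degrees it pays for the conversions in distanceDegrees every time.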
Some more thoughts inline:
On 3 May 2012 09:12, Nicolas Helleringer <nicolas.helleringer(a)gmail.com> wrote:
Hi all,
Sanne and I have been wondering about the way the spatial
branch/module/functionality for Hibernate Search shall store its
coordinates in the Lucene index.
Today it is implemented with decimal degrees for:
- easy debugging/readability
- ease of conversion on storage, as we mainly want to accept decimal
degrees from user data
Valid points, but consider that storage is going to be way slower
anyway, and typically you'll process a Document to evaluate it for a
hit many orders of magnitude more often than you store it.
Sanne pointed out that when the search is done there are quite a few
conversions to radians for the distance calculation, and suggested that
we could store the coordinates directly in radians.
I tried a patch to implement this, and as I was coding it I felt that
the code was less readable, mainly in the coordinate normalisation, and
that there were as many conversions as before.
The conversions had simply moved from search time to the import/export
of coordinates between the spatial module and user code.
I'm sure the number of places in the code where they are converted
won't change. I'm concerned about the cardinality of the collections
to which the conversion is applied ;)
"Less readable" isn't nice, but we can work on that I guess?
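
As a rough illustration of keeping the conversions at the boundary,
the sketch below converts decimal degrees to radians once when a
coordinate is stored and converts back only when a value is handed to
the user. The class name, its methods and the longitude range
[-180, 180) are all assumptions made for this example, not the spatial
module's API.

```java
// Hypothetical sketch, not part of the Hibernate Search spatial module.
final class StoredCoordinateSketch {

    final double latitudeRad;
    final double longitudeRad;

    private StoredCoordinateSketch(double latitudeRad, double longitudeRad) {
        this.latitudeRad = latitudeRad;
        this.longitudeRad = longitudeRad;
    }

    /** Wraps a longitude in decimal degrees into [-180, 180). */
    static double normalizeLongitudeDegrees(double longitudeDeg) {
        return ((longitudeDeg + 180.0) % 360.0 + 360.0) % 360.0 - 180.0;
    }

    /** Called once per stored entity: user-facing degrees in, radians kept. */
    static StoredCoordinateSketch fromDegrees(double latitudeDeg, double longitudeDeg) {
        return new StoredCoordinateSketch(
                Math.toRadians(latitudeDeg),
                Math.toRadians(normalizeLongitudeDegrees(longitudeDeg)));
    }

    /** Called only for entities actually returned to the user. */
    double latitudeDegrees() {
        return Math.toDegrees(latitudeRad);
    }

    /** Called only for entities actually returned to the user. */
    double longitudeDegrees() {
        return Math.toDegrees(longitudeRad);
    }
}
```

The number of conversion sites is the same as before, but they run
once per stored or returned entity instead of once per Document
evaluated by a query.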
What the docs do not say (yet) is that we expect input as WGS 84 (a
coordinate system) decimal degree coordinates, as these are a de facto
standard (GPS devices output them this way).
How does it affect this?
Today it is not the purpose of the Hibernate Search spatial initiative
to handle projections. There are open source libraries that handle that
very well on the user side (Proj4j).
So, the question is: shall we store radians or decimal degrees?
Niko
P.S.: I hope this is clear. If not, ask for more.
Thanks!
Sanne