[
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-923?pag...
]
Emmanuel Bernard commented on HSEARCH-923:
------------------------------------------
Copy from hibernate-dev email
Nicolas and I have made good progress on Geospatial queries for Hibernate Search.
Our goal is to give a reasonable but pragmatic answer to geoloc queries. We do not try to
implement the most obscure geo-projection favored by ancient Greeks, nor do we try to
find matching elements within a triangular-shaped donut on Mars' surface, etc. We have
purposely limited the current implementation to:
- find matching elements in a circle (we plan to extend this to matching elements in a
rectangle if popular demand arises, but in our opinion this would not be useful and could
even be misleading)
- use the internationally accepted geo projection, as it is i18n-neutral and not centered
on one particular country. We can plan on opening up to other projections if the need arises
(especially if data points are provided in different projections).
We made sure to expose as few gory details as possible.
That being said, here is more information, along with some questions.
The JIRA is
https://hibernate.onjira.com/browse/HSEARCH-923
The branch is
https://github.com/emmanuelbernard/hibernate-search/tree/HSEARCH-923
h2. How is geoloc data exposed to the domain model?
We plan on supporting three approaches, plus possibly a fourth:
h3. Special interface and embeddable object
Using a specific interface as the property return type: `o.h.s.spatial.Coordinates`
{code}
@Indexed
public class Address {
@Field String city;
@Spatial Coordinates location = new Coordinates() {
public double getLatitude() { ... }
public double getLongitude() { ... }
};
}
{code}
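For illustration, the contract implied by the example above can be sketched as follows. The interface is reduced to the two getters shown in the example; `FixedCoordinates` is a hypothetical helper, not part of Hibernate Search, and the assumption that values are expressed in degrees is mine. The annotations are left out so the snippet compiles on its own.

```java
// Hypothetical stand-in for o.h.s.spatial.Coordinates, reduced to the two getters shown above.
interface Coordinates {
    double getLatitude();   // assumed to be degrees, -90..90
    double getLongitude();  // assumed to be degrees, -180..180
}

// Hypothetical immutable value object an entity could return from its @Spatial property.
final class FixedCoordinates implements Coordinates {
    private final double latitude;
    private final double longitude;

    FixedCoordinates(double latitude, double longitude) {
        // Reject values outside the assumed degree ranges early, before they reach the index.
        if (latitude < -90.0 || latitude > 90.0) {
            throw new IllegalArgumentException("latitude out of range: " + latitude);
        }
        if (longitude < -180.0 || longitude > 180.0) {
            throw new IllegalArgumentException("longitude out of range: " + longitude);
        }
        this.latitude = latitude;
        this.longitude = longitude;
    }

    @Override
    public double getLatitude() {
        return latitude;
    }

    @Override
    public double getLongitude() {
        return longitude;
    }
}
```

An entity would then declare something like `@Spatial Coordinates location = new FixedCoordinates(48.8566, 2.3522);` instead of an anonymous class.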
h3. Special interface implemented by the entity
Using a specific interface implemented by the entity: `o.h.s.spatial.Coordinates`
{code}
@Indexed @Spatial
public class Address implements Coordinates {
@Field String city;
public double getLatitude() { ... }
public double getLongitude() { ... }
}
{code}
h3. Use JTS's Point type
Use `Point` as the spatial property type.
h3. (maybe) `double` hosted by two unrelated properties
The problem is to find a nice way to bind these properties to the spatial data.
h2. How is geoloc data indexed?
There will be two strategies:
- index latitude and longitude as is and do an AND query matching both. This typically
works nicely for small datasets.
- index latitude and longitude as matching a 15-level grid (from most generic to most
specific). This typically works nicely for big datasets.
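A rough sketch of the two ideas, for readers who want something concrete. It assumes a simple scheme in which level 0 is a single cell covering the whole map and each further level doubles the number of cells per axis; the actual grid layout in the branch may well differ. The class name, method names and the `level:x:y` cell id format are all made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the indexing code from the HSEARCH-923 branch.
final class SpatialIndexSketch {
    static final int LEVELS = 15;

    // Strategy 1: the logical equivalent of ANDing a latitude range and a longitude range.
    static boolean inRanges(double lat, double lon,
                            double minLat, double maxLat,
                            double minLon, double maxLon) {
        return lat >= minLat && lat <= maxLat && lon >= minLon && lon <= maxLon;
    }

    // Strategy 2: one cell id per level, from the coarsest level (0) to the finest (14).
    static List<String> gridCells(double lat, double lon) {
        List<String> cells = new ArrayList<>(LEVELS);
        for (int level = 0; level < LEVELS; level++) {
            long cellsPerAxis = 1L << level; // 1, 2, 4, ... cells along each axis
            long x = cellIndex(lon, -180.0, 360.0, cellsPerAxis);
            long y = cellIndex(lat, -90.0, 180.0, cellsPerAxis);
            cells.add(level + ":" + x + ":" + y);
        }
        return cells;
    }

    // Maps a value in [min, min + span] to a cell index in [0, cellsPerAxis - 1].
    static long cellIndex(double value, double min, double span, long cellsPerAxis) {
        long index = (long) Math.floor((value - min) / span * cellsPerAxis);
        // Clamp so that the upper edge (e.g. longitude 180) falls into the last cell.
        return Math.min(Math.max(index, 0), cellsPerAxis - 1);
    }
}
```

With such a layout, a query only has to match the cell ids that overlap the search circle at a suitable level, rather than scanning two numeric ranges.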
h2. Query DSL
We have worked on adding a fluent spatial API to the current query DSL. Before we go on to
implement it, we would like your feedback. Some points remain open.
h3. General overview
{code}
builder.spatial()
.scoreByProximity() //not implemented yet
.onField("coord")
.boostedTo(2)
.within(2).km()
.of( coordinates )
.createQuery();
{code}
h3. onField
onField is not a good name. It has a slightly different meaning than when it's used in
range().onField(). We need to find a better name:
- onField
- onGrid
- onCoordinates
- onLocation
This really represents the metadata set where the location will be stored. In the boolean
approach, we store latitude and longitude. In the grid approach, we store latitude,
longitude and the set of grid cells the coordinates belong to.
.onField() accepts a field name, which can be the `Coordinates` property or the virtual
field used by the class-level bridge (if lat and long are top-level properties).
When latitude and longitude are independent properties, we would use
{code}
builder.spatial()
.onLatitudeField("lat")
.andLongitudeField("long")
{code}
h3. Surface checked
h4. Option 1: centeredOn
{code}
.centeredOn(double, double)
//or
.centeredOn()
.latitude(double)
.longitude(double)
//or
.centeredOn(SpatialIndexable)
.centeredOn(JTS.Point) // hard dependency on JTS even for non spatial users :(
.centeredOn(Object) //? to avoid JTS dep
{code}
- Should we have a version accepting Object?
- What is best, centeredOn(double, double) or
centeredOn().latitude(double).longitude(double)?
h4. Option 2: in / within
{code}
//query within circle
b.spatial()
.onField("coord")
.within(2).km()
.of(SpatialIndexable)
.within(2).km()
.of()
.latitude()
.longitude()
.createQuery()
//or with a different unit handling
//query within circle
b.spatial()
.onField("coord")
.within(2, Unit.km)
.of(SpatialIndexable)
.within(2, Unit.km)
.of()
.latitude()
.longitude()
.createQuery()
{code}
My reason to support units is that a. it's explicit and b. when those geosuckers
improve, we could support time units like mins or hours. Note, that's a very hard
problem to crack and solutions are resource intensive and not very accurate. None really
do it correctly, not Google for sure.
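To make the two unit styles above easier to compare, here is a small sketch of how both could normalize to kilometers. Everything in it is hypothetical: `DistanceUnit`, `Within` and their methods are names invented for this example and are not the proposed API.

```java
// Hypothetical unit type for a within(2, Unit.km)-style call.
enum DistanceUnit {
    M(0.001), KM(1.0), MILE(1.609344);

    private final double kilometers; // length of one unit, in kilometers

    DistanceUnit(double kilometers) {
        this.kilometers = kilometers;
    }

    double toKilometers(double value) {
        return value * kilometers;
    }
}

// Hypothetical intermediate step for a within(2).km()-style call.
final class Within {
    private final double value;

    private Within(double value) {
        this.value = value;
    }

    // Style 1: within(2).km() picks the unit with a follow-up method.
    static Within within(double value) {
        if (value < 0) {
            throw new IllegalArgumentException("distance must not be negative: " + value);
        }
        return new Within(value);
    }

    double km() {
        return DistanceUnit.KM.toKilometers(value);
    }

    double miles() {
        return DistanceUnit.MILE.toKilometers(value);
    }

    // Style 2: within(2, DistanceUnit.KM) passes the unit as an argument.
    static double within(double value, DistanceUnit unit) {
        return within(value).value * unit.kilometers(value);
    }
}
```

The first style needs one method per unit on the intermediate type, while the second only needs a new enum constant, which may matter if time-based units are ever added.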
We could support rectangles / boxes if really needed
{code}
//query in box
b.spatial()
.onField("coord")
.inBox()
.from()
.to()
.createQuery();
//more formal but more correct wrt projection
b.spatial()
.onField("coord")
.inBox()
.withUpperLeft()
.withLowerRight()
.createQuery();
{code}
Please give us your feedback.
h2. TODOs
- Implement fluent DSL
- Implement Special interface implemented by the entity
- Implement Use JTS's Point type
- Implement bridge that supports indexing for boolean queries based on lat and long
instead of the grid.
- Implement @Spatial as a marker annotation for the spatial bridge
- Implement variable score based on proximity
Today we use a constant score, i.e. in = 1, out = 0. We can think about a score that goes
from 1 to 0 based on the distance from the center relative to the circle radius.
We can imagine queries that should rank close elements above far elements.
Note we might need a score going from 1 to .5 or some other value. Need to think about
that.
- Write how-to focused doc
- Write doc on perf comparing grid vs boolean queries
- Convert to JBoss logging
- Add unit test using faceting + spatial queries
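As a starting point for the variable score item in the list above, here is a minimal sketch of a linear falloff. It assumes distance is measured as great-circle distance with the haversine formula and a mean Earth radius of 6371 km; the ticket does not say how distance will be computed, and the class and method names are invented for this example.

```java
// Illustrative only: one possible proximity score, not the planned implementation.
final class ProximityScore {
    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, an assumption of this sketch

    // Great-circle distance in kilometers between two points given in degrees.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Guard against rounding pushing sqrt(a) slightly above 1.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // Linear falloff: 1 at the center, minScore on the circle edge, 0 outside the circle.
    static double score(double distanceKm, double radiusKm, double minScore) {
        if (radiusKm <= 0) {
            throw new IllegalArgumentException("radius must be positive: " + radiusKm);
        }
        if (distanceKm > radiusKm) {
            return 0.0;
        }
        return 1.0 - (1.0 - minScore) * (distanceKm / radiusKm);
    }
}
```

Passing `minScore = 0` gives the 1 to 0 variant, and `minScore = 0.5` gives the 1 to .5 variant mentioned above.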
Add support for geospatial queries
----------------------------------
Key: HSEARCH-923
URL:
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-923
Project: Hibernate Search
Issue Type: New Feature
Components: query
Reporter: Emmanuel Bernard
Assignee: Nicolas Helleringer