[hibernate-dev] [HSEARCH] Geospatial indexing and queries

Karel Maesen karel at geovise.com
Wed Dec 21 09:30:56 EST 2011


Hi Emmanuel,

My preference:

.forLocation("location")
.forLatitudeField("lat").forLongitudeField("long")

Btw, I quite like "LatLong" as an alternative for the term "Coordinates" or "Location". It makes clear what the meaning is of the coordinates/location interface/method, and conditions the reader as to the order of latitude and longitude. It is also used in the Google Maps JS API (well, almost: they use the imo uglier "LatLng").
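
To illustrate the naming point, here is a minimal sketch of a value type whose name fixes the argument order. The `LatLong` class and its `of` factory are hypothetical; nothing like it exists in the HSEARCH-923 branch as far as this thread shows. The range checks are an assumption added for illustration: they catch some swapped arguments (a "latitude" above 90), but not all.

```java
/** Hypothetical value type: the name itself documents the (latitude, longitude) order. */
final class LatLong {
    private final double latitude;
    private final double longitude;

    private LatLong(double latitude, double longitude) {
        // Written as negated range checks so that NaN is rejected as well.
        if (!(latitude >= -90.0 && latitude <= 90.0)) {
            throw new IllegalArgumentException("latitude out of range: " + latitude);
        }
        if (!(longitude >= -180.0 && longitude <= 180.0)) {
            throw new IllegalArgumentException("longitude out of range: " + longitude);
        }
        this.latitude = latitude;
        this.longitude = longitude;
    }

    /** Factory method; arguments are in the order the type name announces. */
    static LatLong of(double latitude, double longitude) {
        return new LatLong(latitude, longitude);
    }

    double getLatitude() {
        return latitude;
    }

    double getLongitude() {
        return longitude;
    }
}
```

With this, `LatLong.of(48.858333d, 2.294444d)` reads unambiguously, and `LatLong.of(120.0d, 45.0d)` fails fast because 120 is not a valid latitude.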

Regards,

Karel

On 21 Dec 2011, at 14:59, Emmanuel Bernard wrote:

> Which one do you prefer (line in bold for each sample)?
> 
> 			builder
> 				.spatial()
> 					.onCoordinates( "location" )  or .forCoordinates("location") or .forLocation("location") or something else
> 					.within( 500, Unit.KM )
> 						.of().latitude(48.858333d).longitude(2.294444d)
> 					.createQuery();
> 
> 
> 
> 			builder
> 				.spatial()
> 					.onLatitudeField( "latitude" ).onLongitudeField( "longitude" ) or .forLatitudeField( "latitude" ).forLongitudeField( "longitude" )  or something else
> 					.within( 51, Unit.KM )
> 						.of().latitude( 24d ).longitude( 31.5d )
> 					.createQuery();
> 
> On 5 Dec 2011, at 16:21, Emmanuel Bernard wrote:
> 
>> Nicolas and I have made good progress on Geospatial queries for Hibernate Search.
>> 
>> # Geospatial indexing and queries
>> 
>> Our goal is to give a reasonable but pragmatic answer to geoloc queries. We do not try to implement the most obscure geo-projection favored by ancient Greeks, nor to find matching elements within a triangular-shaped donut on Mars' surface, etc. We have purposely limited the current implementation to:
>> 
>> - find matching elements in a circle (we plan to extend this to matching elements in a rectangle if popular demand arises, but in our opinion this will not be useful, or will even be misleading)
>> - use the internationally accepted geo projection, as it is i18n neutral and not centered on one particular country. We can open up to other projections if the need arises (especially if data points are provided in different projections).
>> 
>> We made sure to expose as few gory details as possible.
>> 
>> That being said, here is more information and some questions.
>> 
>> The JIRA is https://hibernate.onjira.com/browse/HSEARCH-923
>> The branch is https://github.com/emmanuelbernard/hibernate-search/tree/HSEARCH-923
>> 
>> ## How is geoloc data exposed to the domain model?
>> 
>> We plan on supporting three approaches, plus possibly a fourth:
>> 
>> ### Special interface and embeddable object 
>> 
>> Using a specific interface as the property return type: `o.h.s.spatial.Coordinates`
>> 
>>   @Indexed
>>   public class Address {
>>       @Field String city;
>>       @Spatial Coordinates location = new Coordinates() {
>>           public double getLatitude() { ... }
>>           public double getLongitude() { ... }
>>       };
>>   }
>> 
>> ### Special interface implemented by the entity
>> 
>> Using a specific interface implemented by the entity: `o.h.s.spatial.Coordinates`
>> 
>>   @Indexed @Spatial
>>   public class Address implements Coordinates {
>>       @Field String city;
>> 
>>       public double getLatitude() { ... }
>>       public double getLongitude() { ... }
>>   }
>> 
>> ### Use JTS's Point type
>> 
>> Use `Point` as the spatial property type.
>> 
>> ### (maybe) `double` hosted by two unrelated properties
>> 
>> The problem is to find a nice way to bind these properties to the spatial data.
>> 
>> ## How is geoloc data indexed
>> 
>> There will be two strategies
>> 
>> - index latitude and longitude as is and do an AND query matching both. This typically works nicely for small datasets.
>> - index latitude and longitude as matching a 15-level grid (from most generic to most specific). This typically works nicely for big datasets.
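
A rough sketch of the two strategies, for discussion only. The email gives the number of grid levels (15) but not the cell layout, so the scheme below is an assumption: at level n each axis is split into 2^n equal cells, and a cell id is written as `latIndex|longIndex`. The class and method names (`GridSketch`, `cellIndex`, `cellIds`, `inRanges`) are hypothetical and not taken from the branch.

```java
/** Illustrative only: an assumed grid layout, not the one used in HSEARCH-923. */
final class GridSketch {
    /** Number of grid levels mentioned in the proposal. */
    static final int LEVELS = 15;

    /** Index of the cell containing value, assuming 2^level equal cells over [min, max]. */
    static int cellIndex(double value, double min, double max, int level) {
        int cells = 1 << level;
        int index = (int) Math.floor((value - min) / (max - min) * cells);
        // Clamp so that value == max lands in the last cell instead of one past it.
        return Math.min(Math.max(index, 0), cells - 1);
    }

    /** One cell id per level, coarsest first; these would be the indexed terms. */
    static String[] cellIds(double latitude, double longitude) {
        String[] ids = new String[LEVELS];
        for (int level = 0; level < LEVELS; level++) {
            ids[level] = cellIndex(latitude, -90.0, 90.0, level)
                    + "|" + cellIndex(longitude, -180.0, 180.0, level);
        }
        return ids;
    }

    /**
     * First strategy: a plain AND of two range checks.
     * This sketch ignores boxes that cross the 180th meridian.
     */
    static boolean inRanges(double latitude, double longitude,
                            double minLat, double maxLat,
                            double minLong, double maxLong) {
        return latitude >= minLat && latitude <= maxLat
                && longitude >= minLong && longitude <= maxLong;
    }
}
```

Under this assumed layout, a query only needs to match the cell ids at the one level whose cell size is close to the search radius, instead of evaluating two numeric ranges over every document, which would be one reason for the grid to scale better.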
>> 
>> ## Query DSL
>> 
>> We have worked on adding a fluent spatial API to the current query DSL. Before we go on implementing it, we would like your feedback. Some points remain open.
>> 
>> ### General overview
>> 
>>   builder.spatial()
>>       .scoreByProximity() //not implemented yet
>>       .onField("coord")
>>           .boostedTo(2)
>>       .within(2).km()
>>           .of( coordinates )
>>       .createQuery();
>> 
>> ### onField
>> 
>> onField is not a good name. It has a slightly different meaning than when it's used in range().onField(). We need to find a better name:
>> 
>> - onField
>> - onGrid
>> - onCoordinates
>> - onLocation
>> 
>> This really represents the metadata set where the location will be stored. In the boolean approach, we store latitude and longitude. In the grid approach, we store latitude, longitude and the set of grids the coordinates belong to.
>> 
>> .onField() does accept a field name which can be the `Coordinates` property or the virtual field used by the class-level bridge (if lat and long are top level properties).
>> 
>> When latitude and longitude are independent properties, we would use
>> 
>>   builder
>>       .onLatitudeField("lat")
>>       .andLongitudeField("long")
>> 
>> ### Surface checked
>> 
>> #### Option 1: centeredOn
>> 
>>   .centeredOn(double, double)
>>   //or
>>   .centeredOn()
>>     .latitude(double)
>>     .longitude(double)
>>   //or
>>   .centeredOn(SpatialIndexable)
>>   .centeredOn(JTS.Point) // hard dependency on JTS even for non spatial users :(
>>   .centeredOn(Object) //? to avoid JTS dep
>> 
>> - Should we have a version accepting Object?
>> - What is best, centeredOn(double, double) or centeredOn().latitude(double).longitude(double)?
>> 
>> #### Option 2: in / within
>> 
>> 
>>   //query within circle
>>   b.spatial()
>>       .onField("coord")
>>       .within(2).km()
>>       .of(SpatialIndexable)
>> 
>>       .within(2).km()
>>       .of()
>>           .latitude()
>>           .longitude()
>>        .createQuery()
>> 
>>  //or with a different unit handling
>> 
>>   //query within circle
>>   b.spatial()
>>       .onField("coord")
>>       .within(2, Unit.km)
>>       .of(SpatialIndexable)
>> 
>>       .within(2, Unit.km)
>>       .of()
>>           .latitude()
>>           .longitude()
>>        .createQuery()
>> 
>> My reason to support units is that a. it's explicit and b. when those geosuckers improve, we could support time units like mins or hours. Note, that's a very hard problem to crack and solutions are resource intensive and not very accurate. None really do it correctly, not Google for sure.
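
For reference, a minimal sketch of what `within( radius, unit ).of( lat, long )` has to decide for each candidate, assuming the great-circle (haversine) distance on a spherical Earth of mean radius 6371 km. The thread does not say which distance formula the branch uses, so both the formula and the names here (`CircleSketch`, its nested `Unit` enum with `KM` and `M`, `distanceKm`, `within`) are illustrative assumptions.

```java
/** Illustrative circle test; names and formula are assumptions, not the HSEARCH-923 code. */
final class CircleSketch {
    /** Mean Earth radius in kilometres (spherical approximation). */
    static final double EARTH_MEAN_RADIUS_KM = 6371.0;

    /** Hypothetical unit enum in the spirit of within(2, Unit.km). */
    enum Unit {
        KM(1.0), M(0.001);

        private final double kmPerUnit;

        Unit(double kmPerUnit) {
            this.kmPerUnit = kmPerUnit;
        }

        double toKm(double value) {
            return value * kmPerUnit;
        }
    }

    /** Haversine great-circle distance in kilometres between two points given in degrees. */
    static double distanceKm(double lat1, double long1, double lat2, double long2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double halfDeltaPhi = (phi2 - phi1) / 2.0;
        double halfDeltaLambda = Math.toRadians(long2 - long1) / 2.0;
        double a = Math.sin(halfDeltaPhi) * Math.sin(halfDeltaPhi)
                + Math.cos(phi1) * Math.cos(phi2)
                * Math.sin(halfDeltaLambda) * Math.sin(halfDeltaLambda);
        // min() guards against rounding pushing sqrt(a) slightly above 1.
        return 2.0 * EARTH_MEAN_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** True if (lat, long) lies within the given radius of the center. */
    static boolean within(double radius, Unit unit,
                          double centerLat, double centerLong,
                          double lat, double long_) {
        return distanceKm(centerLat, centerLong, lat, long_) <= unit.toKm(radius);
    }
}
```

Keeping the unit as an explicit argument means the comparison always happens in one internal unit (kilometres here), which is what would leave room for other unit families later.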
>> 
>> 
>> We could support rectangles / boxes if really needed
>> 
>>   //query in box
>>   b.spatial()
>>       .onField("coord")
>>       .inBox()
>>           .from()
>>           .to()
>>        .createQuery();
>> 
>>   //more formal but more correct wrt projection
>>   b.spatial()
>>       .onField("coord")
>>       .inBox()
>>           .withUpperLeft()
>>           .withLowerRight()
>>        .createQuery();
>> 
>> 
>> Please give us your feedback.
>> 
>> ## TODOs
>> 
>> - Implement fluent DSL
>> - Implement the "special interface implemented by the entity" approach
>> - Implement the "use JTS's Point type" approach
>> - Implement bridge that supports indexing for boolean queries based on lat and long instead of the grid.
>> - Implement @Spatial as a marker annotation for the spatial bridge
>> - Implement variable score based on proximity
>>   Today we use a constant score, i.e. in = 1, out = 0. We can think about a score that goes from 1 to 0 based on the distance from the center relative to the circle size.
>>   We can imagine queries that should return close elements above far elements.
>>   Note we might need a score going from 1 to .5 or some other value. Need to think about that.
>> - Write how-to focused doc
>> - Write doc on perf comparing grid vs boolean queries
>> - Convert to JBoss logging
>> - Add unit test using faceting + spatial queries
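
On the proximity score item in the TODO list above: one possible shape, sketched here as an assumption rather than a proposal from the thread, is a linear falloff from 1 at the center to a configurable floor at the edge of the circle, and 0 outside it. A floor of 0 gives the "1 to 0" variant, a floor of 0.5 gives the "1 to .5" variant. `ProximityScoreSketch` and `score` are hypothetical names.

```java
/** Illustrative linear proximity score; not part of the HSEARCH-923 branch. */
final class ProximityScoreSketch {
    /**
     * @param distance  distance from the center, in the same unit as radius
     * @param radius    radius of the search circle, must be positive
     * @param edgeScore score given to a match exactly on the circle's edge (0 to 1)
     * @return 1 at the center, edgeScore on the edge, 0 outside the circle
     */
    static float score(double distance, double radius, float edgeScore) {
        if (radius <= 0.0) {
            throw new IllegalArgumentException("radius must be positive: " + radius);
        }
        if (distance > radius) {
            return 0f; // outside the circle: same as today's constant score
        }
        double ratio = Math.max(distance, 0.0) / radius;
        return (float) (1.0 - ratio * (1.0 - edgeScore));
    }
}
```

For example, `score(5, 10, 0f)` yields 0.5 for a match halfway to the edge, while `score(10, 10, 0.5f)` keeps a match on the edge at 0.5 so that it still ranks above anything outside the circle.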
>> 
>> Emmanuel
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
> 




