Lucene Spatial


http://talks.thesteve0.com/lucenespatial

Presented by:

Steven Pousty

@TheSteve0 on Twitter, IRC, Ingress, SmugMug, Skype, and Github

Agenda

  1. A little about Lucene Spatial
  2. See some demos

Assumptions

  1. You write some code
  2. You will ask questions

About Lucene

 

Created by Doug Cutting in 1999

Some facts

  • Its main focus has always been text indexing and search
  • The main project is in Java, with ports to Python, .NET, and many other languages
  • Has no server daemon; it is an embedded library, similar to SQLite
  • Apache top-level project
  • Why not a database?

    • Lucene has amazing analysis of text values
    • Handles internationalization with ease
    • Can handle incremental updates
    • Compiled flat files with inverted indices make searches very fast
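To make the "inverted index" idea concrete, here is a minimal sketch in plain Java. It is not Lucene code and it skips analysis, scoring, and on-disk storage; the class name `ToyInvertedIndex` and its methods are hypothetical and exist only for this illustration. The point is that a search becomes a lookup per term instead of a scan over every document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, not Lucene): term -> sorted ids of documents containing it. */
final class ToyInvertedIndex {
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    /** Lower-cases the text, splits it on non-letter/digit runs, and records each term. */
    int addDocument(String text) {
        int docId = nextDocId++;
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    /** Returns the ids of documents that contain every query term (boolean AND). */
    List<Integer> search(String... terms) {
        TreeSet<Integer> result = null;
        for (String term : terms) {
            TreeSet<Integer> docs =
                postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // first term: start from its postings
            } else {
                result.retainAll(docs);         // later terms: intersect
            }
        }
        return result == null ? new ArrayList<Integer>() : new ArrayList<Integer>(result);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("Lucene indexes text");
        index.addDocument("Lucene Spatial indexes shapes");
        System.out.println(index.search("lucene", "spatial"));
    }
}
```

Lucene adds text analysis, compact file formats, and relevance scoring on top of this basic structure.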

    Various Lucene Projects

    Lucene
    • Focus: Underlying library
    • Handles spatial: Spatial4J and Lucene Spatial
    • Link to get started: Stack Overflow

    Solr
    • Focus: Enterprise search platform
    • Handles spatial: Uses the Lucene implementation
    • Link to get started: Solr

    ElasticSearch
    • Focus: Distributed real-time search and analytics engine for the cloud
    • Handles spatial: Uses the Lucene implementation
    • Link to get started: ElasticSearch

    Hibernate Search
    • Focus: ORM for Lucene
    • Handles spatial: Custom
    • Link to get started: Hibernate Search

    General Flow in Lucene

     

    The pieces in Lucene Spatial

    Spatial4J

    • Provides the geographic shapes
    • Understands all the hard geography and geometry bits

    Lucene Spatial

    • Provides the indexing strategies
    • Also provides the ability to run spatial queries with the shapes

    Lead maintainer is Dave Smiley, a freelance search consultant and developer

    How does Lucene Spatial work - indexing

    1. You create a spatial strategy that determines how things will be indexed
    2. Then for each "document", you create one (or more) spatial objects to store in a field
    3. Then you use the spatial strategy to transform the object into an indexable field
    4. Add the field to the document
    5. Add the document to the index
    6. Commit and save your index

    How does Lucene Spatial work - searching

    1. You create a spatial strategy that determines how things will be searched - it must match the strategy used for indexing
    2. Then you create a shape that you want to use in your search
    3. Make spatial arguments - a shape and a spatial operation (e.g. intersects)
    4. Use the arguments to make a filter or query
    5. Do the search using query or filter
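The indexing and searching steps above can be mimicked in plain Java to show why the same strategy has to be used on both sides. This is a toy sketch, not the Lucene Spatial API: the class `ToyQuadIndex`, its methods, and the quadrant letters A to D are all assumptions made for illustration. Indexing turns a point into cell tokens; searching turns the query area into a token built the same way and looks it up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Toy stand-in for a spatial strategy (hypothetical): quad tree cell tokens in an inverted index. */
final class ToyQuadIndex {
    private final int maxLevels;
    private final Map<String, TreeSet<String>> postings = new HashMap<>();

    ToyQuadIndex(int maxLevels) {
        this.maxLevels = maxLevels;
    }

    /** Cell token for a point: one letter per level, each picking a quadrant of the current box. */
    static String cell(double lat, double lon, int levels) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder token = new StringBuilder();
        for (int i = 0; i < levels; i++) {
            double latMid = (latMin + latMax) / 2;
            double lonMid = (lonMin + lonMax) / 2;
            boolean north = lat >= latMid;
            boolean east = lon >= lonMid;
            token.append(north ? (east ? 'B' : 'A') : (east ? 'D' : 'C'));
            if (north) latMin = latMid; else latMax = latMid;
            if (east) lonMin = lonMid; else lonMax = lonMid;
        }
        return token.toString();
    }

    /** Indexing: store the document id under every prefix of its finest cell token. */
    void add(String docId, double lat, double lon) {
        String full = cell(lat, lon, maxLevels);
        for (int len = 1; len <= full.length(); len++) {
            postings.computeIfAbsent(full.substring(0, len), k -> new TreeSet<>()).add(docId);
        }
    }

    /** Searching: ids of documents whose point falls inside the given cell, sorted by id. */
    List<String> inCell(String cellToken) {
        TreeSet<String> hits = postings.get(cellToken);
        return hits == null ? new ArrayList<String>() : new ArrayList<String>(hits);
    }

    public static void main(String[] args) {
        ToyQuadIndex index = new ToyQuadIndex(6);
        index.add("portland", 45.52, -122.68);
        index.add("seattle", 47.61, -122.33);
        index.add("sydney", -33.87, 151.21);
        // Query: everything in the level-2 cell that contains (46.0, -120.0).
        System.out.println(index.inCell(cell(46.0, -120.0, 2)));
    }
}
```

If the search side built its tokens with a different grid or alphabet than the index side, the lookups would simply miss, which is the reason the two strategies must match. The real library handles arbitrary shapes and operations, not just "point in one cell".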

    Indexing - the main strategy is RecursivePrefixTree

    It uses a Geohash or quad tree grid to split the world into cells
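As a rough illustration of why geohashes suit a prefix tree, here is a small standalone encoder for the standard geohash scheme. It is a sketch written for this deck, not code taken from Lucene or Spatial4J, and the class name `GeohashSketch` is made up. Each extra character narrows the cell, so a shorter hash is always a prefix of a longer hash for the same point.

```java
/** Standalone geohash encoder (illustrative sketch; class name is hypothetical). */
final class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude/longitude pair into a geohash of the requested length. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean useLon = true;  // bits alternate, starting with longitude
        int bitCount = 0;
        int value = 0;
        while (hash.length() < length) {
            if (useLon) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else            { value = value << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else            { value = value << 1;       latMax = mid; }
            }
            useLon = !useLon;
            if (++bitCount == 5) {  // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(value));
                bitCount = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 5)); // prints "u4pru"
        System.out.println(encode(57.64911, 10.40744, 8)); // longer hash, same "u4pru" prefix
    }
}
```

A quad tree grid works the same way, except that each level splits the cell into four quadrants instead of packing five bits into a character.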

    Future

    1. A lot of improvements coming
    2. Better Spatial Indexing
    3. Better Handling of Geodesic
    4. Better Benchmarking
    5. HEAT MAPS!!!!

    Too much talk - CODE

    The indexer code (can run anywhere you have a JVM)

    Spatial Lucene Indexer

    The REST web service (ready to run on OpenShift)

    Lucene Spatial

    How you would use OpenShift to spin this up

    rhc app create myapp tomcat7

    OR

    rhc app create myapp jbosseap

    One Source to Bind Them All


     

     

    Let's wrap it up

    1. Spatial + Full Text is easy and fun with Lucene
    2. Easy to get up and running