FOSS4G - Lucene Spatial


http://talks.thesteve0.com

Presented by:

Steven Pousty

@TheSteve0 on Twitter, IRC, and Github

Agenda

  1. A little about Lucene Spatial
  2. Learn a bit about PaaS
  3. See some demos

Assumptions

  1. You know spatial
  2. You write some code
  3. You will ask questions

About Lucene

 

Created by Doug Cutting in 1999

Some facts

  • Its main focus has always been text indexing and search
  • The main project is in Java, with ports to Python, .NET, and other languages
  • Has no server daemon; it is an embedded library, similar to SQLite
  • Apache top-level project
  • Why not a database?

    • Lucene has amazing analysis of text values
    • Handles internationalization with ease
    • Can handle incremental updates
    • Compiled flat files with inverted indices make searches very fast
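To make the inverted index idea concrete, here is a minimal sketch in plain Java. It is not Lucene's implementation or file format; the class name InvertedIndexSketch and its methods are made up for illustration. It only shows why a term lookup is fast: each term maps directly to the sorted ids of the documents that contain it.

```java
import java.util.*;

// Illustrative only: a toy in-memory inverted index, not Lucene's on-disk format.
final class InvertedIndexSketch {
    // term -> sorted ids of the documents containing that term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    /** Splits the text into lowercase terms and records the document id under each term. */
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** A search is a single map lookup instead of a scan over every document. */
    SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Lucene is a search library");
        index.add(2, "Spatial search with Lucene");
        System.out.println(index.search("lucene"));  // [1, 2]
        System.out.println(index.search("spatial")); // [2]
    }
}
```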

    Various Lucene Projects

    PROJECT NAME     | FOCUS                                                           | HANDLES SPATIAL                | LINK TO GET STARTED
    Lucene           | Underlying library                                              | Spatial4J and Lucene Spatial   | Stack Overflow
    Solr             | Enterprise search platform                                      | Uses the Lucene implementation | Solr
    ElasticSearch    | Distributed real-time search and analytics engine for the cloud | Uses the Lucene implementation | ElasticSearch
    Hibernate Search | ORM for Lucene                                                  | Custom                         | Hibernate Search

    General Flow in Lucene

     

    The pieces in Lucene Spatial

    Spatial4J

    • Provides the geographic shapes. Can't use JTS because it is LGPL (insert mini-rant here)
    • Understands all the hard geography and geometry bits

    Lucene Spatial

    • Provides the indexing strategies
    • Also provides the ability to do the spatial queries with the shapes

    Lead maintainer is Dave Smiley, who works for MITRE

    How does Lucene Spatial work - indexing

    1. You create a spatial strategy that determines how things will be indexed
    2. Then for each "document", you create one (or more) spatial objects to store in a field
    3. Then you use the spatial strategy to transform the object into an indexable field
    4. Add the field to the document
    5. Add the document to the index
    6. Commit and Save your index

    How does Lucene Spatial work - searching

    1. You create a spatial strategy that determines how things will be searched - it must match the strategy used for indexing
    2. Then you create a shape that you want to use in your search
    3. Make spatial arguments - a shape and a spatial operation (e.g. intersects)
    4. Use the arguments to make a filter or query
    5. Do the search using query or filter
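The indexing and searching steps can be mimicked in plain Java to show how the pieces fit together. This is a toy model under stated assumptions, not the Lucene Spatial API: QuadStrategy and ToyIndex are hypothetical stand-ins for the spatial strategy and the index, and the only "shape" supported is a single quad tree cell. The point it illustrates is that the same strategy must turn both the indexed objects and the query shape into comparable tokens.

```java
import java.util.*;

// Toy stand-ins only; none of these names exist in Lucene or Spatial4J.
final class PrefixTreeFlowSketch {

    /** Plays the role of the spatial strategy: turns a point into quad tree cell tokens. */
    static final class QuadStrategy {
        private final int maxLevels;

        QuadStrategy(int maxLevels) { this.maxLevels = maxLevels; }

        /** One token per level ("A", "AA", ...): every cell that contains the point. */
        List<String> tokensFor(double lon, double lat) {
            double minX = -180, maxX = 180, minY = -90, maxY = 90;
            StringBuilder cell = new StringBuilder();
            List<String> tokens = new ArrayList<>();
            for (int level = 0; level < maxLevels; level++) {
                double midX = (minX + maxX) / 2, midY = (minY + maxY) / 2;
                boolean east = lon >= midX, north = lat >= midY;
                // Quadrant letters: A = NW, B = NE, C = SW, D = SE
                cell.append(north ? (east ? 'B' : 'A') : (east ? 'D' : 'C'));
                if (east) minX = midX; else maxX = midX;
                if (north) minY = midY; else maxY = midY;
                tokens.add(cell.toString());
            }
            return tokens;
        }
    }

    /** Plays the role of the index: an inverted index from cell token to document ids. */
    static final class ToyIndex {
        private final Map<String, SortedSet<String>> postings = new HashMap<>();

        void addDocument(String docId, List<String> spatialTokens) {
            for (String token : spatialTokens) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }

        /** "Intersects" against one grid cell: every document indexed under that cell. */
        SortedSet<String> search(String cellToken) {
            return postings.getOrDefault(cellToken, Collections.emptySortedSet());
        }
    }

    public static void main(String[] args) {
        QuadStrategy strategy = new QuadStrategy(4); // same strategy for index and search
        ToyIndex index = new ToyIndex();
        index.addDocument("portland", strategy.tokensFor(-122.68, 45.52));
        index.addDocument("vancouver-wa", strategy.tokensFor(-122.66, 45.63));
        index.addDocument("san-francisco", strategy.tokensFor(-122.42, 37.77));
        index.addDocument("sydney", strategy.tokensFor(151.21, -33.87));

        // Query shape: the level-2 cell around Portland
        String queryCell = strategy.tokensFor(-122.68, 45.52).get(1);
        System.out.println(queryCell + " -> " + index.search(queryCell));
    }
}
```

A real strategy also handles shapes that span many cells and filters out false positives at cell edges; the toy skips both.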

    Indexing - main one is RecursivePrefixTree

    It uses either a Geohash or a Quad tree grid
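For readers who have not met geohashes, here is a minimal sketch of the standard public geohash encoding in plain Java. It is not taken from Lucene or Spatial4J, and the class name GeohashSketch is made up for illustration. It shows the property a prefix tree relies on: a shorter hash is always a prefix of a longer hash for the same point, so each extra character names a smaller cell inside the previous one.

```java
// Illustrative only: the standard public geohash algorithm, not Lucene's own code.
final class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a latitude/longitude pair as a geohash with the given number of characters. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean useLon = true; // bits alternate, starting with longitude
        int bits = 0, value = 0;
        while (hash.length() < length) {
            if (useLon) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else            { value = value << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else            { value = value << 1;       latMax = mid; }
            }
            useLon = !useLon;
            if (++bits == 5) { // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 3)); // u4p
        System.out.println(encode(57.64911, 10.40744, 5)); // u4pru
    }
}
```

Because nearby points usually share a prefix, a search for everything inside a cell becomes a search for terms that start with that cell's hash.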

    Too much talk - CODE

    The indexer code (can run anywhere you have a JVM)

    Spatial Lucene Indexer

    The REST web service (ready to run on OpenShift)

    Lucene Spatial

    Let's spin up a Tomcat 7 web application and look at the app

    One Source to Bind Them All


     

     

    But wait - there's more

    1. Free! No time limit
    2. 3 gears (like servers) - each with 512 MB RAM and 1 GB disk
    3. Auto-scaling
    4. Simple pricing

    Let's wrap it up

    1. OpenShift makes life great for you
    2. Spatial + Full Text is easy and fun with Lucene
    3. Free!

    Come hang out with us:
    #openshift on Freenode IRC
    OR
    users@lists.openshift.redhat.com