miércoles, 22 de noviembre de 2017

Playing with Redis Geo structures

One of the things I've never used in Redis were the commands provided to store and access geolocated data.   From version 3.2 Redis includes 6 new very simple commands that can be used to store tags associated with coordinates and calculate distances between those tags or find the tags around some specific coordinates.

Let's try to build a simple prototype to see how it works.   Imagine that you have a service where users can rate anything (people, places, restaurants...)* and at some point you want to show in the UI of your app what things other people around you are rating right now. 

Almost every database right now has support to store this type of geographically located information but for use cases like this one where you want very fast access to ephemeral information an in-memory database can be a very good choice.

Storing data

The GEOADD command in Redis is the one you have to insert a new tag asociated with a specific position (latitude, longitude).

The structure supported in Redis for geolocated data has an insertion and query time that is O(log(N)) complexity where N is the number of items in the set, so probably you don't want to have all the data in the same set but partition it by country or some other grouping that makes sense for your use case.   In our example we could try partitioning it per city.

So everytime somebody rates something identified by a tag we will do this insert in redis:
GEOADD city latitude longitude tag
For example:
GEOADD sanfrancisco -122 37 goldengate
Redis stores this geographical information internally in a Sorted Set, so we can use any of the sorted set commands to manipulate or retrieve the list of items stored:
ZRANGE sanfrancisco 0 -1
1) "goldengate"

Retrieving data

There are two commands that you can use to make geographical queries on the stored data depending on your use case:


For our use case, everytime somebody opens the app we will retrieve all the tags around his position sorted by distance.

GEORADIUS sanfrancisco -122.2 37.1 5 km
1) "goldengate"

How does it work internally

As we mentioned before everything is stored inside Redis in the existing Sorted Set structures.

The way those zsets are leveraged are by using a score based on the latitude and longitude.  Basically by generating the zset scores interleaving the bits of the latitude and longitude of each entry you can later make queries to retrieve all the tags in a specific geographical square as a range of those scores.

That way with 9 ranges you can get all the areas around a specific point.  And those ranges can be of any size to be able to make queries using different radius just by trimming bits at the end of the score.

This technique is called geohashing and makes this geo commands very easy to implement on top of sorted sets.

Hope this is useful for other people implementing similar services, the truth is I never stop being amazed by Redis...

* Disclaimer: I built that service with some friends, you can see it in http://www.pleason.com/

No hay comentarios:

Publicar un comentario