The other day someone asked me about whether Tableau could generate routes between locations as part of the spatial analysis functionality.  As a built-in function, no…but I’m always up for a good challenge, so I built up a demo for dynamic route calculations using a PostgreSQL database with PostGIS and pgRouting.  For fun, I’ve also tied it to some parameter actions so that it’s easy to click on a point to update the origin or destination of the route.

Sound like fun?  Want to try it yourself?  Here’s some detail on how it all works…

seattle_routing_animation.gif

 

Summary of the process:

  • Set up a PostgreSQL database with the PostGIS and pgRouting extensions enabled
  • Add in a topological roads dataset (I use data from OSM, it’s pretty much ready to go for routing)
  • Create a Tableau workbook with some custom SQL to drive the route generation using the Dijkstra algorithm implemented in pgRouting
  • Bling up your workbook with parameters (and parameter actions) and fancy symbolization

 

Ingredients:

While we’ll visualize our calculated routes in Tableau, the analytics behind the scenes will be driven by a PostgreSQL database with PostGIS and pgRouting.   To simplify this post, I’m going to assume that you are set up with a few tools:

  • PostgreSQL – I installed the latest version from EnterpriseDB
  • PostGIS – I installed from OSGEO (this will get you postGIS, osmconvert, and osm2pgrouting)

 

You’ll also need some network data to work with.  All of the data that I use in this demo are from Open Street Map (road network and points of interest).  If you want to use OSM data, I recommend:

  • OSM HOT Export Tool – you’ll need to create an account, but then it’s super easy to set your area of interest and grab all sorts of data
  • osmconvert – use the Windows 32bit version (this will run on Win 64 as well).
  • osm2pgrouting – if you install PostGIS using the OSGEO bundle you should already have this ready to go

 

If you want to make your own network from non-OSM data, it’s totally do-able, just not as easy as using the tools above.  Maybe I’ll figure it out one of these days and write another blog post on it.

 

Routing in Tableau

For the sake of expediency, I’m not going to go through all of the gory details of installing PostgreSQL, etc.  I’m going to assume that you’re already set up with the required tools and will go straight to the process of grabbing data, converting OSM data to something routable with pgRouting, and then exploring in Tableau.

 

Here are the steps that we’ll walk through:

  1. Export point and road network data from OSM
  2. Process the exported data (.pbf file) and create an .osm file
  3. Import the data into your PostgreSQL database and convert into the format expected for pgRouting queries using osm2pgrouting
  4. Set up custom SQL in Tableau workbook to drive route calculation

 

1. Export data from OSM

The most important part of routing is having a good set of network data to work with.  Your network data must have some topological smarts in order for the pgRouting functions to work (e.g., the line segments all know how they are connected to one another).

 

I’m sure there are many places to obtain good network data, but the one that I use is Open Street Map.  The road network from OSM is already in a good format to use with pgRouting and needs minimal processing.  I also like the OSM HOT (Humanitarian OpenStreetMap Team) Export tool. I’ll walk through the process of exporting data, but if you want to just use the dataset that I worked with for this post, you can grab it from OSM using this link if you want to look at the specifications of the selection – note that you’ll probably still need to make an account with OSM to download it, or you can just take the zip file of the pbf that I’ve attached to this post.  If you opt to download your own data, it’s easy with the HOT Export tool. You can just select your area of interest:

Select from a variety of data file formats for export (I’m using OSM .pbf files, because that is what works with the osm2pgrouting tool):
Pick what OSM data you want – there is a LOT to choose from.  I’m keeping things simple and just grabbing roads and restaurants:

And then create the export. It may take a few minutes to run off the export, depending on the size of the area and number of features you are exporting.  My Seattle export only took a minute or so.

 

2. Process the data

Your exported data should be a .pbf file.  Next step is to convert it to a .osm file (which we will then push directly into PostgreSQL). The easiest way to do this is with the osmconvert tool.  You can download the tool from OpenStreetMap and drop the .exe in the same folder as your .pbf file and run it from the command line.  Easy peasy:

 

osmconvert Seattle_OSM_Data/seattle_export.pbf --drop-author --drop-version --out-osm -o= Seattle_OSM_Data/seattle_export.osm

 

Now we have the .osm file we need to import into a database.  Next up, create the database and drop in our data.

 

3. Import to PostgreSQL for use in pgRouting

 

Assuming you already have PostgreSQL and PostGIS installed, you need to start out with a database that is ready for routing…

 

Make a database (from command line using create database or through the pgAdmin interface):

 

Open a SQL query window and enable some extensions in this new database – this will enable some basic postGIS spatial analytic functionality, the routing functionality in pgRouting, and the hstore extension.

 

 

CREATE EXTENSION postgis;

CREATE EXTENSION pgrouting;

CREATE EXTENSION hstore;

 

The last step is to convert your .osm file into all the little tables we need in PostgreSQL to do routing with pgRouting.  You could probably do this by hand with a billion commands, or you can just use osm2pgrouting.  This will grab the data you want from the .osm file and push it into all the right tables in your database.

 

Note that the osm2pgrouting tool doesn’t import all of your data into the database; it only takes what you tell it to.  To tell it what data to grab, we’ll create an XML file with the configuration that we want for the data.  It just specifies which tags to look for.

 

I patterned my XML file for routing after this one and made some modifications based on the points of interest that I downloaded (restaurants).  My new XML looks like this (mapconfig.xml attached to the post if you want to just download and use that):

Or, if you want to copy/paste to build off of:

<?xml version="1.0" encoding="UTF-8"?>

<configuration>

  <tag_name name="highway" id="1">

    <tag_value name="motorway" id="101" priority="1.0" maxspeed="130" />

    <tag_value name="motorway_link" id="102" priority="1.0" maxspeed="130" />

    <tag_value name="motorway_junction" id="103" priority="1.0" maxspeed="130" />

    <tag_value name="trunk"             id="104" priority="1.05" maxspeed="110" />

    <tag_value name="trunk_link" id="105" priority="1.05" maxspeed="110" />   

    <tag_value name="primary" id="106" priority="1.15" maxspeed="90" />

    <tag_value name="primary_link" id="107" priority="1.15" maxspeed="90" />   

    <tag_value name="secondary" id="108" priority="1.5" maxspeed="90" />

    <tag_value name="secondary_link" id="109" priority="1.5" maxspeed="90"/> 

    <tag_value name="tertiary" id="110" priority="1.75" maxspeed="90" />

    <tag_value name="tertiary_link" id="111" priority="1.75" maxspeed="90" /> 

    <tag_value name="residential" id="112" priority="2.5" maxspeed="50" />

    <tag_value name="living_street" id="113" priority="3" maxspeed="20" />

    <tag_value name="service" id="114" priority="2.5" maxspeed="50" />

    <tag_value name="unclassified" id="117" priority="3" maxspeed="90"/>

    <tag_value name="road"              id="100" priority="5" maxspeed="50" />

  </tag_name>

 

  <tag_name name="amenity" id="2">

    <tag_value name="restaurant" />

    <tag_value name="fast food" />

    <tag_value name="cafe" />

    <tag_value name="pub" />

  </tag_name>

 

</configuration>

 

Once you have your .xml file ready, you just have to run osm2pgrouting from the command line. Note that I am using a direct link to the location for the osm2pgrouting.exe.  In the text below, I also put all of the assorted flags on separate lines below so it was easier to read, but it is really all just one long single command (see the image below).

"C:\Program Files\PostgreSQL\11\bin\osm2pgrouting.exe"

--f Seattle_OSM_Data/seattle_export.osm

--conf Seattle_OSM_Data/mapconfig.xml

--dbname routing

--username USERNAME
--password PASSWORD

--clean

–-addnodes

 

 

It may take a while to load up all of the data into your database (depends on size of the dataset…the Seattle dataset takes less than a minute on my computer, but it’s not a huge dataset) .

 

After you do this, you should end up with a set of tables like this – all of these tables are going to be used by pgRouting to figure out how to traverse your road network and figure out the shortest path:

What are these tables? Here is a quick run down of the main tables…

 

Ways – this is the primary table we’ll use for routing. The table has all of the segments of the road network with attributes that we can use for calculating the ‘cost’ of traversing each segment (e.g., length, speed, whether or not it’s one way, etc.)

Osm_ways – Segments of the road network with simplified attributes

Osm_nodes – a ton of points.  All of our amenities + it looks like all of the nodes at the start/end of each road segment.

 

 

Pointsofinterest – this has all of the ‘amenity’ points of interest that we grabbed from OSM (left image below) as well as a ton of other points that have null attributes (right image below; I have no clue what these are, but I probably accidentally grabbed extra data or they are an artifact of the road network):

To make things easy on myself, and to speed up querying, I decided to streamline the pointsofinterest table to make a new table with just the ‘amenity’ points of interest.  This is a quick SQL query that reduces us from 22,416 rows of data (pointsofinterest) to 2,068.

 

 

create table pointsofinterest_amenity as

select *

from pointsofinterest

where tag_name = ‘amenity’

 

Once you have your data all set up, all we need is a little custom SQL to make it the basics work in Tableau.   While we’re at it, we will also add some extra bling with Parameter Actions and fancy symbolization…

 

4. Setup in Tableau

We’ll start by just looking through the data and then will move to some queries to explore routes in Tableau.

 

Connect to your database in Tableau:

 

All of the data for this has to live in the PostgreSQL database so that we can run custom queries against it.  So, we’ll start by connecting to the database and exploring the data.

 

 

Explore the content:

As noted above, the most important table for routing is ‘Ways.’  This is the table that stores all of the good info like segment geometry, unique vertex ids for the start and end node of each segment, length, speed, whether or not the segment is one-way, etc.

 

If you open up the table in Tableau and add the geometry (the_geom) to a worksheet you’ll see all of the road segments in the dataset.  For context, it helps to use a basemap with the road network visible (add streets).

 

 

Here is a subset of the ways table in our database:

 

 

Each of those line segments has a ton of attributes attached to it:

 

The other super important table is the pointsofinterest_amenity table that we made. These are the amenities between which we will calculate routes… how do I get from one restaurant to another? 

 

Calculate a route

So, how do we use these tables to find the path from one location to another?  We need to write a simple bit of SQL to calculate the route using functionality in pgRouting.  We will use the Dijkstra algorithm (pgr_dijkstra).  This will return the shortest path from a starting vertex to ending vertex.

 

Let’s try this out with some custom SQL…

 

In a new worksheet, let’s add another copy of our database as the data source and write some custom SQL:

Select * from pgr_dijkstra(

   'select gid as id, source, target, length as cost

from ways', 44479, 29980, false)

 

This gives us a result table with an ordered list of edge segments from vertex 1 (44479) to vertex 2 (29980), as well as some basic information about the ‘cost’ of the path (in this case it’s just based on length of the segment).  This is a good start, but it isn’t a path that we can visualize easily in Tableau, because there is no geometry!   We’ll need to add that in.

 

While I guess you could use a join in Tableau to join the edges back into this result table using the node ID, I think it’s easier to just put it in the query to begin with… all we do is going the geometry from our ways table to the result of our routing query using the geographic ID of the road segments.

 

select the_geom as geom_route, gid, source, target, length

from (

    Select * from pgr_dijkstra(

   'select gid as id, source, target, length as cost

from ways', 44479, 29980, false)

) as route, ways

where route.edge = ways.gid

 

With this new query, we now have our geometry as well as some interesting attributes like segment length.

 

 

We can map this easily in Tableau by just dropping the geom_route onto a viz.

 

Proof of concept complete – we can use PostgreSQL, PostGIS, and pgRouting to calculate routes in Tableau…but how do we make it dynamic?  I don’t want to type in the start and end nodes manually each time I need to run the query…and I don’t even want to have to look up the start and end nodes on the road network!

 

Automatically determine the corresponding nodes for locations of interest

 

First thing we can do is add in some subqueries to take the location of a point of interest and return the appropriate node on the road network.

 

To find the appropriate node, I use a quick subquery query like this to select the source ID value from the ways table where the geometry is closest to the selected point of interest.  The query does some calculations of distance between the node with the specified source ID (osm_id) and the roads in the ways table, then selects the 1 road that is closest (the ‘limit 1’ in the query).  

In this query, I’m also using a parameter in Tableau (highlighted below) to drive the selection of the point of interest ID using a parameter drop down so that I don’t have to type that in each time. We’ll set up that parameter next so you can see how it works… I’m just going to do a quick explanation of this part first and then we’ll put it all together into a single query in the next section.

 

--source location

(select source

from ways

order by ways.the_geom <-> (

-- select one node

SELECT the_geom FROM pointsofinterest_amenity WHERE osm_id = <Parameters.Origin Name>

)

limit 1)

 

The query to select the destination location would look exactly the same, except we will change to use a parameter to set the Destination Name.

 

To explain one of the weird parts of the query… what does <-> mean???  The <-> returns the distance between two geometries. By ordering the results and then taking the top result (“limit 1”) we can get the nearest neighbor.  Keep in mind that this method is actually finding the nearest way (road segment) and will then snap the point to the origin of that. This means that sometimes you’ll find that the selected origin node is up to a block away…for instance the starred location below is closest to NE 80th St, and the origin of the segment that it’s closest to is actually at the intersection at the other end of the block. For my purposes this was sufficiently close that I didn’t look into other solutions to ensure that the point snapped to the closest intersection, but you may want to put more time into this if it’s important for your application.

 

Automatically update the start and end node in the query

As I mentioned in the last section, I used a parameter in the query above, so let’s quickly look at setting that up so that our query can be dynamically driven from the viz.

 

In order to select the origin and destination locations using a parameter drop down on the viz, we need to create the parameter and populate it with useful information. Ideally, users will have a list of familiar names to use when selecting the location in the drop down and Tableau will have a set of unique IDs to use to pass that information on to our SQL Query.

 

To allow us to do this, we need to create two parameters: Origin Name and Destination Name.  For each of these, we can populate the list of values with the Name and ID from our points of interest.  To make it easy to create the list of values, you’ll want to just copy/paste the values into the parameter.  There are a lot of ways to do this, but Tableau can help make it a bit easier.

 

The way that I normally do this is to make a viz with just the point locations for the points of interest, then selected them all, look at the data table, copy the value and display name from the data table in Tableau, and paste them into Excel (or a text editor).

 

 

Once I pasted these into Excel, I pulled out the nulls and empty rows, and concatenated the ID to the restaurant name (not needed in all cases, but I did it here because some of the restaurant names are duplicated since they are chains and Tableau parameters need to be unique). Once my data was cleaned up, I copied the two columns and pasted the values into the parameter from the clipboard.  Now we have a parameter that we can use for selecting points of interest and getting the ID value easily.  We can just duplicate this parameter and rename it so that we have one each for selecting the origin and the destination.

 

Now we have a quick way of obtaining the ID we want in the custom SQL query and we can drop in the appropriate parameters in our query.  Here is what that looks like for adding in the Destination name parameter and putting everything into a single query.

 

Now we can put in a single query and obtain the route between any two selected locations:

select the_geom as geom_route, name, osm_id

from (select * from pgr_dijkstra('

select gid as id, source, target, length as cost from ways',

                   --source location

                   (select source

                   from ways

                   order by ways.the_geom <-> (

                        -- select one node using parameter

                        SELECT the_geom FROM pointsofinterest_amenity WHERE osm_id = <Parameters.Origin Name>

)limit 1),

--destination location

(select source

from ways

order by ways.the_geom <-> (

-- select one node using parameter

SELECT the_geom FROM pointsofinterest_amenity WHERE osm_id = <Parameters.Destination Name>

)limit 1), false)) as route, ways

-- join the geometry back into the query result

where route.edge = ways.gid

 

Now we can build up our viz with parameter drop downs to guide the routing:

 

When you first create your viz and run the query you might get an empty table – most likely that is because if you duplicated the Origin Name parameter to create the Destination Name parameter the two are set to the same value; the route from a location to the same location doesn’t exist, because they are in the same place:

 

Just change the value for one of the parameters and you should see a route.

 

Improved symbolization with dual axis mapping

While the routes are cool, they would be even better if we had both the routes AND the points of interest to add additional context.  We’ll need a dual axis map to get both of those on the same map.  A bunch of tips on how to do that can be found in another blog post on Dual Axis Mapping Many Ways.  But, here is the quick “how to” with the data in this post – we just need a full outer join between the results of our custom SQL query and the points of interest table with the join clause set to 0=1 so that we have all rows from both tables:

Now we can build up a dual axis map with both points of interest (the little blue dots) AND the route from the custom sql (the red line):

 

We can get crazier with symbolization as well and highlight the selected origin and destination with some use of the size and color shelves (and a few calculated fields):

 

 

 

 

Improved selection with parameter actions

The parameter drop downs are nice, but it would be even better to just select a location on the map and set that as origin or destination automatically.  We can do that using parameter actions (Tableau 2019.2+).

 

It is easy to set up the parameter action – just add a new action to the worksheet

 

set it as type ‘Change parameter’

 

 

and have it run based on a field that matches the value in the parameter.  In this case it is the OSM ID as a string.  To get a match to the value, I had to change the data type of the OSM ID to a string. With other datasets you may not need to do this…you just need a field that is going to match the parameter name exactly…and you need to make sure the field is on the viz (I dropped my new field on Detail).

 

Both of the parameter actions are set to run from a MENU so that we can click on a point and have a choice of whether it is set as origin or destination:

 

 

Once we pick one of the options, the query will re-run and the path will update on our viz:

 

 

And that is all of the magic to do routing in Tableau!

 

Let me know what you create!