2019

Recently I’ve been thinking about network analyses in Tableau.  Mostly this has been about how we can dynamically generate routes between locations with a bit of help from PostGIS and custom SQL queries.   The problem that I explored in my post on dynamic routing was focused on a one-to-one relationship on the network…simply, how do I go from one location to one other location?  This is interesting, but there are richer questions we can explore if we expand outward and look at the network that radiates from one or more starting points.  This type of road network problem is often calculated as a drive time polygon, or a series of polygons that approximate drive time ‘isochrones,’ or lines of equal time.

 

While there are several ways to tackle this problem, in this post I am going to explore it using PostgreSQL and PostGIS.  There are good and bad sides to all of the analytic options, and for network travel time / distance, ‘your mileage will vary.’  (Yes, I make that joke a lot when talking about spatial analysis problems.) Here is some good reading material that I recommend when considering other methods for tackling your network analysis problems:

 

 

On the general problem of network analysis and drive time calculations

Drivetime calculation is a problem of figuring out the distance from one location to many locations along a network – we start with a single node on our road network then measure out one segment at a time until we reach our maximum distance of interest. By traversing the network we end up with all of the roads and nodes that are within the distance that we set.

While I was figuring out how to do this in PostgreSQL, there were a number of realizations that came to me when I was considering how someone would use these analyses – and, more importantly, how they would understand what was happening and how good / useful the results are.  The main points that I think are worth sharing:

 

  • You need to start with a good road network (and I think it helps to be able to dig into the data to do some quality control / assessment).  I used OpenStreetMap (OSM) data.  It seems fine.  I like that I have full access to peruse the attributes in the data and to sanity check the attributes that I use in calculating the ‘cost’ to travel segments on the network.  The OSM data includes several attributes that can be used as a cost, including distance and the number of seconds to travel any given segment (based on speed limit and length). It’s great that OSM includes these attributes, because they are a good place to start.  But…you might need other metrics tailored to your specific need or based on your local knowledge.  For instance, speed limit may not mean much if there tends to be a lot of traffic. Or, you may need to adjust for mode of transportation and individual (e.g., walking speed varies substantially between individuals).  The quality of any ‘cost’ calculation on a road network depends on the quality of the metric you use for each segment.

 

  • The quality of the analysis is based on a combination of the road network and the mysteries of the algorithm used to calculate ‘cost’ on the network.  The methods behind the PostGIS and pgRouting driving distance calculation are fairly well explained.  I like that. But…the standard implementation of the pgRouting methods on the OSM data still felt a bit unrealistic and overly generous with respect to traveling the network in Seattle.  Maybe my mental map of travel times around town is skewed because I tend to walk, bike, and bus more than I drive, but the results seemed pretty generous with respect to how far I could travel within the max time that I set. My guess is that is because of the way that the costs are calculated for road segments.  Seattle has a lot of traffic.  And construction.  And traffic. And lots of intersections that feel like they have poorly timed lights or are in desperate need of a left turn arrow.  But I couldn’t figure out a way to incorporate the intersection challenges into the driving distance calculation – so the impact of that one spot where I sit for 5 minutes waiting to turn left?  Not going to be included when I look at ‘what is within 10 minutes’ of a location.

 

On the other hand, there are online services that you can use for drive time network calculations that may be more realistic with respect to these additional driving impediments, but they’re black box services.  It’s hard to know much about what is really going on under the hood.  They may be better.  Or worse (though in a different way).  Or it may just be something you can chalk up to ‘it’s just different.’  They may under-estimate drive distance, while the pgRouting-based calculations may over-estimate it.  This is one of those cases where ‘your mileage may vary’ – literally.

 

For an example, this image shows the 5 and 10 minute drive time isochrones calculated with the Mapbox isochrone API, and the 10 minute drive time calculated using pgr_drivingDistance (with cost based on seconds to traverse a segment) and a concave hull around the result.  Big difference.  Can I absolutely say that one is the right answer and the other is wrong?  No.  They are just different ways of approximating the same thing.  Depending on the quality of your input data and the algorithm used to do the calculation, you will get a different result.  To me it’s a question of how much you need to understand what is happening under the hood, and how much time you’re willing to spend curating the input data, attributes, and the model traversing the network.

 

 

How to calculate drive time road networks and polygons for use in Tableau using PostgreSQL

 

While there are different ways of calculating these drive time polygons, as I mentioned earlier, I’m just going to explore the PostgreSQL-based solution, since Nick Hara has already nicely documented the TabPy + Mapbox API solution.

 

Now, on with the calculating… Just like in my last post on network analysis, this work will be driven off of some key ingredients:

  • PostgreSQL – I installed the latest version from EnterpriseDB
  • PostGIS – I installed from OSGEO (this will get you PostGIS, osmconvert, and osm2pgrouting)

 

And some topological network data.  I use OpenStreetMap data and recommend:

  • OSM HOT Export Tool – you’ll need to create an account, but then it’s super easy to set your area of interest and grab all sorts of data
  • osmconvert – use the Windows 32bit version (this will run on Win 64 as well).
  • osm2pgrouting – if you install PostGIS using the OSGEO bundle you should already have this ready to go
  • I also grabbed a point dataset (hospitals) from the Seattle open data portal

 

The basics - drivetime calculations


 

With pgRouting, this is a simple process of querying the data with pgr_drivingDistance.  This function can be used to identify all of the nodes or line segments in the road network that have costs less than, or equal to, the maximum ‘cost’ distance used in the query.

 

For instance, if we want to know all of the road segments that are within 3 units (miles, meters, minutes, whatever you are using as cost) from an origin point, the result would look like this on a super-simple theoretical road network:
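To make the traversal concrete, here is a minimal Python sketch of the idea behind pgr_drivingDistance – a Dijkstra-style search that stops expanding once the accumulated cost exceeds the maximum.  The network here is made up for illustration; it is not the pgRouting implementation itself.

```python
import heapq

def driving_distance(edges, origin, max_cost):
    """Return {node: cost} for every node reachable from `origin`
    within `max_cost` -- the same idea pgr_drivingDistance implements."""
    # Build an undirected adjacency list from (source, target, cost) rows.
    graph = {}
    for source, target, cost in edges:
        graph.setdefault(source, []).append((target, cost))
        graph.setdefault(target, []).append((source, cost))

    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost <= max_cost and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return best

# A made-up network: each tuple is (source node, target node, cost).
edges = [(1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1), (1, 6, 2), (6, 7, 2)]
print(driving_distance(edges, origin=1, max_cost=3))
```

With a budget of 3 units, node 5 (total cost 4) and node 7 (total cost 4) fall outside the result, just like segments beyond the max cost fall out of the pgRouting result.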

 

While the calculation itself is simple, there are a few limitations to consider that make the problem more complex.  Some of the limitations that I’ve been thinking about lately include:

  • Is the measure you are using for ‘cost’ really reflective of cost for that segment?  For instance, if you have a travel time in seconds that is based on speed limit and segment length, is that a good assessment of travel time?  What if it’s common to have slow-downs based on traffic, or construction, or other common driving hindrances (e.g., the vehicle trying to turn left at an intersection without a green turn arrow…and they block all of the traffic behind them until a car in the oncoming traffic is nice enough to let them turn.  Sigh.)
  • Can you include ‘cost’ at intersections as well as at the road segments (e.g., average wait time for traffic lights, right turn vs. left turn cost, etc.)
  • Does the analysis need to reflect one-way roads (e.g., you can’t drive the wrong way, but you can walk the ‘wrong’ way on the segment…)
  • How do you convert the resulting road segments or nodes into polygonal ‘drive-sheds?’
  • And probably all sorts of other potential issues…

 

Before we get too bogged down in limitations, let’s just build some data and see what we can do with it.

 

To run a query on a road network with pgr_drivingDistance, we need to specify a few key bits of information:

  • the network details – our line segments (ways)
  • the origin node on the network – the location from which we are measuring
  • what is the ‘cost’ for traversing each segment – this can be anything you want…distance, time, etc.  The example below was super simple and used the same cost for each segment, but any numeric value should be usable to determine cost.

 

Doing the calculation to visualize it in Tableau is just a simple bit of custom SQL – note that I’m using the same database that I set up in my last post on routing, so if you want to play along the instructions on setup are over in that post.

 

First we create our data source in Tableau using custom SQL, with a query like this:

select * from pgr_drivingDistance(
        'select gid as id, source, target, cost_s as cost
         from ways',
        -- origin node
        (select source
         from ways
         order by ways.the_geom <-> (
              -- select one node using parameter
              SELECT wkb_geometry
              FROM hospitals_4326
              WHERE ogc_fid = <Parameters.Select Hospital>
         ) limit 1),
        -- distance (in seconds)
        <Parameters.Time (seconds)>,
        -- undirected (can travel in both directions on a road segment)
        false) as route, ways_vertices_pgr as vertices
where route.node = vertices.id

 

This query peruses the street network to find all of the segments within a set distance of the origin location (the distance defined in whatever units are appropriate…here we’re using a number of seconds; the origin defined with a subquery that identifies the origin node based on the ID for a specific point of interest).  The result is then joined back to the table that has the geometry for the nodes in the road network.  That last join (where route.node = vertices.id) gives us a geography column that we can visualize in Tableau.  Without it, the query would just return a list of IDs with no geometry that we could map in Tableau.

 

In this case I’m using ‘cost_s’ as the cost – this is just the cost in seconds to traverse the segment (based on segment length and speed limit), but the cost can be any value associated with a segment.  For instance, OSM data also comes with a ‘length_m’ field if you just want distance, or you can calculate out your own cost using other attributes of interest (e.g., average walking speed, etc.).
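To make the cost attribute concrete, here is a small Python sketch of how a seconds-style cost relates to length and speed limit.  The exact formula osm2pgrouting uses may differ; this divide-length-by-speed version is my own simplification for illustration, as is the walking-speed alternative.

```python
def cost_seconds(length_m, maxspeed_kmh):
    """Seconds to traverse a segment at its speed limit.
    km/h converts to m/s by dividing by 3.6."""
    return length_m / (maxspeed_kmh / 3.6)

def cost_walking(length_m, walking_speed_ms=1.4):
    """Alternative cost: walking time at an assumed average of ~1.4 m/s."""
    return length_m / walking_speed_ms

# A 500 m residential segment with a 50 km/h limit:
print(round(cost_seconds(500, 50), 1))   # driving estimate, in seconds
print(round(cost_walking(500), 1))       # walking estimate, in seconds
```

Swapping the cost column in the query is all it takes to switch from one of these models to the other – the routing algorithm itself doesn’t change.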

 

And gives us a result that we can map in Tableau like this:

 

 

Or, instead of returning the nodes on the road network, we could join in our road segments (ways) and see the results as lines instead.

select * from pgr_drivingDistance(
        'select gid as id, source, target, cost_s as cost
         from ways',
        -- origin node
        (select source
         from ways
         order by ways.the_geom <-> (
              -- select one node using parameter
              SELECT wkb_geometry
              FROM hospitals_4326
              WHERE ogc_fid = <Parameters.Select Hospital>
         ) limit 1),
        -- distance (in seconds)
        <Parameters.Time (seconds)>,
        -- undirected (can travel in both directions on a road segment)
        false) as route, ways
where route.edge = ways.gid

 

Which gives us a result with road segments like this:

 

Creating polygon ‘drive-sheds’

 

While the results from these queries are pretty, they don’t necessarily help with some key analytical questions that we might want to answer, like: how many customers (or insured properties, or hospitals, or delivery trucks, or competitors, etc.) are within 5 minutes? 10 minutes?   To answer this type of question, we need to convert the results of our network analysis into polygons and use Tableau’s built-in spatial intersection join type.  The way that I’ve been doing this is to use a convex or concave hull around the lines or points returned from the query.

 

A convex hull is the smallest convex polygon encompassing all points in the dataset.

 

A concave hull is similar to a convex hull, but is more of the ‘shrink wrapped’ version that fits tighter to the dataset. You can specify a target percent for the area of the hull that PostGIS will try to fit.
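For intuition about what the convex hull calculation does, here is a small Python implementation of a standard hull algorithm (Andrew’s monotone chain) run over some made-up node coordinates – PostGIS has its own implementation, of course; this is just the concept.

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull: the smallest convex
    polygon containing all of the input points."""
    points = sorted(set(points))
    if len(points) <= 2:
        return points

    def cross(o, a, b):
        # Cross product of vectors o->a and o->b (turn direction).
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in points:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(points):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # hull vertices, counter-clockwise

# Nodes returned by a driving-distance query (made-up coordinates):
nodes = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
print(convex_hull(nodes))
```

Note how the interior points drop out – only the corners of the surrounding polygon survive, which is exactly why a convex hull can over-state the drive-shed area.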

 

If a quick result is all you need for exploration, you can just create a single hull around the result.  You can either do this as a RAWSQLAGG_SPATIAL calculated field, or as part of the original query.  Here is an example of a concave hull as a calculated field:

 

Or you can put it in the original custom SQL.  The advantage of having it as part of the custom SQL is that you can then easily use the resulting polygon as part of a spatial intersection.  This lets you identify all of the points that fall within the driving area with a quick spatial join.   The new part of the query is the outer select that wraps the hull around the driving distance result:

 

select st_convexhull(st_collect(the_geom))::geography(polygon) as isochrone from (
    select * from pgr_drivingDistance(
            'select gid as id, source, target, cost_s as cost
             from ways',
            -- origin node
            (select source
             from ways
             order by ways.the_geom <-> (
                  -- select one node using parameter
                  SELECT wkb_geometry
                  FROM hospitals_4326
                  WHERE ogc_fid = <Parameters.Select Hospital>
             ) limit 1),
            -- distance (in seconds)
            <Parameters.Time (seconds)>,
            -- undirected (can travel in both directions on a road segment)
            false)) as route, ways_vertices_pgr as vertices
where route.node = vertices.id

 

Now that we have a polygon for the hull around the points / line segments, we can use a spatial intersection to answer some questions about our data – like:

  • How many {customers, patients, delivery vans, rock quarries, etc.} are within x distance from a location?
  • What areas aren’t being served by a set of facilities? For instance, what areas are inside and outside a five minute drive from each fire station?

 

To answer these types of questions, you can just use the polygon in a spatial join and sum up the points inside or outside the polygon:

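The core test behind that spatial join can be sketched in Python – a standard ray-casting point-in-polygon check over a hypothetical drive-shed (the coordinates here are invented; Tableau and PostGIS do this for you in practice):

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: does `point` fall inside `polygon`?
    Counts how many polygon edges a horizontal ray crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical drive-shed polygon and some customer locations:
drive_shed = [(0, 0), (4, 0), (4, 4), (0, 4)]
customers = [(1, 1), (2, 3), (5, 5), (-1, 2)]
inside = sum(point_in_polygon(c, drive_shed) for c in customers)
print(f"{inside} of {len(customers)} customers are within the drive-shed")
```

The spatial join is doing this intersection test for every point against the hull polygon, and the aggregation is just the sum of the hits.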

 

All in all, this method of calculating ‘drive time’ areas using the pgRouting driving distance calculation has some utility as an estimate of the area covered within a set distance, time, etc. of an origin location.  If we add in parameter actions, we can build up a workbook that lets us click to select the origin location, with analyses that update:

isochrone_animation.gif

 

This gives us a quick result, but it will have some limitations in terms of accuracy due to the convex or concave hull.  For instance, it tends to over-estimate the area within the ‘drive-shed’ in exchange for drawing the polygon quickly.  This is especially true with irregular road networks where there are gaps in the network (like lakes):

 

There is potentially a fix for this sort of problem, but it wasn’t quickly obvious from my reading through various pgRouting and PostGIS files.  My guess is that whatever the solution is, it won’t be fast to do it with PostgreSQL….and that it would be much faster to do something cleaner with some other set of spatial tools (e.g., a nicely put together spatial library for Python, C++, etc.).

 

And, of course, there are the Mapbox APIs which you can use, though, even with Nick Hara’s blog post on the topic, I haven’t figured out the way to have both interactivity in generation of the drive time isochrones and the ability to use them in quick spatial joins in Tableau.  There is room for more thinking on this topic, and I’m going to keep running it around in the back of my head as time permits.

The other day someone asked me about whether Tableau could generate routes between locations as part of the spatial analysis functionality.  As a built-in function, no…but I’m always up for a good challenge, so I built up a demo for dynamic route calculations using a PostgreSQL database with PostGIS and pgRouting.  For fun, I’ve also tied it to some parameter actions so that it’s easy to click on a point to update the origin or destination of the route.

Sound like fun?  Want to try it yourself?  Here’s some detail on how it all works…

seattle_routing_animation.gif

 

Summary of the process:

  • Set up a PostgreSQL database with the PostGIS and pgRouting extensions enabled
  • Add in a topological roads dataset (I use data from OSM, it’s pretty much ready to go for routing)
  • Create a Tableau workbook with some custom SQL to drive the route generation using the Dijkstra algorithm implemented in pgRouting
  • Bling up your workbook with parameters (and parameter actions) and fancy symbolization

 

Ingredients:

While we’ll visualize our calculated routes in Tableau, the analytics behind the scenes will be driven by a PostgreSQL database with PostGIS and pgRouting.   To simplify this post, I’m going to assume that you are set up with a few tools:

  • PostgreSQL – I installed the latest version from EnterpriseDB
  • PostGIS – I installed from OSGEO (this will get you PostGIS, osmconvert, and osm2pgrouting)

 

You’ll also need some network data to work with.  All of the data that I use in this demo are from OpenStreetMap (road network and points of interest).  If you want to use OSM data, I recommend:

  • OSM HOT Export Tool – you’ll need to create an account, but then it’s super easy to set your area of interest and grab all sorts of data
  • osmconvert – use the Windows 32bit version (this will run on Win 64 as well).
  • osm2pgrouting – if you install PostGIS using the OSGEO bundle you should already have this ready to go

 

If you want to make your own network from non-OSM data, it’s totally do-able, just not as easy as using the tools above.  Maybe I’ll figure it out one of these days and write another blog post on it.

 

Routing in Tableau

For the sake of expediency, I’m not going to go through all of the gory details of installing PostgreSQL, etc.  I’m going to assume that you’re already set up with the required tools and will go straight to the process of grabbing data, converting OSM data to something routable with pgRouting, and then exploring in Tableau.

 

Here are the steps that we’ll walk through:

  1. Export point and road network data from OSM
  2. Process the exported data (.pbf file) and create an .osm file
  3. Import the data into your PostgreSQL database and convert into the format expected for pgRouting queries using osm2pgrouting
  4. Set up custom SQL in Tableau workbook to drive route calculation

 

1. Export data from OSM

The most important part of routing is having a good set of network data to work with.  Your network data must have some topological smarts in order for the pgRouting functions to work (e.g., the line segments all know how they are connected to one another).
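To make those “topological smarts” concrete, here is a small Python sketch of the kind of bookkeeping involved – assigning shared node IDs to segment endpoints so that segments know which neighbors they connect to.  This is a simplification for illustration, not what osm2pgrouting actually does internally.

```python
def build_topology(segments):
    """Assign a shared node id to every distinct endpoint coordinate,
    turning bare line segments into a routable (source, target) edge
    list -- conceptually like the source/target columns of the ways table."""
    node_ids = {}
    edges = []
    for start, end in segments:
        for point in (start, end):
            if point not in node_ids:
                node_ids[point] = len(node_ids) + 1
        edges.append((node_ids[start], node_ids[end]))
    return node_ids, edges

# Three made-up segments; consecutive segments share endpoints.
segments = [((0, 0), (1, 0)), ((1, 0), (1, 1)), ((1, 1), (0, 1))]
node_ids, edges = build_topology(segments)
print(edges)
```

Because segments that touch get the same node id, a routing algorithm can hop from one segment to the next – geometry alone (lines that merely overlap on a map) isn’t enough.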

 

I’m sure there are many places to obtain good network data, but the one that I use is OpenStreetMap.  The road network from OSM is already in a good format to use with pgRouting and needs minimal processing.  I also like the OSM HOT (Humanitarian OpenStreetMap Team) Export tool. I’ll walk through the process of exporting data, but if you want to use the same dataset I worked with for this post, you can grab it from OSM using this link to see the specifications of the selection – note that you’ll probably still need to make an account with OSM to download it, or you can just take the zip file of the .pbf that I’ve attached to this post.  If you opt to download your own data, it’s easy with the HOT Export tool. You just select your area of interest:

Select from a variety of data file formats for export (I’m using OSM .pbf files, because that is what works with the osm2pgrouting tool):
Pick what OSM data you want – there is a LOT to choose from.  I’m keeping things simple and just grabbing roads and restaurants:

And then create the export. It may take a few minutes to run the export, depending on the size of the area and the number of features you are exporting.  My Seattle export only took a minute or so.

 

2. Process the data

Your exported data should be a .pbf file.  The next step is to convert it to a .osm file (which we will then push directly into PostgreSQL). The easiest way to do this is with the osmconvert tool.  You can download the tool from OpenStreetMap, drop the .exe in the same folder as your .pbf file, and run it from the command line.  Easy peasy:

 

osmconvert Seattle_OSM_Data/seattle_export.pbf --drop-author --drop-version --out-osm -o=Seattle_OSM_Data/seattle_export.osm

 

Now we have the .osm file we need to import into a database.  Next up, create the database and drop in our data.

 

3. Import to PostgreSQL for use in pgRouting

 

Assuming you already have PostgreSQL and PostGIS installed, you need to start out with a database that is ready for routing…

 

Make a database (from command line using create database or through the pgAdmin interface):

 

Open a SQL query window and enable some extensions in this new database – this will enable some basic postGIS spatial analytic functionality, the routing functionality in pgRouting, and the hstore extension.

 

 

CREATE EXTENSION postgis;
CREATE EXTENSION pgrouting;
CREATE EXTENSION hstore;

 

The last step is to convert your .osm file into all the little tables we need in PostgreSQL to do routing with pgRouting.  You could probably do this by hand with a billion commands, or you can just use osm2pgrouting.  This will grab the data you want from the .osm file and push it into all the right tables in your database.

 

Note that the osm2pgrouting tool doesn’t import all of your data into the database; it only takes what you tell it to.  To tell it what data to grab, we’ll create an XML file with the configuration that we want for the data.  It just specifies which tags to look for.

 

I patterned my XML file for routing after this one and made some modifications based on the points of interest that I downloaded (restaurants).  My new XML looks like this (mapconfig.xml attached to the post if you want to just download and use that):

Or, if you want to copy/paste to build off of:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <tag_name name="highway" id="1">
    <tag_value name="motorway"          id="101" priority="1.0"  maxspeed="130" />
    <tag_value name="motorway_link"     id="102" priority="1.0"  maxspeed="130" />
    <tag_value name="motorway_junction" id="103" priority="1.0"  maxspeed="130" />
    <tag_value name="trunk"             id="104" priority="1.05" maxspeed="110" />
    <tag_value name="trunk_link"        id="105" priority="1.05" maxspeed="110" />
    <tag_value name="primary"           id="106" priority="1.15" maxspeed="90" />
    <tag_value name="primary_link"      id="107" priority="1.15" maxspeed="90" />
    <tag_value name="secondary"         id="108" priority="1.5"  maxspeed="90" />
    <tag_value name="secondary_link"    id="109" priority="1.5"  maxspeed="90" />
    <tag_value name="tertiary"          id="110" priority="1.75" maxspeed="90" />
    <tag_value name="tertiary_link"     id="111" priority="1.75" maxspeed="90" />
    <tag_value name="residential"       id="112" priority="2.5"  maxspeed="50" />
    <tag_value name="living_street"     id="113" priority="3"    maxspeed="20" />
    <tag_value name="service"           id="114" priority="2.5"  maxspeed="50" />
    <tag_value name="unclassified"      id="117" priority="3"    maxspeed="90" />
    <tag_value name="road"              id="100" priority="5"    maxspeed="50" />
  </tag_name>

  <tag_name name="amenity" id="2">
    <tag_value name="restaurant" />
    <tag_value name="fast_food" />
    <tag_value name="cafe" />
    <tag_value name="pub" />
  </tag_name>
</configuration>

 

Once you have your .xml file ready, you just have to run osm2pgrouting from the command line. Note that I am using a direct link to the location for the osm2pgrouting.exe.  In the text below, I also put all of the assorted flags on separate lines below so it was easier to read, but it is really all just one long single command (see the image below).

"C:\Program Files\PostgreSQL\11\bin\osm2pgrouting.exe"

--f Seattle_OSM_Data/seattle_export.osm

--conf Seattle_OSM_Data/mapconfig.xml

--dbname routing

--username USERNAME
--password PASSWORD

--clean

–-addnodes

 

 

It may take a while to load all of the data into your database (it depends on the size of the dataset…the Seattle dataset takes less than a minute on my computer, but it’s not a huge dataset).

 

After you do this, you should end up with a set of tables like this – all of these tables are going to be used by pgRouting to figure out how to traverse your road network and figure out the shortest path:

What are these tables? Here is a quick run down of the main tables…

 

Ways – this is the primary table we’ll use for routing. The table has all of the segments of the road network with attributes that we can use for calculating the ‘cost’ of traversing each segment (e.g., length, speed, whether or not it’s one way, etc.)

Osm_ways – Segments of the road network with simplified attributes

Osm_nodes – a ton of points.  All of our amenities + it looks like all of the nodes at the start/end of each road segment.

 

 

Pointsofinterest – this has all of the ‘amenity’ points of interest that we grabbed from OSM (left image below) as well as a ton of other points that have null attributes (right image below; I have no clue what these are, but I probably accidentally grabbed extra data or they are an artifact of the road network):

To make things easy on myself, and to speed up querying, I decided to streamline the pointsofinterest table to make a new table with just the ‘amenity’ points of interest.  This is a quick SQL query that reduces us from 22,416 rows of data (pointsofinterest) to 2,068.

 

 

create table pointsofinterest_amenity as
select *
from pointsofinterest
where tag_name = 'amenity'

 

Once you have your data all set up, all we need is a little custom SQL to make the basics work in Tableau.   While we’re at it, we will also add some extra bling with parameter actions and fancy symbolization…

 

4. Setup in Tableau

We’ll start by just looking through the data and then will move to some queries to explore routes in Tableau.

 

Connect to your database in Tableau:

 

All of the data for this has to live in the PostgreSQL database so that we can run custom queries against it.  So, we’ll start by connecting to the database and exploring the data.

 

 

Explore the content:

As noted above, the most important table for routing is ‘Ways.’  This is the table that stores all of the good info like segment geometry, unique vertex ids for the start and end node of each segment, length, speed, whether or not the segment is one-way, etc.

 

If you open up the table in Tableau and add the geometry (the_geom) to a worksheet you’ll see all of the road segments in the dataset.  For context, it helps to use a basemap with the road network visible (add streets).

 

 

Here is a subset of the ways table in our database:

 

 

Each of those line segments has a ton of attributes attached to it:

 

The other super important table is the pointsofinterest_amenity table that we made. These are the amenities between which we will calculate routes… how do I get from one restaurant to another? 

 

Calculate a route

So, how do we use these tables to find the path from one location to another?  We need to write a simple bit of SQL to calculate the route using functionality in pgRouting.  We will use the Dijkstra algorithm (pgr_dijkstra).  This will return the shortest path from a starting vertex to ending vertex.

 

Let’s try this out with some custom SQL…

 

In a new worksheet, let’s add another copy of our database as the data source and write some custom SQL:

select * from pgr_dijkstra(
    'select gid as id, source, target, length as cost
     from ways', 44479, 29980, false)

 

This gives us a result table with an ordered list of edge segments from vertex 1 (44479) to vertex 2 (29980), as well as some basic information about the ‘cost’ of the path (in this case it’s just based on length of the segment).  This is a good start, but it isn’t a path that we can visualize easily in Tableau, because there is no geometry!   We’ll need to add that in.
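For intuition about what pgr_dijkstra is doing under the hood, here is a toy Python version that returns the ordered list of edge IDs for the shortest path.  The edge list is made up for illustration; this is a sketch of the algorithm, not the pgRouting implementation.

```python
import heapq

def shortest_path(edges, start, end):
    """Dijkstra shortest path over undirected (edge_id, source, target, cost)
    rows, returning the ordered edge ids from `start` to `end` -- the shape
    of a pgr_dijkstra result."""
    graph = {}
    for edge_id, source, target, cost in edges:
        graph.setdefault(source, []).append((target, cost, edge_id))
        graph.setdefault(target, []).append((source, cost, edge_id))

    best = {start: (0, None, None)}  # node -> (cost, previous node, edge used)
    queue = [(0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best[node][0]:
            continue  # stale queue entry
        for neighbor, edge_cost, edge_id in graph.get(node, []):
            new_cost = cost + edge_cost
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, node, edge_id)
                heapq.heappush(queue, (new_cost, neighbor))

    # Walk back from `end` to recover the edge sequence.
    path, node = [], end
    while best[node][1] is not None:
        path.append(best[node][2])
        node = best[node][1]
    return list(reversed(path))

# (edge_id, source, target, cost) -- a tiny made-up ways table.
edges = [(10, 1, 2, 1.0), (11, 2, 3, 1.0), (12, 1, 3, 5.0), (13, 3, 4, 1.0)]
print(shortest_path(edges, 1, 4))
```

The two-hop route through node 2 (total cost 2.0) beats the direct but expensive edge 12 (cost 5.0), so the result is the ordered edge list through the cheaper path.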

 

While I suppose you could use a join in Tableau to join the edges back into this result table using the edge ID, I think it’s easier to just put it in the query to begin with… all we do is join the geometry from our ways table to the result of our routing query using the ID (gid) of the road segments.

 

select the_geom as geom_route, gid, source, target, length
from (
    select * from pgr_dijkstra(
        'select gid as id, source, target, length as cost
         from ways', 44479, 29980, false)
) as route, ways
where route.edge = ways.gid

 

With this new query, we now have our geometry as well as some interesting attributes like segment length.

 

 

We can map this easily in Tableau by just dropping the geom_route onto a viz.

 

Proof of concept complete – we can use PostgreSQL, PostGIS, and pgRouting to calculate routes in Tableau…but how do we make it dynamic?  I don’t want to type in the start and end nodes manually each time I need to run the query…and I don’t even want to have to look up the start and end nodes on the road network!

 

Automatically determine the corresponding nodes for locations of interest

 

First thing we can do is add in some subqueries to take the location of a point of interest and return the appropriate node on the road network.

 

To find the appropriate node, I use a quick subquery like this to select the source ID value from the ways table where the geometry is closest to the selected point of interest.  The query calculates the distance between the point of interest with the specified ID (osm_id) and the roads in the ways table, then selects the one road that is closest (the ‘limit 1’ in the query).

In this query, I’m also using a parameter in Tableau (highlighted below) to drive the selection of the point of interest ID using a parameter drop down so that I don’t have to type that in each time. We’ll set up that parameter next so you can see how it works… I’m just going to do a quick explanation of this part first and then we’ll put it all together into a single query in the next section.

 

--source location
(select source
 from ways
 order by ways.the_geom <-> (
     -- select one node
     SELECT the_geom FROM pointsofinterest_amenity WHERE osm_id = <Parameters.Origin Name>
 )
 limit 1)

 

The query to select the destination location would look exactly the same, except we will change to use a parameter to set the Destination Name.

 

To explain one of the weird parts of the query… what does <-> mean???  The <-> operator returns the distance between two geometries. By ordering the results by that distance and then taking the top result (“limit 1”) we get the nearest neighbor.  Keep in mind that this method actually finds the nearest way (road segment) and then snaps the point to the origin (source node) of that segment. This means that sometimes the selected origin node will be up to a block away…for instance, the starred location below is closest to NE 80th St, but the source node of the segment that it’s closest to is at the intersection at the other end of the block. For my purposes this was sufficiently close that I didn’t look into other solutions to ensure that the point snapped to the closest intersection, but you may want to put more time into this if it’s important for your application.
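If that block-long offset matters for your use case, one way to tighten it up (a sketch of my own, not something covered above) is to look at both endpoints of the nearest ways and pick whichever node – source or target – is actually closer to the point of interest:

```sql
-- Pick the closer of the two endpoints (source vs. target) of the nearest way
-- (a hedged sketch; assumes ways.the_geom is a linestring running from the
--  source node to the target node, as in a typical osm2pgrouting import)
select case
        when ST_StartPoint(w.the_geom) <-> p.the_geom
           < ST_EndPoint(w.the_geom) <-> p.the_geom
        then w.source
        else w.target
       end as node
from ways w,
     (SELECT the_geom FROM pointsofinterest_amenity
      WHERE osm_id = <Parameters.Origin Name>) p
order by w.the_geom <-> p.the_geom
limit 1
```

This still only considers the single nearest segment, but it at least snaps to the nearer end of it.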

 

Automatically update the start and end node in the query

As I mentioned in the last section, I used a parameter in the query above, so let’s quickly look at setting that up so that our query can be dynamically driven from the viz.

 

In order to select the origin and destination locations using a parameter drop down on the viz, we need to create the parameter and populate it with useful information. Ideally, users will have a list of familiar names to use when selecting the location in the drop down and Tableau will have a set of unique IDs to use to pass that information on to our SQL Query.

 

To allow us to do this, we need to create two parameters: Origin Name and Destination Name.  For each of these, we can populate the list of values with the Name and ID from our points of interest.  To make it easy to create the list of values, you’ll want to just copy/paste the values into the parameter.  There are a lot of ways to do this, but Tableau can help make it a bit easier.

 

The way that I normally do this is to make a viz with just the point locations for the points of interest, select them all, view the data table, copy the ID value and display name from the data table in Tableau, and paste them into Excel (or a text editor).

 

 

Once I pasted these into Excel, I pulled out the nulls and empty rows, and concatenated the ID to the restaurant name (not needed in all cases, but I did it here because some of the restaurant names are duplicated – they are chains – and Tableau parameter values need to be unique). Once my data was cleaned up, I copied the two columns and pasted the values into the parameter from the clipboard.  Now we have a parameter that we can use for selecting points of interest and getting the ID value easily.  We can just duplicate this parameter and rename it so that we have one each for selecting the origin and the destination.
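If you’d rather skip the copy/paste round trip through Excel, a quick query against the points of interest table can produce the same two columns (a sketch; the null filtering and the name-plus-ID concatenation mirror the cleanup steps described above):

```sql
-- Build unique display names and IDs for pasting into the parameter
-- (a sketch; assumes osm_id is stored as, or can be concatenated as, text)
select osm_id,
       name || ' (' || osm_id || ')' as display_name
from pointsofinterest_amenity
where name is not null
order by display_name
```

Run this in your SQL client, then copy the result set straight into the parameter’s list of values.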

 

Now we have a quick way of obtaining the ID we want in the custom SQL query and we can drop in the appropriate parameters in our query.  Here is what that looks like for adding in the Destination name parameter and putting everything into a single query.

 

Now we can put in a single query and obtain the route between any two selected locations:

select the_geom as geom_route, name, osm_id
from (select * from pgr_dijkstra('
    select gid as id, source, target, length as cost from ways',
    --source location
    (select source
    from ways
    order by ways.the_geom <-> (
        -- select one node using parameter
        SELECT the_geom FROM pointsofinterest_amenity WHERE osm_id = <Parameters.Origin Name>
    )
    limit 1),
    --destination location
    (select source
    from ways
    order by ways.the_geom <-> (
        -- select one node using parameter
        SELECT the_geom FROM pointsofinterest_amenity WHERE osm_id = <Parameters.Destination Name>
    )
    limit 1),
    false)) as route, ways
-- join the geometry back into the query result
where route.edge = ways.gid

 

Now we can build up our viz with parameter drop downs to guide the routing:

 

When you first create your viz and run the query you might get an empty table.  Most likely this is because, if you duplicated the Origin Name parameter to create the Destination Name parameter, the two are set to the same value – and there is no route from a location to itself:

 

Just change the value for one of the parameters and you should see a route.

 

Improved symbolization with dual axis mapping

While the routes are cool, they would be even better if we had both the routes AND the points of interest to add additional context.  We’ll need a dual axis map to get both of those on the same map.  A bunch of tips on how to do that can be found in another blog post on Dual Axis Mapping Many Ways.  But, here is the quick “how to” with the data in this post – we just need a full outer join between the results of our custom SQL query and the points of interest table with the join clause set to 0=1 so that we have all rows from both tables:
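In Tableau’s join dialog this is a full outer join with the join clause set to 0=1; expressed as custom SQL, the equivalent looks roughly like this (a sketch, with a simple select from ways standing in for the full routing query above):

```sql
-- Full outer join on 0 = 1: no rows ever match, so every row from both
-- sides comes through, with nulls filled in for the other side's columns
select route.*, poi.*
from (
    select the_geom as geom_route, gid
    from ways  -- stand-in for the full routing query above
) as route
full outer join pointsofinterest_amenity as poi
    on 0 = 1
```

This stacks the two row sets in one result, which is exactly what the dual axis map needs.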

Now we can build up a dual axis map with both points of interest (the little blue dots) AND the route from the custom sql (the red line):

 

We can get crazier with symbolization as well and highlight the selected origin and destination with some use of the size and color shelves (and a few calculated fields):

 

 

 

 

Improved selection with parameter actions

The parameter drop downs are nice, but it would be even better to just select a location on the map and set that as origin or destination automatically.  We can do that using parameter actions (Tableau 2019.2+).

 

It is easy to set up the parameter action – just add a new action to the worksheet

 

set it as type ‘Change parameter’

 

 

and have it run based on a field whose values match the values in the parameter.  In this case that field is the OSM ID as a string – to get a match, I had to change the data type of the OSM ID to a string. With other datasets you may not need to do this…you just need a field that matches the parameter’s data type and values exactly…and you need to make sure the field is on the viz (I dropped my new field on Detail).

 

Both of the parameter actions are set to run from a MENU so that we can click on a point and have a choice of whether it is set as origin or destination:

 

 

Once we pick one of the options, the query will re-run and the path will update on our viz:

 

 

And that is all of the magic to do routing in Tableau!

 

Let me know what you create!