
Sarah Battersby's Blog

May 2017

In Part II: Data manipulation, we walked through an example to convert a spatial boundary file for Canada so that it was in the Statistics Canada Lambert coordinate reference system. In Part III, we are going to use the same process to project a US dataset, but we’re going to add the special bonus of moving polygons to new locations!  For use in Tableau, this comes in handy when you have spatial data that is distributed across a large space, you want to have all of your geography in one worksheet, and the true spatial relationships do not matter.

map_tableauPublic.png

 

Before I go into instructions on how to do this, I want to reiterate my disclaimer / warning from Part II, with a few minor updates:

 

DISCLAIMER
You want to be really careful here – you’re about to edit your Shapefile so that it has the wrong coordinate system listed for it (and/or move geographic data to the wrong place in the world).  Label the files that you are working with carefully; you do not want to accidentally try to use this in analyses outside of Tableau. We really are telling a white lie with data here. This is just a trick so that you can have a different ‘shape’ map in Tableau for visual analysis; it is not a pathway to success in general spatial data analysis.

 

Ingredients needed:

  • A spatial data file (Shapefile, JSON, etc.) [the example data that I used came from the US Census – I used 2016 Counties (and equivalent)]
  • QGIS – a FREE and open-source Geographic Information System (or you can use whatever GIS, database, programming language, or other tool you prefer for manipulating spatial data…there are plenty of options)

 

Caveats:

  • If you use this trick you will need to make the Tableau basemap fully transparent (it has to disappear) or make your own custom tiles in Mapbox.  It is best if the dataset you are working with is polygons that are familiar enough in shape that you don’t need the extra spatial context provided by the Tableau base map.

 

  • You will not be able to use the Tableau Map Search.  Remember from Part I, we are going to move things around on the base map, so if you search for a location Tableau will take you to where that spot is in Web Mercator coordinates, which may be very different than where your data is on the map. But, you can use filters or a highlighter to allow people to find locations based on a dimension in your dataset (so, in a sense, create your own custom map search based on the locations in your dataset).

 

If you’re feeling lazy… I’ve put my completed dataset up on Tableau Public for reference.

 

 

If you’re feeling adventurous, let’s dig in and make a new map!

 

Open your spatial file in QGIS.  Click on Layer -> Add Layer -> Add Vector Layer and select the .shp file, or other spatial file format of choice, that you want to work with (either your own file or the example data file linked above)

 

If needed, delete unnecessary bits.  Since the US county dataset includes territories as well as states, I am going to clean up a bit before we do our heavy editing.  This editing can be done graphically (Click on the pencil icon to start editing; use the selection tool to select the polygons to delete; then delete), but to save time, I’m going to do my selection in the attribute table:

Start editing – click on the pencil icon in the menu:

start_editing.png

Open the attribute table – right click on the layer name and select ‘Open Attribute Table’:

open_attribute_table.png

 

Click on the header in the ‘STATEFP’ column to sort ascending. Select all of the rows with a STATEFP value of 60 or greater (the state codes top out at 56, and the territory codes start at 60) and then delete.

 

Click on the pencil again to stop editing; save your edits.

 

We should now have a shapefile with counties from just the 50 US states (plus the District of Columbia, STATEFP 11).
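If you would rather script the cleanup than click through the attribute table, the selection logic above can be sketched in a few lines of Python. This is only an illustration: the rows are made-up stand-ins for attribute-table records, and only the STATEFP field name comes from the Census file. Territory codes start at 60, so the sketch treats 60 and above as territories.

```python
# Hypothetical attribute rows; only the STATEFP field name comes from the dataset.
rows = [
    {"NAME": "Autauga", "STATEFP": "01"},
    {"NAME": "Anchorage", "STATEFP": "02"},
    {"NAME": "Honolulu", "STATEFP": "15"},
    {"NAME": "Albany", "STATEFP": "56"},    # Wyoming, the highest state code
    {"NAME": "Guam", "STATEFP": "66"},
    {"NAME": "San Juan", "STATEFP": "72"},
]


def keep_states(records, first_territory_code=60):
    """Drop rows whose STATEFP is at or above the first territory code."""
    return [r for r in records if int(r["STATEFP"]) < first_territory_code]


states_only = keep_states(rows)
```

Note that the codes are compared as integers here, but they are kept as strings in the rows so the leading zeros survive.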

 

 

Check the projection of your dataset.  As we saw in Part II, so long as you have a projection defined (or know the projection and can set it), you are good to go.  The Census data that we are working with in this example is NAD83 (North American Datum of 1983), and has an EPSG code of 4269.   You can read up on those acronyms in Part II, or through the links that I attached to the key terms.

crs_original.png

Project your dataset into your preferred coordinate system.   This part may take a bit of trial and error, based on the dataset that you are working with.  You want to pick a projection that looks good for all of your regions.  For the US dataset, I find that a Lambert Conformal Conic is quite nice.  Note that this was just a ‘select based on look’ process, and not on the technical characteristics of the projection. 

To check out the look of different projections, click on the CRS button in the lower right corner of QGIS:

crs_button.png

Check the box at the top of the window to ‘Enable ‘on the fly’ CRS transformation.’  This will allow you to update the projection / coordinate reference system for the entire view in QGIS.  Search for Lambert and select the USA Contiguous Lambert Conformal Conic.

crs_setLambert.png

 

You should now have a US that looks something like this:

us_lambert.png

To save the geographic coordinates in this projection (not just draw them in this coordinate system), we need to save a new copy of the shapefile and make sure QGIS saves the coordinates in the new coordinate system. First, right click on the dataset name and Save As…  Browse to the folder where you would like to save the file, and set the CRS to Lambert Conformal Conic.  Uncheck the ‘Add saved file to map’ checkbox.

save_as_lambert.png

 

Delete the projection information.  QGIS is smart and when we start moving around polygons to new places on the map, it will update the coordinates for the polygon based on the distortion patterns in the projection – so the nice shape of the polygons in our projection may not be quite so nice when we move them.  We’re going to avoid that problem and make QGIS forget that it knows what the projection for the data is (if you want to see what happens if you don’t do this, just ignore this step, move to the next step in the instructions, and have fun watching the shapes warp as you move them around). 

 

Find the new file that you saved and delete the .prj and .qpj files.  This is just removing any of the files that QGIS would use to identify the projection for the dataset. 

prj_qpj.png
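If you do this often, deleting the sidecar files can be scripted. Here is a minimal Python sketch, assuming the projection files sit next to the .shp and share its base name (the function name is my own invention, not part of QGIS):

```python
from pathlib import Path


def strip_projection_sidecars(shapefile, extensions=(".prj", ".qpj")):
    """Delete the projection sidecar files that sit next to a .shp file.

    Returns the list of paths that were actually removed.
    """
    shp = Path(shapefile)
    removed = []
    for ext in extensions:
        sidecar = shp.with_suffix(ext)  # same base name, different extension
        if sidecar.exists():
            sidecar.unlink()
            removed.append(sidecar)
    return removed
```

Calling it with the path of the .shp you just saved would remove the .prj and .qpj files and leave the other pieces of the Shapefile (.shp, .shx, .dbf) alone.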

 

Back in QGIS, open a new project (Project -> New).  You do not need to save the changes to your current project.

 

Add your newly projected file (the one where you just deleted the prj and qpj files). When you add this to QGIS you should see a warning that there is no coordinate reference system defined.  If you don’t see this, remove your file and go back and make sure that you deleted both the prj and qpj files.

crs_warning.png

 

Move some geography!  To move Alaska and Hawaii to your preferred spot we just need to edit, select the geography, move to a new location, and (maybe) re-scale so that it’s a good size. 

Start editing by clicking on the pencil icon.

 

Select the geography that you want to move – let’s start with Alaska.  The easiest way to get all of the little bits (darned Aleutian Islands) is to select by an expression.

select_by_expression.png

We want to select all of the polygons where the STATEFP is ‘02’ (that is the official state code for Alaska):

select_by_expression_statefpis02.png

 

All of Alaska should now be selected.  It’ll be highlighted in yellow.  It may be hard to tell when zoomed out to the full extent of the dataset, since in edit mode all of the vertices are shown as big red x’s.

 

With the Alaska counties selected, use the Move Features tool

move_features.png

Grab Alaska, and drag it to approximately where you want it.

AK_move1.png

 

I think this is too large, so I’m going to re-scale the geography.  I use an Affine Transformation tool to scale the geography. If you don’t see an Affine Transformation tool in your Vector toolset, you will need to add it (Plugins-> Manage and Install Plugins; then search for affine transformation)

 

affine_transform_tool.png

Note that we’re really getting into lying with the data now – besides the distortion inherent in the projection, and the moving of geographies, we’re also now changing the relative size… So, I’m going to take a second to re-iterate and rephrase my original disclaimer: You are taking liberties with geographic data here. This is entirely for graphical benefit. Spatial relationships (size, distance, shape) are being altered!

 

Scale both the X and Y axes by 0.5 and click Transform.

affine_transform_AK.png

 

This may take a minute to finish…there are a lot of vertices to update.  When it’s done, you should have a half-size Alaska.    You will probably need to use the Move tool to update the location after it’s been scaled. My data before and after moving Alaska:

AK_move2.png
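To see why a second move is usually needed after scaling, here is a small Python sketch of the arithmetic. It assumes the scale is applied relative to the coordinate origin (0, 0), which is my reading of how a plain affine scale behaves rather than something documented here; the square is a made-up stand-in for Alaska, and the function names are hypothetical.

```python
def scale_about_origin(points, sx, sy):
    """Multiply every vertex by the scale factors; the fixed point is (0, 0)."""
    return [(x * sx, y * sy) for x, y in points]


def move(points, dx, dy):
    """Shift every vertex by the same offset, like dragging with Move Features."""
    return [(x + dx, y + dy) for x, y in points]


def center(points):
    """Center of the bounding box around the vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


# Made-up square standing in for Alaska, in unlabeled planar coordinates.
fake_alaska = [(-4000000, 3000000), (-3000000, 3000000),
               (-3000000, 4000000), (-4000000, 4000000)]

half_size = scale_about_origin(fake_alaska, 0.5, 0.5)

# Scaling also pulled the shape halfway toward (0, 0), so shift it back.
old_cx, old_cy = center(fake_alaska)
new_cx, new_cy = center(half_size)
back_in_place = move(half_size, old_cx - new_cx, old_cy - new_cy)
```

Under that assumption, halving the coordinates halves the distance from the origin as well as the size, which is why the shape appears to jump and needs to be dragged back.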

 

Now select Hawaii (“STATEFP” = ‘15’) and do the same process.  Hawaii probably doesn’t need to be scaled quite so much.

 

When you’ve finished, save the edits by clicking on the pencil icon again and saving.  

 

One last step – Make Tableau think this dataset is in Web Mercator coordinates.  Open the properties for your edited file (right click on the name -> properties) and where it says that the Coordinate reference system is EPSG:4326, WGS84 (this is the default QGIS uses when it doesn’t know the real coordinate system)…go ahead and click on the Select CRS button and change the projection listed to WGS84 / Pseudo Mercator (EPSG: 3857).

change_crs_final.png

 

Now right click on the name of the dataset and Save As… give the file a new name, make sure that it is using the EPSG:3857 Web Mercator (or Pseudo Mercator) projection. 

 

Add your new dataset to Tableau, set the washout on the base map to 100%, and enjoy!  To add in attributes, just join them in Tableau using the FIPS code (COUNTYFP holds the county-level codes in this dataset; note that the FIPS codes are STRINGS with leading zeros, so that, for example, all state codes are two digits).

map_inTableau.png
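The leading zeros are the usual stumbling block in that join. If your attribute table stored the codes as numbers, they need to be padded back into strings before they will match. A minimal Python sketch, assuming a hypothetical attribute table with integer state and county columns and made-up values:

```python
def county_fips(statefp, countyfp):
    """Build the 5-character county FIPS string: 2-digit state + 3-digit county."""
    return str(statefp).zfill(2) + str(countyfp).zfill(3)


# Hypothetical attribute table where the codes were stored as integers.
attributes = [
    {"state": 2, "county": 20, "value": 100},   # made-up value
    {"state": 15, "county": 3, "value": 200},   # made-up value
]

# Key the attributes by the padded string so they can match the spatial file.
by_fips = {county_fips(r["state"], r["county"]): r for r in attributes}
```

The same padding idea applies if you build the key inside Tableau with a calculated field instead.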

Sometimes people send me questions about how to work with custom map projections in Tableau.  For instance, Alan Eldridge asked me about working with polar ice data. Or, sometimes I get questions about making a US map with Alaska and Hawaii in a little inset in the corner of your map – without needing to use three worksheets, or how to make a map of Canada that is in the Lambert Conformal Conic projection (the most common projection used at Statistics Canada) instead of Web Mercator... It turns out that all of these things are possible in Tableau:

polarIce.png  us_ak_hi.png  Canada.png

 

Since working with map projections is a pretty big topic, and something that I like quite a bit, I’ve written up a few blog posts on it.  I’ve broken them into two parts: map projection basics and data manipulation for Tableau.  The two parts should stand alone, so if you want more background about what is going on under the hood, you can start here with Part I (map projection basics), and if you don’t care about the basic mappy science that goes behind the fun tricks, you can just jump to Part II (data manipulation)

 

Since we have a lot to cover, let’s get rolling and go through enough projection basics (in general) for you to be dangerous (er, creative) with map projections in your visualizations.

While I’ll give a brief overview of what happens in the projection process, I’m going to focus on a small set of topics important to working with alternate map projections in Tableau.  For more in-depth discussion of projections, I recommend:

 

What is a map projection?

Map projection is the process of transforming angular (e.g., latitude and longitude) coordinates into planar coordinates.   The projection is just a mathematical definition of how we go from the angles (spherical or geographic coordinate system) into a planar, projected coordinate system (the good ol’ Cartesian coordinates that you might remember from elementary school).

lat_lon.png  plateCarre.png
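To make "a mathematical definition" a bit more concrete, here is a short Python sketch of two projections that come up later in this post: Plate Carrée and the spherical Mercator formula used by Web Mercator. The function names are mine, and the radius is the WGS84 semi-major axis that Web Mercator is built on.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, the sphere radius Web Mercator uses


def plate_carree(lon_deg, lat_deg):
    """Treat longitude and latitude (in degrees) directly as planar x and y."""
    return (lon_deg, lat_deg)


def web_mercator(lon_deg, lat_deg):
    """Spherical Mercator: x grows linearly with longitude, y stretches with latitude."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return (x, y)


# Same spot on Earth, two very different pairs of planar coordinates.
southwest_corner_degrees = plate_carree(-124.7, 24.9)
southwest_corner_meters = web_mercator(-124.7, 24.9)
```

The first result stays in degrees, while the second lands in the millions of meters, even though both describe the same location.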

 

 

The fun (or maddening) part of map projections is that whenever we do this transformation from 3-dimensional angular coordinates to 2-dimensional planar coordinates we introduce distortion in one or more of the following: shapes, areas, angles, and distances…and just to make things more fun with distortion, know that it’s impossible to preserve both areas and angles on the same flat map!

Why do we have these distortion problems to deal with?  I find that an easy way to think about this is to peel the skin off an orange.  Is there any way to remove the skin and end up with a completely flat, un-torn, un-mangled peel? Even if you can manage a one-piece skin that looks pretty flat when you squish it down do you think it would look good as a map?  Probably not. See my attempts:

 

orange.png

To bring that back to the map, think of a map projection as the mathematical warping, stretching, tearing, and stitching back together of the orange peel / Earth surface.  There are infinite ways of doing this, so the possibilities are endless with respect to what your final map looks like. For instance:

 

4projections.png

 

When making maps, a good cartographer will try to do two things: 1) minimize distortions so that their map is as accurate as possible for the location being mapped and the purpose of the map (e.g., do you need to take angular measurements?  Are distances really important?  Are relative areas really important?), and 2) make the map look nice.

Equal area projections with two different looks…I like one more than the other…

2_equalArea.png

Most geopolitical entities (countries, states, provinces, counties, etc.) will have specific projections suggested for use in order to minimize the distortion in that area. It’s common to see the same general projection used in different locations, but with slight modifications to the parameters of the projection to tailor it to the location.  Here is a Lambert Conformal Conic with a central meridian of 0° and standard parallels at 20° and 50°:

lambert.jpg

 

 

For regional data, you might find Canadian data in this very same Lambert conformal conic map projection, but with standard parallels at 49° and 77° and with a central meridian of 91° 52’ W (Statistics Canada details here; image on the left below), while for the conterminous United States (the ‘lower 48’), you might see the same projection used, but with standard parallels of 33° and 45° and a central meridian of 96° W (image on the right below).  They are similar, but not the same – look at the stretching in the far north for a good visual indicator.  

2_canada.png

Or, consider them arranged on top of one another.  The projection with the parameters set to optimize for Canada are in orange, and for the US are in blue:

2_canada_overlay.png
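To show how the same projection changes with its parameters, here is a Python sketch of the Lambert Conformal Conic forward formula in its simpler spherical form. The standard parallels and central meridians are the ones quoted above. The latitudes of origin and the sphere radius are my assumptions for illustration, and a GIS would use the ellipsoidal version of the formula, so the numbers will not match QGIS output exactly.

```python
import math


def lambert_conformal_conic(lon_deg, lat_deg, lon0_deg, lat0_deg,
                            sp1_deg, sp2_deg, radius=6371000.0):
    """Spherical Lambert Conformal Conic with two standard parallels.

    lon0/lat0 set where (0, 0) falls; sp1/sp2 are the standard parallels.
    """
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    lam0, phi0 = math.radians(lon0_deg), math.radians(lat0_deg)
    phi1, phi2 = math.radians(sp1_deg), math.radians(sp2_deg)

    def t(p):
        return math.tan(math.pi / 4 + p / 2)

    # Cone constant: controls how tightly the meridians fan out.
    n = math.log(math.cos(phi1) / math.cos(phi2)) / math.log(t(phi2) / t(phi1))
    f = math.cos(phi1) * t(phi1) ** n / n
    rho = radius * f / t(phi) ** n      # distance from the cone apex to the point
    rho0 = radius * f / t(phi0) ** n    # distance from the cone apex to the origin
    x = rho * math.sin(n * (lam - lam0))
    y = rho0 - rho * math.cos(n * (lam - lam0))
    return (x, y)


# One location projected with the two parameter sets described above.
# The latitudes of origin (39 and 63.390675) are assumed values.
us_xy = lambert_conformal_conic(-100, 60, -96, 39, 33, 45)
canada_xy = lambert_conformal_conic(-100, 60, -(91 + 52 / 60), 63.390675, 49, 77)
```

The two results differ even though the input location is identical, which is the numeric version of the orange and blue outlines not lining up.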

 

While cartographers typically like to pick different projections based on the location they are mapping, in many general purpose, non-geographic information system (GIS) mapping tools, there is only one projection to choose from and that one projection has to work for mapping local-scale (e.g., city), global-scale, and everything in-between.  This projection is Web Mercator.  I will use this particular projection as the example moving forward to explain the basic principles behind bringing your projected spatial data into Tableau.

Well, hello Web Mercator.  It’s nice to meet you!

Tableau uses the Web Mercator map projection for mapping; this is the same map projection that you’ll see in pretty much every web mapping service (Google Maps, Yahoo Maps, etc.).  While some people have strong negative opinions of this projection due to the distortion in the high latitude regions, for better or worse, it has become the standard for online mapping.  

 

Why is this particular map projection so popular?  Most map services are designed for mapping from local to global scale, and we have to pick a single projection to work with. No single projection will be optimal at every scale, but this one has some benefits:

 

  1. it’s rectangular so it is easy to chop into the individual little map tiles that nest inside each other for viewing at different scales,
  2. the math behind it is simple and quick,
  3. because it is a rectangular projection, North is always the same direction (typically the top of the map), and
  4. because it is (nearly) conformal (preserves angles on the map), for local-scale mapping we don’t see things like distortions of the angles at which roads intersect.

There are some other reasons, and if you are really a glutton for map projection punishment, I can recommend a bit of reading that you might enjoy (Implications of Web Mercator and its use in online mapping).

 

Now let’s get to the fun stuff – what do you need to master if you want to make maps in Tableau that are not displayed in Web Mercator coordinates? It’s really just remembering a simple property of map projections: the projected coordinates are just one big Cartesian coordinate system. In order to draw your geographic data, the system displaying your data (Tableau, a GIS, whatever) doesn’t care what projection you are using; it just wants to know the x and y coordinates for every vertex in your data, and then each vertex is dropped onto the right spot on the Cartesian coordinate system.

 

What does that mean to you? It means that you get a big blank ‘canvas’ on which you can draw your data, and so long as your data set’s coordinates are within the right range (and your coordinates think they are in the same coordinate system as our base map, Web Mercator), you should be able to drop any polygons you want onto the map.  Here is what that map ‘canvas’ looks like in Tableau with our Web Mercator base map underneath:

 

grid1.png

 

Take, for example, the coordinates for the conterminous United States (‘lower 48 states’)….here is the same dataset in a few different map projections – note the differences in shapes and bounding box coordinates listed for each:

 

Pretty picture | Details

plateCarre_small.png
Plate Carrée - Treats spherical coordinates as planar coordinates
Bounding box for the US:
xMin, yMin (-124.7, 24.9)
xMax, yMax (-66.9, 49.4)
The range will be between -180 and 180 for x and -90 and 90 for y (degrees of longitude and latitude)

wm_small.png
Web Mercator
Bounding box for the US:
xMin, yMin (-13885338.72, 2870437.22)
xMax, yMax (-7454921.06, 6338171.97)

albers_small.png
Albers Equal Area
Bounding box for the US:
xMin, yMin (-2235903.34, -1643504.03)
xMax, yMax (2125235.69, 1322165.76)

 

You should be able to see in the bounding box coordinates that they would each draw in different locations on our ‘canvas,’ but let’s go ahead and throw them all into a Tableau base map to see where they fall.  Before we do this, think back to our graphic of the coordinate system that we’re working with, take a look at the bounding boxes listed above for each of the datasets (in projected coordinates), and imagine where they will fall on the map. Then keep reading to see if they line up with what you expected.

 

Plate Carrée – when zoomed to the extent of the data it looks just as expected.

plateCarre_inTableau.png

But when we zoom out, something strange happens…our states disappear!  Why?  Think about where data with this bounding box falls in a coordinate system that runs from roughly -20 million to +20 million meters:

xMin, yMin (-124.7, 24.9)

xMax, yMax (-66.9, 49.4)

 

Yes, it becomes a super-tiny set of polygons right around 0,0.

 

plateCarree_inTableau_zoom.png
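A quick bit of arithmetic shows just how tiny. This Python sketch compares the width of each bounding box with the full width of the Web Mercator canvas; the bounding box numbers are the ones listed earlier, and the function name is just for illustration.

```python
WEB_MERCATOR_HALF_WIDTH_M = 20037508.34  # the canvas runs from minus this to plus this


def fraction_of_canvas(x_min, x_max):
    """Share of the full Web Mercator width covered by a bounding box."""
    return (x_max - x_min) / (2 * WEB_MERCATOR_HALF_WIDTH_M)


plate_carree_share = fraction_of_canvas(-124.7, -66.9)
web_mercator_share = fraction_of_canvas(-13885338.72, -7454921.06)
```

The Web Mercator version of the US spans a healthy slice of the canvas, while the Plate Carrée version covers only about a millionth of it, so it collapses to a speck near 0,0 when you zoom out.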

 

Web Mercator – it shows up in the right place…because the coordinates are just right for dropping that US shape on top of the US in our Web Mercator base map

 

wm_inTableau.png

 

Albers Equal Area - looks good, but totally in the wrong place! Why?  Because this particular variant of the Albers projection was tailored to be centered on the US, which puts 0,0 right in the middle of the US.

Albers_inTableau.png

 

The big so-what…

Map projections just define the translation from angular to planar coordinates.  Once your data is in planar coordinates, so long as the range on those coordinates fits inside the extent of our Web Mercator map tiles, you can put any data you want onto the map.  But…as you can see above, the Tableau base map tiles may not provide quite the right context underneath your map data.  
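That "fits inside the extent" condition is easy to check before you bring a dataset into Tableau. A minimal Python sketch, using the bounding boxes from the table above and a hypothetical helper name:

```python
def fits_on_canvas(bbox, half_width=20037508.34):
    """True if every corner coordinate falls inside the Web Mercator tile extent.

    bbox is (x_min, y_min, x_max, y_max) in whatever planar units the data uses.
    """
    return all(-half_width <= value <= half_width for value in bbox)


plate_carree_ok = fits_on_canvas((-124.7, 24.9, -66.9, 49.4))
web_mercator_ok = fits_on_canvas((-13885338.72, 2870437.22, -7454921.06, 6338171.97))
albers_ok = fits_on_canvas((-2235903.34, -1643504.03, 2125235.69, 1322165.76))
```

All three pass, which matches what we saw: each one draws, just not necessarily in a place that lines up with the base map.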

 

In my next post, I’ll go through an example of how to adjust projections on your spatial datasets and bring them into Tableau.

In Part I: map projection basics, we took a look at what map projections are in general, learned about the map projection that is used in the Tableau base map (Web Mercator), and saw a few examples of what a single dataset represented in different coordinate systems would look like if it were drawn in the Web Mercator coordinate system.

In Part II, we’re going to walk through an example of how you can move away from the Web Mercator projection with some creative data adjustment.

 

Ingredients needed:

  • A spatial data file (Shapefile, JSON, etc.) [the example data that I used came from Statistics Canada]
  • QGIS – a FREE and open-source Geographic Information System (or you can use whatever GIS, database, programming language, or other tool you prefer for manipulating spatial data…there are plenty of options)

 

Caveats:

  • If you use this trick you will need to make the Tableau basemap fully transparent (it has to disappear) or make your own custom tiles in Mapbox.  It’s best if the dataset you are working with is polygons that are familiar enough in shape that you don’t need the extra spatial context provided by the Tableau base map.

 

  • You will not be able to use the Tableau Map Search.  Remember from Part I, we are going to move things around on the base map, so if you search for a location Tableau will take you to where that spot is in Web Mercator coordinates, which may be very different than where your data is on the map. But, you can use filters or a highlighter to allow people to find locations based on a dimension in your dataset (so, in a sense, create your own custom map search based on the locations in your dataset).

 

So, now let’s dig in and see how this all works!

 

Open your spatial file in QGIS.  Click on Layer -> Add Layer -> Add Vector Layer and select the .shp file, or other spatial file format of choice, that you want to work with (either your own file or the example data file linked above)

partII_qgis_addLayer.png

 

 

Check the projection of your dataset.  The short story here is that so long as you have a projection defined (and that it is correct), you are good to go.  If you want to know more about the projection of the dataset you are working with you can right click on the listing for your dataset in the legend on the left side of the QGIS window and select Properties.  In the properties window, look under General and Coordinate reference system.

 

partII_qgis_checkPrj.png

 

The Coordinate reference system information for our sample file tells us that this is “EPSG:4269 – NAD83.”  Whaaaa? This information is actually pretty simple to decode once you know the trick:

EPSG = European Petroleum Survey Group.  They catalogue definitions for map coordinate systems and assign each a unique reference code.

4269 = The reference code for this particular coordinate system.  You can get more information for this particular code at the spatialreference.org website.

NAD83 = North American Datum of 1983.

 

You can also find more information about the coordinate system by looking at the metadata properties:

partII_qgis_prjProperties.png

 

What I read here is that the dataset has a bounding box that is within the range I would expect for latitude and longitude (longitude: -180 to 180, latitude: -90 to 90), so even if I didn’t know that the dataset was in EPSG:4269 I could make a good guess that it was just plain ol’ latitude and longitude coordinates.
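That eyeball test can be written down as a rough heuristic. Here is a small Python sketch (the function name is made up); keep in mind that it is only a guess, since a small projected dataset sitting near 0,0 would pass the same check.

```python
def looks_like_lon_lat(x_min, y_min, x_max, y_max):
    """Rough guess: do these bounds fit inside the longitude/latitude range?"""
    return -180 <= x_min <= x_max <= 180 and -90 <= y_min <= y_max <= 90


# Bounds in degrees versus bounds in projected meters (values from Part I).
degrees_guess = looks_like_lon_lat(-124.7, 24.9, -66.9, 49.4)
meters_guess = looks_like_lon_lat(-13885338.72, 2870437.22, -7454921.06, 6338171.97)
```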

 

 

Project your dataset into your preferred coordinate system.  We could bring this Shapefile into Tableau directly using the spatial file support in Tableau 10.2…and it would work just fine:

partII_shpInTableau.png

 

But…I assume that since you’re reading this, you really want to see the data in a different coordinate system – like, perhaps, the coordinate system recommended by Statistics Canada (like the Statistics Canada Lambert, which happens to be defined by EPSG:3347).

To update the projection for our dataset, right click on the dataset name in the legend on the left side of the QGIS window and select ‘Save As…’

 

partII_qgis_saveas.png

Give the file a new name, and change the CRS to ‘Selected CRS’ and then click on the ‘Change…’ button.  This will open the Coordinate Reference System Selector (shown on the right below).  Find the coordinate system that you want to use – I just typed in the EPSG code (3347) because I happened to know it.  Otherwise you can scroll through the large number of coordinate systems supported in QGIS and select the one you like.

partII_qgis_saveAsChangePRJ.png

 

Check out the properties of your new file.  If you look at the properties for the new file that you saved, you should see updated information for the coordinate system and for the bounding box listed in the metadata.  Use the same steps we used to check the projection of the original file above.

Note how much larger these coordinates are and that the definition states that the coordinates are now provided in ‘m’ (meters).

partII_qgis_layerProperites.png

 

If you open a new project in QGIS and add this file, you’ll also notice that it looks different than the original:

partII_twoCanadas.png

 

Note that we could bring this NEW file, in the new coordinate system, into Tableau 10.2 and it would also work just fine.  Tableau is smart enough to look for the projection definition in your file and do the conversion to put it into the same coordinate system as our basemap.

 

partII_canadaInTableau.png

 

Now we get to the fun part...Tell a little white lie with your data.  We’ve seen above that Tableau will take a projected Shapefile and convert it into Web Mercator coordinates to match our basemap.  But, sometimes we don’t want Web Mercator.  So, here is where the fun comes in.  We make our data think that it’s in Web Mercator so that it will use that coordinate system when it draws.

 

DISCLAIMER
You want to be really careful here – you’re about to edit your Shapefile so that it has the wrong coordinate system listed for it.  Label this file carefully; you do not want to accidentally try to use this in analyses outside of Tableau.  We really are telling a white lie with data here. This is just a trick so that you can have a different ‘shape’ map in Tableau for visual analysis; it is not a pathway to success in spatial data analysis.

 

Now that I’ve gotten that out of the way, we will continue with the process of little white data lies.

 

Open the properties for your newly projected file (right click on the name -> properties) and where it says that the Coordinate reference system is EPSG:3347 – NAD83 / Statistics Canada Lambert…go ahead and click Specify and just change that to WGS84 / Pseudo Mercator (EPSG: 3857).

 

partII_updatePrjToWM.png

 

What that did was update the text that defines the coordinate system for your dataset. It did not actually change the coordinate system, it just changed what the file thinks its coordinate system is.
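The difference between relabeling and reprojecting is easy to lose track of, so here is a toy Python sketch of the relabeling half. The "layer" is just a dictionary I made up to stand in for a spatial file, and the coordinates are invented; nothing here is a QGIS API.

```python
def relabel_crs(layer, new_crs):
    """Change only the CRS label; the coordinates are copied through untouched."""
    return {"crs": new_crs, "coords": list(layer["coords"])}


# Made-up vertices, pretending they are in Statistics Canada Lambert meters.
lambert_layer = {
    "crs": "EPSG:3347",
    "coords": [(7000000.0, 1200000.0), (7500000.0, 1300000.0)],
}

faux_web_mercator = relabel_crs(lambert_layer, "EPSG:3857")
```

A true reprojection would recompute every coordinate pair; the relabel leaves the numbers exactly as they were and only changes what the file claims they mean.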

 

Now you need to save a new copy of the Shapefile (right click -> Save As…) to make sure that the new definition is saved.  When I do this, I like to append some information in the file name to remind myself that this is a faux Web Mercator file.

 

Now what will happen if we add this new file to Tableau??

 

We have our newly projected data available in maps!

partII_prjCanadaInTableau_noBaseMap.png

Granted, you need to make the basemap transparent so that you don’t see that this is just a little data trick…

partII_prjCanadaInTableau.png

 

Why would we want to do this?  Check out a side-by-side comparison of the data in Web Mercator coordinates vs. Statistics Canada Lambert… The distortion on Web Mercator makes a big difference in the visual for this dataset!

Canada.png

 

 

Using this trick you should now be able to see and understand (and communicate!) using projected spatial data in Tableau.