I’ve been working on some cool new demos for my TC17 talk with Alan Eldridge on working with dense datasets (“Masters of Hex: Interpreting Dense Data with Tableau”) and thought I’d share one fun example early – jittering point locations on maps. Alan was my creative inspiration for this, and I have to give him complete credit for a lot of the brilliance here, since I have based my work on his example from the Tableau Community forums several years ago.
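Alan created a killer example to jitter points on a scatterplot in a circular pattern around the origin location. I’ve extended his example to show three ways of jittering points on a map:
- Regular distribution of points around the center
- Random distribution around the center, within a rectangular bounding box
- Random distribution around the center, within a circular bounding area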
All of these have an adjustable parameter for the maximum distance to use in jittering.
There is a partial workbook of examples up on Tableau Public. The random() calculation doesn’t work with the extract, though, so I’ve attached a full workbook to this post so you can enjoy the randomness of all of the examples (the workbook was made in Tableau 10.2).
Regular distribution of points around center
Our goal here is to set up calculated fields to tell us:
Is there more than one point at this location? If so, jitter. If not, use the actual location.
I’m assuming that you are starting out with a bunch of rows of data, you have latitude and longitude for point locations, and many of these points are at exactly the same location.
An easy way to see how many points you have overlapping is to drop Latitude and Longitude onto a viz, then use Number of Records on size to count everything up:
Our first step is going to be making sure that we have a way to quickly identify how many points we have at each location, and based on the total number of points figure out where to place them so that they are evenly spaced.
While the count could be calculated with an LOD calculation, like this:
it ends up being more flexible to use Index() to assign a sequential value for each record with the same latitude and longitude values.
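For anyone who wants to see the idea outside of Tableau, here is a minimal Python sketch of what "a sequential value for each record with the same latitude and longitude" means. It is only an illustration of the logic, not the workbook's calculation; the function name `index_by_location` and the sample coordinates are made up for this example.

```python
from collections import defaultdict

def index_by_location(rows):
    """Return (index, count) for each row, numbered within its (lat, lon) group.

    rows is a list of (latitude, longitude) tuples. The index starts at 1 and
    restarts for every distinct coordinate pair, much like an Index() table
    calculation that is partitioned by location.
    """
    seen = defaultdict(int)
    indexes = []
    for lat, lon in rows:
        seen[(lat, lon)] += 1
        indexes.append(seen[(lat, lon)])
    # After the loop, seen holds the total number of rows at each location.
    return [(idx, seen[loc]) for idx, loc in zip(indexes, rows)]

# Hypothetical sample: three points share one location, one stands alone.
rows = [(47.6, -122.3), (47.6, -122.3), (45.5, -122.7), (47.6, -122.3)]
print(index_by_location(rows))  # [(1, 3), (2, 3), (1, 1), (3, 3)]
```

The largest index in each group doubles as the point count for that location, which is why a single index can serve both purposes.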
First we’ll get an index value for each row:
When you use a table calculation like this, you need to be careful with the level of detail for the calculation.
When we use that index value, we need to make sure that it’s numbering our rows for each level of detail (pairs of lat/lon values where we have overlapping data points) correctly. I find it easier to sort this out when looking at the data in a table, versus in a map – because I can more quickly see the index values and assess if they are what I expect.
To check that my indexing is correct, I’ve made a table to show the index values for each set of overlapping points. Since my dataset is based on zip code centroids, I can make a table with the zip code and ID fields, and then drop index in as text.
If you only have latitude and longitude as your identifiers showing which points overlap, you can still create a combined field to get the same table - in the table below, I just combined the latitude and longitude fields to make a really long string with unique coordinate pairs.
The default index just counts up sequentially and doesn’t re-start the count for each group of points with the same location. To fix this, we can just change the Index table calculation to be based on ID and now it’ll re-start numbering appropriately. The example below shows two copies of the index – one is the default based on the Table (Down), and one is along ID. The index calculated along ID restarts the index numbering for each zip code:
If we have both an ID field and an attribute that we want to use in symbolizing the jittered points, we need to edit this so that we compute using both ID and Value:
Here are a few examples of different table calculations so that you can see the difference when we add in Value as a dimension on our viz:
Okay, we have index values now – and hopefully know how to set up the correct table calculation, but how can we use this information to distribute our points evenly around the center point like these examples?
There is a bit of math involved, but it mostly boils down to the fact that there are 360 degrees in a circle, and simple division tells us the number of degrees between each point (360° / 10 points = 36° between each point; 360° / 3 points = 120° between each point).
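As a quick check of that division, here is a small Python sketch that lists the angle for every point at a location. The function name `jitter_angles` is hypothetical, and starting the first point at 0 degrees is my assumption; any fixed starting angle spaces the points just as evenly.

```python
def jitter_angles(point_count):
    """Evenly spaced angles, in degrees, for point_count points on a circle."""
    step = 360.0 / point_count
    # Index runs 1..point_count; (index - 1) puts the first point at 0 degrees.
    return [(index - 1) * step for index in range(1, point_count + 1)]

print(jitter_angles(3))  # [0.0, 120.0, 240.0]
```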
First, we can calculate out the angle for each point like this:
And then we can make our new latitude and longitude calculated fields so that we know where to put each point on the map. It’s a bit more math, but not too tricky. We just need to know:
- Do we have more than one point at a location? If so, we need to jitter. (Point count – we can check how many points are at a location by looking at the max of our index.)
- How far from the original location do we want it to be? (Jitter distance – which I’ve set up as a parameter so that it’s easy to change.) On a map, these numbers will be pretty small, because by default they are measured in degrees. It would be possible to set these in meters, miles, kilometers, or another measure, but the calculation of the new jittered point location will be more complicated (there are examples of this in the attached workbook; look for Jitter latitude (# of miles) and Jitter longitude (# of miles)).
- Where do we place it around the original location? (Jitter Angle)
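To give a feel for why a distance in miles complicates things, here is a rough Python sketch of converting miles into degree offsets. It is not the calculation from the attached workbook: the name `miles_to_degrees` is hypothetical, and the figure of about 69 miles per degree of latitude is a common round approximation that I am assuming here.

```python
import math

MILES_PER_DEGREE_LAT = 69.0  # rough average; an assumption, not from the workbook

def miles_to_degrees(miles, latitude_deg):
    """Approximate a distance in miles as (lat_degrees, lon_degrees) offsets."""
    lat_degrees = miles / MILES_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    lon_degrees = miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(latitude_deg)))
    return lat_degrees, lon_degrees

print(miles_to_degrees(5, 47.6))
```

The longitude offset grows as you move away from the equator, so the same distance in miles needs a different number of degrees depending on where the point sits.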
And now for the little bit of math. Calculating where a point falls on the circumference of a circle looks like this:
x = origin_x + radius * cos(angle)
y = origin_y + radius * sin(angle)
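Putting the three pieces together (point count, jitter distance, jitter angle), the circle formulas can be sketched in Python as follows. This is a sketch under my own assumptions rather than the workbook's calculated fields: the function name `jitter_on_circle` is hypothetical, longitude is treated as x and latitude as y, and the first point is placed at an angle of 0.

```python
import math

def jitter_on_circle(lat, lon, index, point_count, distance):
    """Place point number `index` (1-based) on a circle around (lat, lon).

    distance is in degrees, like the jitter distance parameter.
    A location with a single point is returned unchanged.
    """
    if point_count <= 1:
        return lat, lon
    angle = math.radians((index - 1) * 360.0 / point_count)
    # Longitude plays the role of x, latitude the role of y.
    return lat + distance * math.sin(angle), lon + distance * math.cos(angle)

# Four hypothetical points stacked at the same location.
for i in range(1, 5):
    print(jitter_on_circle(47.6, -122.3, i, 4, 0.01))
```

Note the conversion to radians before calling the trig functions; the angle is easier to reason about in degrees, but `math.sin` and `math.cos` expect radians.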
Or, in Tableau-world:
Since these calculated fields are giving us new latitude and longitude values, we need to set them each to the right geographic role type:
And now we can make our map.
I think the hardest part of this is just making sure that we have the table calculations compute correctly. This all boils down to whether or not we’re getting our index calculated right – which is why I generally start out by making a table where I can just print out the index values.
Drop your Jitter latitude and Jitter longitude measures onto a worksheet – it might be blank, or you might just get a single point at 0,0…either way, it probably isn’t what you want! Why? These are table calculations and you have no level of detail on the viz. So you need to fix that to get a useful viz.
Try dropping Zip Code on ‘Detail’ – now you should have a bunch of marks – we’re starting to get somewhere!
Tableau likes to aggregate things by default, so we need to add a dimension to disaggregate – we’ll add the ID to detail as well. The viz won’t change by default when you do this, because we still need to fix our table calculation so that the index is correct. I can check the index by adding it to detail and then selecting all of the marks at one location (just use a rectangular select) and then view the data. For the location that I selected, there were four marks, and the index for each of them is 1. Recall that clause in our jittering calculation that if we only have one point it won’t jitter…since the index never goes above 1, we never jitter.
To fix this up, we just need to update our table calculation for the Jitter latitude and Jitter longitude so that they are also computed using ID.
Note that if you have Index on detail or color or anything else on the Marks card (and you want it to match the index calculation for jittering), you’ll need to update the table calculation there as well.
And your map should update to show you the jittered points! Here are a few of them color encoded by Zip Code.
In the image above, I’ve color encoded based on Zip Code. What if I want to encode based on a different attribute, like the Value dimension in my example dataset? Things might get a little weird:
What happened? Our index values changed! Remember that table we made at the start of the post? Take a look at the difference in calculated index values when we have Zip Code, ID, and Value in our viz!
Index along ID only gives us a count of 1 to n for each unique combination of Zip Code and Value. That doesn’t help us here… we want our index to be consistent for each Zip Code (regardless of differences in Value). So, let’s add Value to the dimensions in our table calculations and all should be good. Right click on the Jitter latitude and longitude pills and Edit Table Calculation. For Jitter longitude and Jitter latitude, make sure you change all of the nested calculations as well!
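If the effect of the partitioning is hard to picture, this small Python sketch mimics it with made-up rows. The helper `index_within` and the sample data are hypothetical; the point is only to show how the numbering changes when the index restarts for each Zip Code versus each combination of Zip Code and Value.

```python
from collections import defaultdict

def index_within(rows, key):
    """1-based running index that restarts for each distinct key(row)."""
    counters = defaultdict(int)
    result = []
    for row in rows:
        counters[key(row)] += 1
        result.append(counters[key(row)])
    return result

# Hypothetical rows: (zip_code, id, value)
rows = [("98101", 1, "A"), ("98101", 2, "B"), ("98101", 3, "A"), ("98101", 4, "B")]

by_zip = index_within(rows, key=lambda r: r[0])
by_zip_and_value = index_within(rows, key=lambda r: (r[0], r[2]))
print(by_zip)            # [1, 2, 3, 4]
print(by_zip_and_value)  # [1, 1, 2, 2]
```

With the second numbering, two pairs of points share an index, so they would land on the same jittered spot, which is the "weird" result on the map.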
And now you have a map with jittered points!
If you don’t like the fact that the points aren’t in a perfect circle on the map, check out the variant in the Tableau public demo where I do some math on the sphere…just because, well, it’s sort of fun to dredge up a bit of that high school trig…
But, I promised you three ways to jitter your points and I’ve only delivered one so far. So, let’s power through a few random jittering options.
Random distribution around center #1
These next two techniques are good if you need to mask or anonymize locations. With the regular distribution, it’s easy to tell where the original location for the points was – but sometimes you need to hide that original location…so, we’ll randomize.
Quick (but important) note: This will not work if the data is an extract, so it won’t upload nicely to Public. But it works just fine if you save your workbook as a twbx (at least it did when I made the demo files in Tableau 10.2).
For this application, we don’t need to set an index because we aren’t arranging the points in a sequence; we just need random new latitude and longitude values that are within a specific range for each point. That translates to equations along these lines:
New Latitude = random number * (maximum latitude – minimum latitude) + minimum latitude
New Longitude = random number * (maximum longitude – minimum longitude) + minimum longitude
Or calculated fields like these:
Change those new calculated fields to have the right geographic roles and map away!
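For comparison, the rectangular random jitter described above can be sketched in Python like this. It follows the two equations directly, with the minimum and maximum taken as the original coordinate minus and plus the jitter distance; that reading, the function name `jitter_in_box`, and the seed are my assumptions for the sake of a repeatable example.

```python
import random

def jitter_in_box(lat, lon, max_distance, rng=random):
    """Return a random location within +/- max_distance degrees of (lat, lon)."""
    min_lat, max_lat = lat - max_distance, lat + max_distance
    min_lon, max_lon = lon - max_distance, lon + max_distance
    new_lat = rng.random() * (max_lat - min_lat) + min_lat
    new_lon = rng.random() * (max_lon - min_lon) + min_lon
    return new_lat, new_lon

rng = random.Random(17)  # seeded only so the sketch is repeatable
print(jitter_in_box(47.6, -122.3, 0.05, rng))
```

Each coordinate gets its own random number, which is what makes the jittered points fill a box rather than a diagonal line.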
Random distribution around center #2
The last example of random jittering showed placement within a rectangular bounding box (+/- a set distance from the origin). What if you want to jitter within a circular bounding area? Easy peasy…we can just calculate out a random distance from the center at a random bearing and a little bit of math does all of the hard work for us! The important thing to know with this example is that all of the angles (latitude, longitude, and our random bearing) need to be in radians – and then the final result has to be converted back to degrees for Tableau to map it.
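Here is one way that circular jitter could look in Python. I am assuming the standard destination-point formula on a sphere (start point, bearing, angular distance), which matches the description of working in radians and converting back to degrees, but the workbook's exact calculation may differ. The name `jitter_in_circle` is hypothetical, and the maximum distance is treated as degrees of arc.

```python
import math
import random

def jitter_in_circle(lat_deg, lon_deg, max_distance_deg, rng=random):
    """Move a point a random distance (up to max_distance_deg) at a random bearing."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    bearing = rng.random() * 2 * math.pi
    # Angular distance travelled along the sphere, in radians.
    d = math.radians(rng.random() * max_distance_deg)
    lat2 = math.asin(math.sin(lat1) * math.cos(d)
                     + math.cos(lat1) * math.sin(d) * math.cos(bearing))
    lon2 = lon1 + math.atan2(math.sin(bearing) * math.sin(d) * math.cos(lat1),
                             math.cos(d) - math.sin(lat1) * math.sin(lat2))
    # Convert back to degrees so the result can be mapped.
    return math.degrees(lat2), math.degrees(lon2)

rng = random.Random(17)  # seeded only so the sketch is repeatable
print(jitter_in_circle(47.6, -122.3, 0.05, rng))
```

One design note: drawing the distance uniformly tends to cluster points toward the center. If an even spread over the circle matters, taking the square root of the random number before scaling it is a common adjustment.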
And that’s about all there is to it. A few different ways to jitter point locations on maps in Tableau. The attached workbook has all of the calculations, some notes, and a few additional examples to play with!
In the long run I’d like to add an option to jitter the data into cool spirals of points so that when I have a lot of points I can distribute them to avoid any overlap. I had some fun with that idea today and made a spiral viz to show the prime numbers between 1 and 1000…but I’ll save the write-up on that (and how to get these spirals to show up on a map!) for my next long plane ride or another time when I just need some good, Tableau-fueled entertainment. Or else when someone posts a desperate plea on the Tableau forums for spiral jittered points.