33 Replies. Latest reply on Jul 9, 2015 10:04 AM by Tableautester
• ###### 15. Re: Hexbins for visualizing million+ data points

It may look close - but I've realised my approach to binning needs to go back to the drawing board.

I'll pick away at it for a while, I'm still pretty sure it's doable...

• ###### 16. Re: Hexbins for visualizing million+ data points

Posting to cheer you on Richard, this is very cool!

• ###### 17. Re: Hexbins for visualizing million+ data points

I've got it working now. Will write something up and post a sample when I get a bit of time. Here are a couple of examples at different resolutions to give an idea of how it looks.

• ###### 18. Re: Hexbins for visualizing million+ data points

Well it was a bit of fun working out how to do this - and I think the result looks quite effective - but it's quite a lot of work and quite fiddly to set up, so I'm really not convinced I'll ever use this in practice.

Nonetheless, I promised I'd explain how it works, so I set out to put a simple example together. But that really highlighted how much effort is needed to copy all the calculated fields into another workbook and hook them up to get them working (more on that below) - so I'll make do with just posting a copy of the workbook I used for the screenshots above and some notes on how it works. I'm afraid this means there's a bit of extraneous noise in the workbook from where I was experimenting with stuff while I got it working.

The workbook includes examples of both approaches I talked about: drawing the hexagons as polygons and also using a custom hexagon shape.

The bulk of the work is the same in both cases: working out which data points fall into which hexagonal bin. Here's an outline of how it works.

1) Identify the x and y coordinates to be analysed. In the example workbook, the original data has high-resolution timestamps, but I've converted these into seconds for the x axis and milliseconds for the y axis prior to binning (it's much easier to work in numbers than datetimes).

2) Determine the size of the hexagonal bins by scaling the x and y coordinates so that the scaled values represent the length of a side of a hexagon. The parameters [X Scale] and [Y Scale] allow experimenting with different scale factors.

The scaled x and y coordinates are [x_scaled] and [y_scaled] in the workbook. In this case I've set it up so that [x_scaled] can be varied between 1 and 20 seconds and [y_scaled] can be varied between 0.1 ms and 2 ms. Those just happen to be the ranges within which the behaviour I was analysing manifested itself.

3) Work out which bin each data point falls into, as follows. (Refer to the "hexagon binning" PDF linked to by Julius above for a fuller description of how this works.)

Consider an imaginary grid of hexagonal bins, interlocking as shown:

Note how there are two sets of rows of hexagons, with alternate rows offset by half a hexagon.

Take the positions of the centres of each of the blue hexagons and use these to form one rectangular grid, and do the same with orange hexagons to form another rectangular grid. In the example workbook, the grids are represented by [grid_1_bin_x], [grid_1_bin_y] and [grid_2_bin_x], [grid_2_bin_y].

For each data point, find which cell of each of the two rectangular grids it falls into, as shown below by the two rectangles enclosing the sample data point (the red dot).

Find the closest corner of each rectangle to the data point (i.e. the corner of the blue rectangle that is inside the orange rectangle, and vice versa). The closest corners are given by [grid_1_nearest_x], [grid_1_nearest_y] and [grid_2_nearest_x], [grid_2_nearest_y] in the workbook.

Finally, work out which of these two "nearest corners" is closer to the data point in question. In the example above, it is clearly the corner of the blue rectangle. (Working out which one is closest just needs Pythagoras - see, there was a point in learning that at school after all.)

The chosen corner is the centre of the hexagonal bin that this data point belongs in, given by [hex_bin_x], [hex_bin_y] in the workbook.
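For readers who find code easier to follow than prose, step 3 can be sketched in Python roughly as below. This is not a copy of the workbook's calculated fields. It assumes the coordinates have already been scaled so that one unit is the length of a hexagon side, and it assumes "pointy-top" hexagons (rows offset horizontally by half a hexagon). Under those assumptions the centres within one grid are sqrt(3) apart in x and 3 apart in y, and the second grid is shifted by half a cell in both directions. The function name `hex_bin` and the constants `W` and `H` are made up for the illustration.

```python
import math

# Assumed geometry: side length 1 in scaled units, pointy-top hexagons.
W = math.sqrt(3.0)  # x spacing between centres within one grid
H = 3.0             # y spacing between centres within one grid


def hex_bin(x_scaled, y_scaled):
    """Return the centre of the hexagonal bin containing the point."""
    # Grid 1 has centres at (i * W, j * H): nearest one is found by rounding.
    g1x = math.floor(x_scaled / W + 0.5) * W
    g1y = math.floor(y_scaled / H + 0.5) * H

    # Grid 2 has centres at ((i + 0.5) * W, (j + 0.5) * H).
    g2x = (math.floor(x_scaled / W) + 0.5) * W
    g2y = (math.floor(y_scaled / H) + 0.5) * H

    # Pythagoras: compare squared distances (no need for the square root).
    d1 = (x_scaled - g1x) ** 2 + (y_scaled - g1y) ** 2
    d2 = (x_scaled - g2x) ** 2 + (y_scaled - g2y) ** 2

    return (g1x, g1y) if d1 <= d2 else (g2x, g2y)
```

A point near the origin, such as `hex_bin(0.1, 0.1)`, lands in the grid-1 bin centred on the origin, while a point such as `(0.9, 1.4)` lands in the grid-2 bin centred at half a cell in each direction. Every point ends up no more than one side length away from its bin centre, which is a handy sanity check.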

4) Having established the bin, the actual hexagons can be drawn either with a custom hexagonal shape, or by drawing polygons.

The shape approach is the easiest - though it does require creation of a custom shape file of a regular hexagon with a transparent background. I'd never done that before, but I hunted out an old forum posting which gave some hints and suggested using Paint.NET. I've attached the (rather shoddy) PNG file I created - I'm sure Shawn will produce a better one if you ask nicely, if anyone really wants to use this approach.

Having put the custom shape in your repository, simply put [hex_bin_x] and [hex_bin_y] on columns and rows, SUM([Number of Records]) on colour plus whatever other dimensions you want. (Note that I converted x back to a datetime in the sample workbook.)

Then fiddle with the scaling parameters and the size shelf slider and drag the axes around until everything lines up nicely. This is definitely a bit clunky.

5) The polygon approach requires seven points for each hexagon: one for each corner, with the last point repeating the first to form a closed polygon.

This is achieved by taking the cross-product of the table containing the data to be analysed with a table which just has a single column, representing the vertices of a hexagon, numbered from 0 to 6.

This results in 7 copies of the data from the original data source, which can bloat the workbook rather. The attached example went up from 230,000 to 1.6 million rows.

Finally calculate the positions of the corners of the hexagons: [hex_bin_x_vertex] and [hex_bin_y_vertex] in the workbook and draw the polygons, using [vertex] on the path shelf to define the drawing order.
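As a rough illustration of step 5 (again, not the workbook's actual formulas), the Python sketch below generates the seven path points for one hexagon and then expands a list of bins into one row per vertex, which is what the cross-product with the 0 to 6 vertex table achieves. It assumes pointy-top hexagons with a side length of 1 in scaled units. The names `hex_vertices` and `expand_to_polygon_rows`, and the `x_scale` and `y_scale` arguments for converting back to data units, are hypothetical.

```python
import math


def hex_vertices(cx, cy, x_scale=1.0, y_scale=1.0):
    """Seven (x, y) path points for the hexagon centred at (cx, cy).

    cx and cy are in scaled units (hexagon side = 1). Vertex 6 repeats
    vertex 0 so that the polygon is closed.
    """
    points = []
    for vertex in range(7):
        # Corners sit at 30, 90, 150, ... degrees for a pointy-top hexagon.
        angle = math.pi / 6 + (vertex % 6) * math.pi / 3
        points.append(((cx + math.cos(angle)) * x_scale,
                       (cy + math.sin(angle)) * y_scale))
    return points


def expand_to_polygon_rows(bin_centres):
    """Cross-product of bin centres with the vertex numbers 0..6."""
    rows = []
    for cx, cy in bin_centres:
        for vertex, (vx, vy) in enumerate(hex_vertices(cx, cy)):
            rows.append((cx, cy, vertex, vx, vy))
    return rows
```

The `vertex` column plays the role of [vertex] on the path shelf. Note that this sketch expands one row per bin, whereas the approach described above crosses the vertex table with every underlying data row, which is where the sevenfold growth (230,000 rows becoming about 1.6 million) comes from.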

The workbook also has a few other parameters and calculated fields which I haven't mentioned yet. [X Round] and [Y Round] were for trying to figure out what was going on with a rounding issue which was stopping filters from working properly and [hex_bin_x_int] and [hex_bin_y_int] were what I ended up with to solve that issue. Those are there to support the filter action which allows drill down from selected hex bins to a scatter chart showing the underlying data points.
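The post doesn't show how [hex_bin_x_int] and [hex_bin_y_int] are calculated, so the following Python sketch is only a guess at the general idea: floating-point bin centres that should be identical can differ in their last digits depending on how they were computed, which breaks equality-based filters, whereas small integer keys compare exactly. The function name `hex_bin_int` and the half-cell numbering are assumptions, and the geometry (side length 1, pointy-top hexagons) is assumed as well.

```python
import math

# Assumed geometry: side length 1 in scaled units, pointy-top hexagons.
W = math.sqrt(3.0)  # x spacing between centres within one grid
H = 3.0             # y spacing between centres within one grid


def hex_bin_int(cx, cy):
    """Map a bin centre to integer keys counted in half-cell steps.

    Grid-1 centres get even keys and grid-2 centres get odd keys, so the
    pair of integers identifies the bin without any float comparison.
    """
    return (round(2 * cx / W), round(2 * cy / H))
```

With keys like these, the same bin reached via `1.5 * W` or via `0.5 * W + W` maps to the same integer pair, so a filter action on the keys matches regardless of tiny rounding differences in the centre coordinates.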

Here are a few samples with different settings for the scaling parameters and different choices of colour scheme, step settings, etc.

Finally an aside on that comment at the start about how hard it is to copy all the calculated fields across to another workbook and get it all working. Here's what I got when I tried to copy all the calculations into a simpler data source:

The reason for all the red exclamation marks is that all the references between the calculations have turned into references to the internal names in the source sheet, which don’t exist in the target sheet, so the calculations now look like this:

This behaviour really inhibits the ability to re-use calculations effectively. Some method of sharing (or copying) calculations between workbooks has long been right up there at the top of my personal Tableau wish list. The ideal for me would be some form of User Defined Functions (UDFs) - but even fixing this issue so that you can copy and paste between workbooks would be a really big win. Version 9.0 maybe?

Anyway, I'd be interested in feedback from anyone brave enough to try to pick their way through all of this and actually put hexbins to work. Post some examples if you get it going.

Enjoy.

• ###### 19. Re: Hexbins for visualizing million+ data points

Hey Richard, just found this viz via Alberto Cairo in the LA Times: http://graphics.latimes.com/how-fast-is-lafd/#12/34.1358/-118.4010

You're cutting edge!

--Shawn

• ###### 20. Re: Hexbins for visualizing million+ data points

As a thank you to Richard for all his hard work on this (and letting me play along behind the scenes) I've attached a hexagon shape that should display a bit better than Richard's original. Couple of notes:

I always start by searching Google Images. To find this one I searched on "hexagon shape png". The best one (size, anti-aliasing, least noise, etc.) turned out to be a hollow one. So to get it to work in Richard's viz I brought it into a graphics program (Painter 12) and filled the inside with the same black (you've got to love the eyedropper!).

Then when I saved it back out the alpha channel is only outside the hexagon. FYI: the Paint program that comes with Windows doesn't handle anti-aliasing all that well, and if you use it for this you'll be doing a lot of pixel-painting inside the hexagon.

--Shawn

• ###### 21. Re: Hexbins for visualizing million+ data points

Thanks for posting that Shawn. The only trouble is, it needs to be the other way up to work with my calculations - so we either need to rotate the image by 90 degrees or transpose all the x's and y's in the calculations.

As an aside - do you know any good primer on all this anti-aliasing, alpha-channel, transparency stuff? I sort-of vaguely understand what anti-aliasing is about - but when it actually came to trying to make a shape I realised I didn't understand the detail - so I just bumbled my way to the point that the surrounding background was transparent (painting each pixel by hand).

• ###### 22. Re: Hexbins for visualizing million+ data points

Well, rotating it was easy in Paint.NET, so I've attached the rotated version. Here's a close-up of the view using this shape - much cleaner edges - though the alignment of the hexagons doesn't look quite right (alternate gaps are bigger and smaller) - I wonder if the shape is completely symmetrical and scaled correctly. I was very careful in my hand-drawn one to get the shape centred and the correct proportions - but it's much harder to see exactly with more pixels.

Message was edited by: Richard Leeke Put the missing image back in again...

• ###### 23. Re: Hexbins for visualizing million+ data points

Yeah, I see what you're talking about with the asymmetry. This is probably one of those times when you get what you pay for (free). So later today I'll take the time to build one from scratch for you, and make sure it's in the right orientation. Also I'll look around for a primer (or write one up myself).

Cheers,

--Shawn

• ###### 24. Re: Hexbins for visualizing million+ data points

Thanks in advance for both of those things.

If you do make one from scratch, make sure you get the dimensions spot on. If the length of a side of the hexagon is 1 unit, the distance from tip to tip is exactly 2 units and the distance between the parallel sides is the square root of 3 (1.732...).
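Those proportions are easy to turn into pixel sizes when drawing the shape. The short Python sketch below is just the arithmetic from the paragraph above; the function name `hexagon_dimensions` is made up for the illustration.

```python
import math


def hexagon_dimensions(side):
    """Return (tip_to_tip, flat_to_flat) for a regular hexagon."""
    tip_to_tip = 2.0 * side
    flat_to_flat = math.sqrt(3.0) * side
    return tip_to_tip, flat_to_flat


# Example: a hexagon drawn 200 pixels from tip to tip has a side of 100
# pixels, so the parallel sides should be about 173 pixels apart.
tip, flat = hexagon_dimensions(100)
```

Because the square root of 3 is irrational, the flat-to-flat distance can never be a whole number of pixels, so a larger image keeps the rounding error proportionally smaller.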

• ###### 25. Re: Hexbins for visualizing million+ data points

Did not see this topic covered in the Ideas section, so I added a post linking to this thread:

http://community.tableau.com/ideas/2254

• ###### 26. Re: Hexbins for visualizing million+ data points

As the row limit on Tableau Public has been raised to 1 million, I thought I'd try that out by publishing (a slice of) the workbook I used in this thread.

I couldn't get it to display as an embedded viz or accept a "viz link", but here's a link to it on Public. Not really sure what the difference between "Insert Link" and "Insert Viz Link" is - it seems to try to validate the viz link. Maybe the format of links on Public has changed.

Edit: Just noticed that the link I inserted above does something really weird and then redirects back to a new window with another copy of this thread. Let's try just embedding the full URL: Workbook: Hex Bins. Well the forums did something different with that - let's see if that works...

... Nope. Any idea what's happening, Dustin Smith?

• ###### 27. Re: Hexbins for visualizing million+ data points

Hi Richard,

The links you posted to the Hex Bins workbook on Tableau Public are working fine for me. The experience you're seeing might be the light-boxing effect, where Tableau Public picks up on where the views have been embedded. This means that clicking on a part of the page that isn't the viz will route you to that embed page (in this case, this thread).

Going to try and embed the Hex Bin viz live in this post below this text:

Edit: The viz is showing up for me.  Richard are you able to see it embedded live in the thread as well?

• ###### 28. Re: Hexbins for visualizing million+ data points

You obviously have the magic touch, Dustin.

When I tried embedding I just got a completely empty space except for the "Learn About Tableau" text at the bottom.

The links I inserted are also working - though it pops up a funny shaped modal window that doesn't really fit the viz. Last night it was popping that up, starting to draw the viz and then redirecting to a new browser window with another copy of this thread. I'm sure I didn't click outside the window...

• ###### 29. Re: Hexbins for visualizing million+ data points

Hi All,

In Tableau 9 there are now native hexbin functions that make this kind of visualisation much easier. See the following posts on my blog:

I’m Too Hexy For This Viz… | The Last Data Bender

I’m a Hex Machine… | The Last Data Bender

Cheers,

Alan
