13 Replies Latest reply on Dec 18, 2011 1:52 PM by Richard Leeke

Showing point density on a scatterplot

Hi everyone, I've got what seems like a basic question, but I can't find information about it anywhere!

Let's say I have the following points plotted on a scatterplot:

(1,1)

(1,2)

(1,2)

(1,2)

(3,2)

(3,3)

(3,3)

(3,3)

(3,3)

(3,3)

(3,3)

(3,3)

(3,3)

When I ask Tableau to plot this (by using two measures), it puts one dot everywhere that there is a point.  So, there is one dot at (3,3) even though there are multiple records existing at this point.  I'd like to color-code this frequency, so you can look at the scatterplot and see the "hot" spots.

How in the world do I do this?!

PS, I'm a relatively new user, so please be gentle.  :-)

• 1. Re: Showing point density on a scatterplot

Drop [Number of Records] on the colour shelf.  That will then colour each mark according to how many instances of that pair of coordinates exist.  Sometimes it can be effective to drop that on the size shelf instead of (or as well as) the colour shelf.

[Number of Records] is actually just a calculated measure with a value of 1.  The default aggregation for that field is SUM(), so if there are N rows with a particular pair of coordinates the result will N (i.e. the sum of N lots of 1).

• 2. Re: Showing point density on a scatterplot

Thank you so much for you quick response, Richard!

I had tried that already, so I think I'm doing it wrong.  When I drag the [Number of records] to the size shelf, it gives me "1" as the only value.  I've attached a screenshot so you can see what I mean.  I also changed the labels around, allowing test to overlap - it allows you to see that the density exists, but it looks terrible.

So what am I doing wrong?

• 3. Re: Showing point density on a scatterplot

It needs to be SUM([Number of records]) rather than AVG([Number of Records]) on the colour shelf.  As the [Number of Records] field has a value of 1 for every row, the average over all rows is always going to be 1, whereas the sum is going to be the number you want.

Just click the drop-down on the right hand end of the [Number of Records] pill on the size shelf and select SUM() instead of AVG().

If it gave you AVG() when you just dragged [Number of Records] to the colour shelf I think you must have accidently changed the default aggregation for [Number of records].  You can check (and correct) that by right-clicking on [Number of Records] in the data window and selecting Field Properties->Aggregation.

• 4. Re: Showing point density on a scatterplot

I still get 1 as a result if I use SUM.

• 5. Re: Showing point density on a scatterplot

I didn't look at your screenshot carefully enough. The issue is that you have batch # on the view, so the SUM() is telling you how many instances there are for each batch. If you take batch# off it should ben right.

If you do want to include batch# in the view you can still do it - but it's too hard to explain with one finger on a phone!

• 6. Re: Showing point density on a scatterplot

You will need to remove the AVG () functions from rows and columns, too. I think.

• 7. Re: Showing point density on a scatterplot

Okay, I've removed everything you've suggested, but now I don't have any data!  "Batch #" was my only dimension on this graph, so when I pulled it off, all my data points disappeared.  I changed the columns and rows back to SUM (even though I really don't want my data added up in any way - why does Tableau default to this?)

I've attached this Tableau workbook this time, if that would be helpful.  And I wanted to say THANK YOU SO MUCH for helping me with this issues - I'm pretty thoroughly confused!

• 8. Re: Showing point density on a scatterplot

Okay, I unchecked "Aggregate Measures" in the analysis menu and it gave me all my data points back.  I don't understand what "Aggregate Measures" ARE, but that's probably a different issue.

Unfortunately, I'm still getting 1 as the Number of Records.

• 9. Re: Showing point density on a scatterplot

Here's another screenshot.

• 10. Re: Showing point density on a scatterplot

Unfortunately I think you forgot to attach your workbook to that earlier posting - that would really make it easier.  Screenshots help, but I'm still guessing a bit as to what your data structure is.  This thread is actually a great example of why attaching a sample workbook is so useful (not that it would have helped when I was on my phone).

If you do attach it, don't forget to save it as a packaged workbook (*.twbx) - otherwise it won't contain any data so won't be any use.

As I'm on a PC rather than a phone now I'll try to explain a bit more about what is going on here,  I've also put an example together using the Tableau Coffeee Sales sample data which hopefully makes it clearer.

The reason your last attempt is showing all 1s is that when you unchecked "Aggregate Measures", that told Tableau that you didn't want to calculate aggregates (SUM(), AVG() etc) for any of the measures on the view, so it is showing all individual values returned from the database.

That wasn't quite what I was suggesting you do in that last posting - I meant you to click on the individual pills on the rows and columns shelf and set those pills to be dimensions.  Having looked at your screenshots again, I'm now not quite sure if that was what you will want anyway - it really depends on your data and what you're trying to achieve.  I'll try to explain it with the Coffee Sales data.

The Coffee Sales data has 2 years worth of monthly sales data by Product, and State.  I've plotted various views of Profit against Sales, either broken down by Product, or overall.  There are over 4,000 rows of data.  Think of [Sales] as [Mash Level (Inches)], [Profit] as [OG (Plato)] and [Product] as [Batch #].  I'm not sure if that's a good match against your data because I don't know how many records you have per batch, but I think this explains some concepts.

I've created six sheets showing what you started with, what you have now, what I was trying to make sure you avoided and three possibilities of what you might want, but without knowing your data and what you are trying to achieve I can't be quite sure which.  I'll try to explain each sheet briefly.

[Your original]: This sheet has one mark per [Product], showing the overall average [Profit] and average [Sales] - i.e. averaged over all 24 months and however many States.  The mark colour is set to AVG([Number of Records]) which is always 1, because [Number of Records] is always 1.  The average of 1 is always 1.

[Everything disaggregated]: This is what you have now.  This has told Tableau not to aggregate anything, just plot a mark for each row.  Not what you want at all.

[All aggregated, no dimensions]: This is just the overall averages over all products.  is what you would have got if you'd taken [Batch #] off the Level of Detail shelf in your original - which is why I said you didn't want averages on the rows and columns shelves.

[By Product, Sales and Profit]: This has one mark for each distinct value of [Sales] and [Profit] for each [Product].  Right click on the obvious red dot and select View Data and you'll see there are 6 months for which Caffe Latte Nevada had identical [Sales] and [Profit] results.  This gives some immediate insight into the process used for fabricating the Coffee Sales data.  ;-)

This would show you how many time each pair of values occurred within each batch.  I doubt that's what you want, but it could be.

[By Sales and Profit over all Products]: Taking [Product] off the Level of Detail shelf we now see that Decaf Espresso in Iowa and Green Tea in New Hampshire also had identical Sales and Profit figures for several months.

This would show you what I suspect you really want: how many times each distinct pair of values occurred over all records in all batches.

[All aggregated by Product]: The final view just shows the original view of average [Profit] against average [Sales] for each [Product], but this time showing how many records there were for each [Product].

I don't think that's what you're asking for.

• 11. Re: Showing point density on a scatterplot

I got it!!!  Thank you so much!!  As you said, I needed to change the "pills" on the column and row shelves to "dimensions" rather than "measure."  I didn't even know you could do that!

Seriously, thank you so much!  I learned a lot from all the things you've written.  :-)

Happy holidays!!

• 12. Re: Showing point density on a scatterplot

A "packaged workbook" - that's it!  I didn't attach it earlier because I couldn't figure out how to save it in a format you could.  Anyway, here it is.  You were correct in your above analysis - I am in fact trying to show "how many times each distinct pair of values occured over all records in all batches."

I'm going to go try to make sense of this.  Thank you AGAIN for helping - you're a lifesaver!

• 13. Re: Showing point density on a scatterplot

I decided I'd confused enough yesterday so I'd better try to explain properly.

I'll leave you to try to figure it out for yourself from my example, I always find I learn better by trying it for myself. Sing out if you can't get it and I'll post your workbook back with what you want.