2 Replies Latest reply on May 24, 2017 11:10 AM by Marcelo Beccaro

# correlation matrix 10.2

Hi,

I'm trying to do something similar to what is shown in https://www.tableau.com/new-features/10.2#tab-advanced-analytics-4

In my case, I have several topics (subjects) on which students were graded, and I'm trying to correlate this Grade with itself across subjects.

The end result should be something like what Bora Beran did a few years ago in https://boraberan.wordpress.com/2013/12/09/creating-a-correlation-matrix-in-tableau-using-r-or-table-calculations/ (sadly, his workbook cannot be opened in Tableau Public).

Whether the data is tabular or columnar, Tableau only shows values on the main diagonal of an n x n matrix, as expected. Going for domain completion (as in http://drawingwithnumbers.artisart.org/comparing-each-against-each-other-the-no-sql-cross-product/) pre-aggregates the data, so something like corr([Grade (Down)], [Grade (Across)]) would never work.

I don't know how to tell Tableau which "Grade" to use at each cell.

Even with the tabular example, which makes the measurements a little more explicit, I can't find an elegant way to create a pill with corr(measured value from name in X, measured value from name in Y), which could solve the whole matrix. This might even be too 'Excel-like' a way to think.

Any thoughts?

• ###### 1. Re: correlation matrix 10.2

Hi, Marcelo

The data behind your workbook is different from the data behind Bora's.

That's why it doesn't get the correlation of one subject with the others.

ZZ

• ###### 2. Re: correlation matrix 10.2

Hi Zhouyi,

Exactly, Beran's data was originally 32 x 11 values, on which he does a cross join in order to get 1:N data.

What I WAS trying to do was follow in Jonathan Drummey's footsteps (the Drawing with Numbers link above) and use only table calculations, as he himself did at the end of his post with Beran's original data, but now using corr (or window_corr) from 10.2 instead of writing out all the sigmas for the correlation.

Just looking at your screenshot showed me that my approach would not work:

Pearson's formula is made of terms that are available as table calculations (or that can be assembled directly from table calculations)

(WINDOW_SUM(SIZE()*SUM([Measure1]*[Measure2]))-WINDOW_SUM(SUM([Measure2]))*WINDOW_SUM(SUM([Measure1]))) /

(SQRT(((WINDOW_SUM(SIZE()*SUM([Measure1]^2))-WINDOW_SUM(SUM([Measure1]))^2))*(WINDOW_SUM(SIZE()*SUM([Measure2]^2))-WINDOW_SUM(SUM([Measure2]))^2)))

(or something like that)

...while the corr function asks for the measure itself, which would only be available in the view with "Analysis -> Aggregate Measures" turned off, and that would make even less sense to work with in this case.
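For anyone who wants to check the sums-based form of Pearson's formula quoted above outside Tableau, here is a minimal Python sketch. It assumes one value per row for each measure, so the SIZE() and WINDOW_SUM terms reduce to n and plain sums. The function name and the sample grades are made up for illustration; they are not from my workbook.

```python
import math


def pearson_from_sums(xs, ys):
    """Pearson's r in the n*sum(xy) - sum(x)*sum(y) form used above."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant series")
    return numerator / denominator


# Hypothetical grades for two subjects, one entry per student.
math_grades = [60, 70, 80, 90]
physics_grades = [65, 72, 85, 88]
r = pearson_from_sums(math_grades, physics_grades)
print(round(r, 4))
```

Each of the five sums corresponds to one WINDOW_SUM term in the table calculation, which is why that version can work on aggregated marks while corr() cannot.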

So while a self join works easily with corr() (which I now believe was the intended use), domain completion may not be the right fit for it, even though it is gentler on larger tables.
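To illustrate why the self join gives corr() what it needs, here is a rough Python sketch of the same idea. It assumes long-format rows of (student, subject, grade); the names and grades are invented. Joining the rows to themselves on student produces one (down, across) pair of grades per student for every pair of subjects, which is the row-level input a correlation needs in each cell.

```python
import math
from collections import defaultdict
from itertools import product

# Hypothetical long-format data: (student, subject, grade).
rows = [
    ("Ana", "Math", 60), ("Ana", "Physics", 65), ("Ana", "History", 90),
    ("Bo", "Math", 70), ("Bo", "Physics", 72), ("Bo", "History", 80),
    ("Cy", "Math", 80), ("Cy", "Physics", 85), ("Cy", "History", 75),
    ("Di", "Math", 90), ("Di", "Physics", 88), ("Di", "History", 60),
]


def pearson(xs, ys):
    """Pearson's r in its mean-centered form."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(rows):
    # Collect each student's grades by subject.
    by_student = defaultdict(dict)
    for student, subject, grade in rows:
        by_student[student][subject] = grade

    # Self join on student: every (down, across) subject pair per student.
    pairs = defaultdict(list)
    for grades in by_student.values():
        for down, across in product(grades, repeat=2):
            pairs[(down, across)].append((grades[down], grades[across]))

    # One correlation per cell of the n x n matrix.
    return {cell: pearson(*zip(*values)) for cell, values in pairs.items()}


matrix = correlation_matrix(rows)
for (down, across), r in sorted(matrix.items()):
    print(f"{down:8s} x {across:8s} {r:+.3f}")
```

Without the join, each row carries a single subject, so only the diagonal cells ever see two values to compare. The cost is that the joined table grows with the square of the number of subjects, which is the trade-off against domain completion mentioned above.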