I have been looking for the same thing, but other than simple regression, Tableau does not provide any other statistics. You could probably work these out with complex SQL calculations, but nothing about that is easy.
Another option is custom table calculations. I believe most of the calculations you are looking for can be done in Tableau once you understand how they are performed. The downside is that with over 100,000 rows, table calculations may be slow to process the more advanced statistics.
Another option that may work is to use a database as your data source that has those functions built in, or that lets you add user-defined functions, and use the RAWSQL() pass-through functions to expose them to Tableau.
Tableau does not yet have any built-in multi-variate summary statistics or tests.
Other than Pearson's correlation coefficient and a chi-squared test for independence of two categorical variables,
which statistics features would you like to see in Tableau first?
You can compute the Pearson's correlation coefficient in a calculated field like this:
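WINDOW_SUM((SUM([x])-WINDOW_AVG(SUM([x]))) * (SUM([y])-WINDOW_AVG(SUM([y])))) /
( SQRT(WINDOW_SUM(SQUARE(SUM([x])-WINDOW_AVG(SUM([x]))))) * SQRT(WINDOW_SUM(SQUARE(SUM([y])-WINDOW_AVG(SUM([y]))))) )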
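If you want to check the table calculation against an independent computation, here is a minimal Python sketch of Pearson's r. The function name pearson_r is made up for illustration (it is not a Tableau function), and it assumes two plain lists of numbers of equal length.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of the products of the deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: product of the square roots of the summed squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / (dev_x * dev_y)
```

For example, pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]) gives 0.8, which is the value the Tableau calculated field should show for the same five marks.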
I would really love to see multivariate analysis included. Any chance it's coming out with 7.0 or are we going to need to wait a while?
Well, there are many kinds of multivariate analysis in Tableau (e.g. regression analysis), many of which have improved in 7.0.
Are you asking about simple multi-variate summary statistics and hypothesis tests like the ones below?
This kind of thing currently requires table calcs like the one I pasted above.
Pearson's linear correlation coefficient
Spearman's rank correlation coefficient
t-test or Welch's test for a difference in means or proportions
Chi-sq test for independence of two categorical variables
Chi-sq test for a specified proportion of values of a categorical variable
Chi-sq test for homogeneity of two samples of a categorical variable
One-way ANOVA test of homogeneity of several means
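To make two of the tests in the list above concrete, here is a rough Python sketch of the test statistics for Welch's test and for the chi-squared test of independence. The function names welch_t and chi2_independence are hypothetical, and the sketch only returns the statistic and degrees of freedom; turning those into a p-value needs the t or chi-squared distribution, which is not included here.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Sample variance of each group divided by its size.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def chi2_independence(table):
    """Chi-squared statistic and degrees of freedom for an r x c table of counts.

    Assumes every row and every column has a nonzero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df
```

The same arithmetic could in principle be rebuilt as table calculations, but that is a separate exercise; this is only meant to show what each statistic computes.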
Thanks for this post. I have 2 variables (X & Y) in a dataset of 30,000+ records plotted on a scatter plot.
Using the trend line I can identify whether the association between X & Y is positive or negative, but I am unable to show the actual correlation value on the same scatter plot.
Is this possible? Thanks for any help.
The formula helps a lot. Would you mind posting the formulas for the other statistics you mentioned?
Thanks for that, Scott. Would you say that the manual Pearson's linear correlation coefficient calculation for 2 variables would need to be repeated for every pair if, say, I had 10 variables whose correlations I needed to explore? Any thoughts on how I might produce a 10x10 cross-correlation matrix in Tableau 7 or 8?