8 Replies Latest reply on Feb 28, 2013 10:47 AM by elias.fayad

    relational statistics

    Tim Harris

      I am new to Tableau, a big fan, and trying to understand the capabilities better – particularly around relational statistics. 

       

      I see through the Forum that it will provide results for a two-variable regression, and it appears the correlation can then be calculated by hand using those results. Is there any way to automatically display selected results from the regression, or to automatically generate the correlation coefficient?

       

      Does Tableau provide other multi-variable relational statistics – e.g., a chi-squared test for dimensions?

       

      Appreciate any guidance here – or direction to the right place on the Tableau site.

        • 1. Re: relational statistics
          guest contributor

          I have been looking for the same thing, but apart from simple regression, Tableau does not provide any other statistics. You can probably work them out with complex SQL calculations, but there is nothing easy.

          • 2. Re: relational statistics
            Joe Mako

            Another option is custom table calculations. I believe most of the calculations you are looking for can be built in Tableau, given an understanding of how they are performed. The downside is that with over 100,000 rows, table calculations this advanced may be slow to process.

             

            Another option that may work is to use a database as your data source that has those functions built in, or that lets you add user-defined functions, and then use the RAWSQL() pass-through functions to expose them to Tableau.

            • 3. Re: relational statistics
              Scott Tennican

              Tableau does not yet have any built-in multi-variate summary statistics or tests.

              Other than Pearson's correlation coefficient and a chi-squared test for independence of two categorical variables, which statistics features would you like to see in Tableau first?

               

              You can compute the Pearson's correlation coefficient in a calculated field like this:

              WINDOW_SUM(
                  (SUM([x]) - WINDOW_AVG(SUM([x]))) *
                  (SUM([y]) - WINDOW_AVG(SUM([y]))) )
              /
              (   SQRT(WINDOW_SUM(SQUARE(SUM([x]) - WINDOW_AVG(SUM([x]))))) *
                  SQRT(WINDOW_SUM(SQUARE(SUM([y]) - WINDOW_AVG(SUM([y]))))) )
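
              To sanity-check the table calculation's output on a small sample, the same arithmetic can be reproduced outside Tableau. The following is only a sketch in Python; the function name `pearson_r` and the sample values are made up for illustration, and it assumes each list holds one value per mark in the view.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r, written term by term like the table calculation above."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    # These two means play the role of WINDOW_AVG(SUM([x])) and WINDOW_AVG(SUM([y])).
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Deviations of each point from its mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Numerator: sum of the products of the deviations (the first WINDOW_SUM).
    numerator = sum(a * b for a, b in zip(dx, dy))

    # Denominator: product of the square roots of the summed squared deviations.
    denominator = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return numerator / denominator


# Illustrative data only; compare the result with what the calculated field shows.
print(pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

              If the calculated field and the sketch disagree on the same points, the usual suspect is the table calculation's addressing (which marks are inside the window), not the formula itself.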

              • 4. Re: relational statistics

                I would really love to see multivariate analysis included. Any chance it's coming out with 7.0 or are we going to need to wait a while?

                • 5. Re: relational statistics
                  Scott Tennican

                  Well, there are many kinds of multivariate analysis (e.g. regression analysis) in Tableau, many of which have improved in 7.0.

                  Are you asking about simple multi-variate summary statistics and hypothesis tests like the ones below?

                  This kind of thing currently requires table calcs like the one I pasted above.

                   

                  Pearson's linear correlation coefficient

                  Spearman's rank correlation coefficient

                  t-test or a Welch’s test for difference in means or proportions

                  Chi-sq test for independence of two categorical variables

                  Chi-sq test for a specified proportion of values of a categorical variable

                  Chi-sq test for homogeneity of two samples of a categorical variable

                  One-way ANOVA test of homogeneity of several means
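
                  As an illustration of what a table calc would have to reproduce for the chi-squared test for independence in the list above, here is a minimal Python sketch of the arithmetic. It is not a Tableau feature: the function name `chi_squared_independence` and the sample counts are made up, and it assumes every row and column total is non-zero. It returns only the statistic and the degrees of freedom, not a p-value.

```python
def chi_squared_independence(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts.

    `table` is a list of rows, one row per level of the first categorical
    variable and one column per level of the second.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Illustrative 2x2 table of counts: the statistic is 20/3 (about 6.67) with 1 degree of freedom.
print(chi_squared_independence([[10, 20], [20, 10]]))
```

                  The expected counts come from the row and column totals, which is the part that needs window-style totals across the whole table in a table calculation.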

                  • 6. Re: relational statistics
                    Hunain Kochra

                    Hello Scott,

                     

                    Thanks for this post. I have 2 variables (X & Y) in a dataset of 30,000+ records plotted on a scatter plot.

                     

                    Using the trend line I can identify whether the association between X & Y is positive or negative, but I am unable to show the actual correlation value on the same scatter plot.

                     

                    Is this possible? Thanks for any help.

                     

                    Hunain

                    • 7. Re: relational statistics
                      Xiaodong Han

                      Scott:

                       

                      The formula helps a lot. Would you mind posting the formulas for the other statistics you mentioned?

                       

                      Thanks,

                       

                      Xiaodong

                      • 8. Re: relational statistics
                        elias.fayad

                        Thanks for that, Scott. Would you say that the Pearson's linear correlation coefficient calculation for 2 variables would need to be applied manually to every pair if, say, I had 10 variables whose correlations I needed to explore? Any thoughts on how I might produce a 10x10 cross-correlation matrix in Tableau 7 or 8?
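
                        For reference, the matrix being asked about is the pairwise Pearson coefficient for every pair of variables. The sketch below shows that shape in Python, outside Tableau; it does not answer how to build it in Tableau 7 or 8. The names `pearson_r` and `correlation_matrix` and the three sample columns are made up for illustration, and the same loop would cover 10 variables.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient for two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    return numerator / denominator


def correlation_matrix(columns):
    """Map each pair of column names to its correlation coefficient.

    `columns` is a dict of name -> list of values, all lists the same length
    and none of them constant.
    """
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Illustrative data only: three variables instead of ten.
data = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [4, 3, 2, 1],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(value, 3) for value in row.values()])
```

                        The matrix is symmetric with ones on the diagonal, so only the 45 pairs above the diagonal are distinct for 10 variables.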