Covariance, Trend Lines, Correlation Coefficient R and R-Squared

Version 8


    These calculations provide the basic correlation values in calculated fields. These include

    • covariance
    • slope of trend line, using on cov(x,y) / var(x)
    • Pearson's correlation coefficient, "R", which is cov(x,y) / (stdev(x) *stdev(y)
    • R-Squared


    Tableau returns these values when you add a trend line and then select Describe Trend Model. While handy for ad-hoc analysis, this functionality has a few drawbacks:

    1. You can't display the values in the view using, for example, a tooltip or title.
    2. You can't use the values inside another calculated field.
    3. When data is segmented by a categorical variable, Tableau treats this dimension as part of the trend model. In some case you may want to calculate the values independently for each segment.


    Showing Trend Line and R-Squared in Title


    Calculating Trend Line and R-Squared for Each Department



    There are several different ways we can do these calculations in Tableau:


    * Using COVAR() and CORR() - These regular aggregate functions were introduced in version 10.2, they may not be available for all data sources. COVAR() and CORR() can also be embedded in Level of Detail (LOD) expressions.

    * Using WINDOW_COVAR() and WINDOW_CORR() - These table calculations were introduced in 10.2 and are available in all data sources.

    * Using table calculations such as WINDOW_AVG() and WINDOW_STDEV() as primitives to create a formula that computes the results. This is necessary for versions prior to Tableau v10.2.

    * Using Tableau's R integration.

    * Using Tableau's Python integration - This was first available in version 10.1 and is not covered here.



    Example Calculation:


    Covariance = cov(x, y): 

    See Covariance - Wikipedia, the free encyclopedia



    Covariance as a regular aggregate (requires Tableau v10.2 or higher):

    //cov(profit, sales) as a regular aggregate



    Covariance as a table calculation (requires Tableau v10.2 or higher):

    //cov(profit, sales); set addressing / compute using to correct dimension



    Covariance built from table calculation primitives:

    //cov(profit, sales); set addressing / compute using to correct dimension



        (SUM([Profit]) - WINDOW_AVG(SUM([Profit]))) * 

        (SUM([Sales])  - WINDOW_AVG(SUM([Sales])))



    Trend Line Slope = cov(x,y) / var(x)



    Slope as a regular aggregate:

    //slope of trend line = cov(x,y) / var(x)

    COVAR([Profit]), [Sales]) / VAR([Profit])


    Slope as a table calculation requires Tableau v10.2 or higher):

    //slope of trend line = cov(x,y) / var(x)

    WINDOW_COVAR(SUM([Profit]), SUM([Sales])) / WINDOW_VAR(SUM([Profit]))


    Slope built from table calculation primitives:

    //slope of trend line = cov(x,y) / var(x)

    [cov(profit, sales)] / WINDOW_VAR(SUM([Profit]))


    Trend Line Intercept = mean(y) - slope(x, y) * mean(x)

    Same link as above:



    Intercept as a regular aggregate:

    //intercept of trend line = mean(y) - slope*mean(x)

    AVG(SUM([Sales]) - COVAR([Profit]), [Sales]) / VAR([Profit]) * AVG([Profit])


    Intercept as a table calculation:

    //intercept of trend line = mean(y) - slope*mean(x)

    WINDOW_AVG(SUM([Sales])) - WINDOW_COVAR(SUM([Profit]), SUM([Sales])) / WINDOW_VAR(SUM([Profit])) * WINDOW_AVG(SUM([Profit]))


    Intercept built from table calculation primitives:

    //intercept of trend line = mean(y) - slope*mean(x)

    WINDOW_AVG(SUM([Sales])) - [trend slope (profit, sales)] * WINDOW_AVG(SUM([Profit]))


    Pearson's correlation coefficient, "R" = cov(x, y) / (stdev(x)*stdev(y))




    R as regular aggregate:

    CORR([Profit], [Sales])


    R as a table calculation:



    R from table calculation primitives:


    [cov(profit, sales)] /






    Partitioning and Addressing:

    Set the addressing (compute using) to the dimensions that are driving the visible detail. In the example above, after adding the calculation to the view, right-click the pill and set Compute Using to Category.



    Related Functions:

    If you're running R, you can use the following formula to call R's cov and cor functions from Tableau:

    SCRIPT_REAL("cov(.arg1, .arg2)", SUM([Sales]), SUM([Profit]))

    SCRIPT_REAL("cor(.arg1, .arg2)", SUM([Sales]), SUM([Profit]))


    Using Dates and DateTimes for x/y Values:

    If you're using dates or datetimes as the x or y (for example computing the slope of a trend over time) and a number as the other value then you'll need to convert the date field to a usable number. There are three parts to this:


    a) If the date is a dimension and the other field is a measure then you'll need to aggregate the date to also use it as a measure, for example using the MIN() or ATTR() aggregation.

    b) Make sure the date level and date part/date value (e.g. year, month & year, year/month/day/hour/minute, etc.) used in the calculation is the same as what is used in the view. Here's a link to an example from the comments below: Apr 29, 2017 12:23 PM.

    c) Convert the date or datetime into a number, use INT() for dates and FLOAT() for datetimes.


    For example the covariance calculation for x = a date dimension using a minute date value (so truncated to the minute) and y = myValue could be something like:


    WINDOW_COVAR(FLOAT(DATETRUNC('minute',MIN([myDate]))), SUM([myValue]))




    Related Documentation:



    • v4: 2015-09-08 -- rewrote calcs using covariance, added trend line calcs. Jim Wahl
    • 2016-09-1 -- added workbook rSquared_v4.twbx for Tableau v9.0.
    • 2017-03-21 -- updated for Tableau v10.2's new Correlation and Covariance functions and added rSquared v5 for 10.2.twbx Jonathan Drummey
    • 2017-04-29 -- added section on using Dates & DateTimes plus cleaned up some formatting.