7 Replies Latest reply on Nov 23, 2018 6:42 AM by Suvojit Basu

# Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?

Hello,

I have a question about trend lines. The R-squared and p-values for a linear trend line on the same data set change when the date pill is switched from discrete to continuous. Why is that, and which trend line is correct?

Below is an image detailing the question and also attached is a workbook.

• ###### 1. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?

Hello Suvojit,

The trend lines differ because of how the date pill is being used: discrete or continuous. A discrete pill gives you headers for a finite set of distinct values, whereas a continuous pill gives you an axis covering an unbroken range of values.

A good explanation can be found on the InterWorks site: https://interworks.com/blog/ccapitula/2014/10/28/tableau-essentials-chart-types-line-charts-continuous-discrete or here: https://www.interworks.com/blog/mtreadwell/2014/02/19/tableau-pills-continuous-and-discrete-data-roles

Ultimately, you'll need to decide which option fits your particular business case. For example, if your data has seasonal spikes, you may need to determine whether those spikes affect the trend.

Hope this helps answer your question. If yes, please mark this response as correct. Thx, Don

• ###### 2. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?

Hi Don,

Thanks very much for the prompt response. I understand the difference between discrete and continuous dates, and I agree that the choice depends on the specific business case. However, let's say a question on trend lines, specifically on the p-value and R-squared value, is asked in a test/interview scenario (a Tableau certification exam, for example). Will the discrete date or the continuous date provide the correct answer?

Thanks.

Suvojit

• ###### 3. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?

Hi Suvojit,

You'll need to check the values for both to see which one fits your modeling scenario. The R-squared value tells you how much of the variation in the data is explained by your model, so an R-squared of 0.1 means that your model explains 10% of the variation. All else being equal, a higher R-squared indicates a better fit.

The p-value, by contrast, comes from an F-test of the null hypothesis that your model fits the data no better than an intercept-only model. If the p-value is below the significance level (usually 0.05), you reject that hypothesis: your model is a statistically significant improvement over having no model. That is not the same as saying the model fits the data well, which is why you need to look at R-squared too.

Thus you have four scenarios:

1) low R-squared and low p-value (p-value <= 0.05): your model doesn't explain much of the variation in the data, but it is significant (better than not having a model)

2) low R-squared and high p-value (p-value > 0.05): your model doesn't explain much of the variation in the data and is not significant (worst scenario)

3) high R-squared and low p-value: your model explains a lot of the variation in the data and is significant (best scenario)

4) high R-squared and high p-value: your model appears to explain a lot of the variation in the data but is not significant, which typically happens when there are very few data points (the fit cannot be relied on)
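As a rough illustration of scenarios 1 and 4, here is a minimal Python sketch that computes R-squared and the p-value for a straight-line fit. The data sets are made up, the helper names (`r_squared`, `t_two_sided_p`, `trend_stats`) are hypothetical, and the p-value is the standard F-test for a one-predictor linear model with an intercept, which I am assuming is comparable to what the trend line reports.

```python
import math
import random


def r_squared(x, y):
    """Share of the variance in y explained by a straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def t_two_sided_p(t, df):
    """Two-sided p-value for Student's t with integer df (finite series form)."""
    theta = math.atan(abs(t) / math.sqrt(df))
    s, c = math.sin(theta), math.cos(theta)
    total = 0.0
    if df % 2 == 1:
        # Odd df: P(|T| <= t) = 2/pi * (theta + sin * (cos + 2/3 cos^3 + ...))
        term = c
        for k in range(3, df + 1, 2):
            total += term
            term *= (k - 1) / k * c * c
        inside = (2.0 / math.pi) * (theta + s * total)
    else:
        # Even df: P(|T| <= t) = sin * (1 + 1/2 cos^2 + (1*3)/(2*4) cos^4 + ...)
        term = 1.0
        for k in range(2, df + 1, 2):
            total += term
            term *= (k - 1) / k * c * c
        inside = s * total
    return max(0.0, 1.0 - inside)


def trend_stats(x, y):
    """Return (R-squared, p-value) for a linear trend with an intercept."""
    r2 = r_squared(x, y)
    df = len(x) - 2                    # residual degrees of freedom
    f_stat = r2 / (1.0 - r2) * df      # F statistic on (1, df) degrees of freedom
    # With one predictor, F(1, df) is the square of t(df), so the p-values match.
    return r2, t_two_sided_p(math.sqrt(f_stat), df)


# Scenario 1: many noisy points with a weak but real slope (made-up data).
rng = random.Random(42)
x_many = list(range(200))
y_many = [0.08 * xi + rng.gauss(0, 10) for xi in x_many]

# Scenario 4: only three points that happen to sit close to a line (made-up data).
x_few = [1, 2, 3]
y_few = [1.0, 2.6, 3.0]

for label, x, y in (("200 noisy points", x_many, y_many),
                    ("3 points", x_few, y_few)):
    r2, p = trend_stats(x, y)
    print(f"{label}: R-squared = {r2:.3f}, p-value = {p:.4g}")
```

The first data set should come out with a low R-squared but a very small p-value, while the three-point data set has a high R-squared and a p-value well above 0.05, because three points leave only one residual degree of freedom.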

What do your trend line values tell you?

Hope that helps!  Thx, Don

• ###### 4. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?

Hi Don,

I couldn’t have got a better explanation. Thanks very much for taking the time for such a detailed response. Sincerely appreciate it.

Best,

Suvojit

• ###### 5. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?

Glad to have helped. Please close this thread by marking a response as helpful or correct so that others may find it useful in the future. Thx, Don

• ###### 6. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?

Suvojit -

The discrete (date part) version treats the months as equally spaced, while the continuous (date trunc) version treats the months as spaced according to the calendar. Since months aren't quite equal lengths, this leads to slightly different behavior. The correct answer would really depend on the data you were analyzing.
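To make the spacing difference concrete, here is a minimal Python sketch. The `sales` values are invented, and modelling the discrete months as the numbers 1 to 12 and the continuous months as calendar day ordinals is an assumption based on the description above, not a statement about Tableau's internals.

```python
from datetime import date


def r_squared(x, y):
    """Share of the variance in y explained by a straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical monthly values for Jan..Dec 2018 (made up for illustration).
sales = [12, 15, 14, 18, 21, 19, 24, 26, 25, 29, 31, 30]

# Discrete month (date part): month numbers, so every gap is exactly 1.
x_discrete = list(range(1, 13))

# Continuous month (date trunc): first day of each month on a calendar axis,
# so the gaps are 28, 30 or 31 days.
x_continuous = [date(2018, m, 1).toordinal() for m in range(1, 13)]

n = len(sales)
for label, x in (("discrete", x_discrete), ("continuous", x_continuous)):
    r2 = r_squared(x, sales)
    f_stat = r2 / (1.0 - r2) * (n - 2)   # the p-value is derived from this F statistic
    print(f"{label}: R-squared = {r2:.5f}, F = {f_stat:.2f}")
```

The two R-squared values agree to about three decimal places but are not identical, and since the p-value is computed from the F statistic, it shifts slightly as well.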

Dan

• ###### 7. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?

Thanks Dan... I get it now. Appreciate your help!