-
1. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?
Don Wise Nov 11, 2018 8:35 AM (in response to Suvojit Basu)
Hello Suvojit,
The trend lines differ because the date pill is used as Discrete in one view and as Continuous in the other. A Discrete pill produces headers, one for each distinct value, whereas a Continuous pill produces an axis that spans a continuous range of values.
A good explanation can be found on the InterWorks site: https://interworks.com/blog/ccapitula/2014/10/28/tableau-essentials-chart-types-line-charts-continuous-discrete or here: https://www.interworks.com/blog/mtreadwell/2014/02/19/tableau-pills-continuous-and-discrete-data-roles
Ultimately, you'll need to decide which of the two fits your particular business case, for example whether your data has seasonal spikes and you need to determine whether those spikes affect the trend.
Hope this helps answer your question. If yes, please mark this response as correct. Thx, Don
-
2. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?
Suvojit Basu Nov 11, 2018 9:05 AM (in response to Don Wise)
Hi Don,
Thanks very much for the prompt response. I understand the difference between discrete and continuous dates, and I agree that the choice depends on the specific business case. However, suppose a question about a trend line, specifically its p-value and R-squared value, is asked in a test or interview scenario (a Tableau certification exam, for example). Will the discrete date or the continuous date provide the correct answer?
Thanks.
Suvojit
-
3. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?
Don Wise Nov 11, 2018 9:36 AM (in response to Suvojit Basu)
Hi Suvojit,
As excerpted from "What is the relationship between R-squared and p-value in a regression?":
You'll need to check both values to see whether the model suits your scenario. The R-squared value tells you how much of the variation is explained by your model, so an R-squared of 0.1 means that your model explains 10% of the variation within the data. The higher the R-squared, the more of the variation the model explains.
The p-value, by contrast, comes from the F-test of the null hypothesis that "the fit of the intercept-only model and your model are equal". If the p-value is less than the significance level (usually 0.05), you reject that hypothesis: your model fits the data significantly better than the intercept-only model.
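To make the link between the two numbers concrete, here is a minimal sketch, assuming a straight-line trend with a single predictor (the date). In that case the F statistic depends only on R-squared and the number of points, F = R^2 * (n - 2) / (1 - R^2). The function name and the example inputs are illustrative only, and this is not Tableau's own code.

```python
def f_statistic(r_squared, n_points):
    """F statistic for a one-predictor linear trend vs. the intercept-only model.

    Degrees of freedom are (1, n_points - 2).
    """
    if n_points < 3:
        raise ValueError("need at least 3 points for a testable linear trend")
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return r_squared * (n_points - 2) / (1.0 - r_squared)


# High R-squared, only 3 points: F is about 9, far below the 5% critical
# value of about 161.4 for (1, 1) degrees of freedom, so not significant.
f_few_points = f_statistic(0.9, 3)

# Low R-squared, 30 points: F is about 7, above the 5% critical value of
# about 4.20 for (1, 28) degrees of freedom, so significant.
f_many_points = f_statistic(0.2, 30)

print(f_few_points, f_many_points)
```

The same R-squared can therefore be significant or not depending on how many marks the trend line is fitted to.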
Thus you have four scenarios:
1) low R-squared and low p-value (p-value <= 0.05): your model doesn't explain much of the variation in the data, but it is significant (better than not having a model)
2) low R-squared and high p-value (p-value > 0.05): your model doesn't explain much of the variation in the data and it is not significant (worst scenario)
3) high R-squared and low p-value: your model explains a lot of the variation in the data and is significant (best scenario)
4) high R-squared and high p-value: your model explains a lot of the variation in the data but is not significant, so the fit cannot be relied on (this typically happens when there are very few data points)
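The four scenarios can be written down as a small lookup. This is only a sketch: the 0.5 cutoff for a "high" R-squared is an arbitrary assumption (the excerpt does not give one), and `classify_trend` is a hypothetical helper name.

```python
def classify_trend(r_squared, p_value, r2_cutoff=0.5, alpha=0.05):
    """Return the scenario number (1-4) from the list above.

    r2_cutoff is an assumed threshold for "high" R-squared; pick one that
    makes sense for your own data.
    """
    high_r2 = r_squared >= r2_cutoff
    significant = p_value <= alpha
    if significant:
        return 3 if high_r2 else 1
    return 4 if high_r2 else 2


print(classify_trend(0.10, 0.01))  # scenario 1: low R-squared, significant
print(classify_trend(0.85, 0.30))  # scenario 4: high R-squared, not significant
```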
What do your trend line values tell you?
Hope that helps! Thx, Don
-
4. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?
Suvojit Basu Nov 11, 2018 11:22 AM (in response to Don Wise)
Hi Don,
I couldn’t have got a better explanation. Thanks very much for taking the time for such a detailed response. Sincerely appreciate it.
Best,
Suvojit
-
5. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?
Don Wise Nov 11, 2018 11:23 AM (in response to Suvojit Basu)
Glad to have helped. Please close this thread by marking it as helpful or correct so that others may find it useful in the future. Thx, Don
-
6. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?
Dan Cory Nov 12, 2018 5:10 PM (in response to Suvojit Basu)
Suvojit -
The discrete (date part) version treats the months as equally spaced, while the continuous (date trunc) version treats the months as spaced according to the calendar. Since months aren't quite equal lengths, the two versions fit slightly different trend lines, which is why the R-squared and p-values differ slightly. The correct answer would really depend on the data you were analyzing.
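A rough way to see the effect outside Tableau is to fit the same values against the two spacings. The sketch below is an illustration only, not Tableau's implementation: the data is made up (a measure that grows by exactly 2 units per calendar day, sampled on the first of each month of 2018), and `r_squared` is a hand-written helper for a one-predictor least-squares line.

```python
from datetime import date


def r_squared(xs, ys):
    """R-squared of an ordinary least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data: exactly linear in calendar time.
month_starts = [date(2018, m, 1) for m in range(1, 13)]
days_elapsed = [(d - month_starts[0]).days for d in month_starts]
values = [100 + 2 * d for d in days_elapsed]

# "Discrete" spacing: months as 1, 2, ..., 12 (equally spaced).
month_index = list(range(1, 13))
r2_equal_spacing = r_squared(month_index, values)

# "Continuous" spacing: months placed by their real calendar position.
r2_calendar_spacing = r_squared(days_elapsed, values)

print("equal spacing:   ", r2_equal_spacing)
print("calendar spacing:", r2_calendar_spacing)
```

Because this made-up measure is linear in calendar days, the calendar-spaced fit is essentially perfect while the equally spaced fit is slightly worse. With data that is naturally per-month (a monthly total, say), the equally spaced version could be the more sensible one, which is the sense in which the answer depends on the data.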
Dan
-
7. Re: Trend line R-squared and P-values different when date is discrete vs. continuous. Which is correct?
Suvojit Basu Nov 23, 2018 6:42 AM (in response to Dan Cory)
Thanks Dan... I get it now. Appreciate your help!