
1. Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jonathan Drummey Aug 28, 2013 9:28 AM (in response to Jason Scarlett)Hi Jason,
I suspect that part of why your question didn't get answered is that (besides being complicated) you didn't post any sample data, see So Your Question Didn't Get Answered... for details.
Assuming that your data is a simple list of dates and you want to count the number of records, you have a set of measures in the columns. Tableau needs something (i.e. a dimension) to use to aggregate the measures over, so that means the date rollups need to be dimensions, but I'm not clear on exactly how an hourly/daily/weekly min/max count would work. Do you mean the minimum COUNT(Number of Records) for any hour in the data, in the day in the data, etc.? In that case, then you'd be need to be using something like WINDOW_MIN(COUNT(Number of Records)) with hour in the view to get the Min Count per hour, the same calc with Day in the view to get the Min Count per day, etc. You could generate a ton of measures for each combination of measure & date rollup, but that still doesn't get you the dimension that you'd need for layout, so one option would be to have 5 worksheets, one for each rollup level. That could get really slow, because Tableau would have to issue 5*N queries to the data source (where N is at least 1 and can be bigger depending on the filters involved).
There's a big scaffold or small scaffold solution that I can think of that would work better:
 The big scaffold would be UNION'ed data source where I'd have the dates & number of records aggregated 5 times, once for hourly, daily, etc. and a new "Date Rollup" dimension. Then you could have the aggregated date on the level of detail, the Date Rollup on Rows, and a calculation like IF FIRST()==0 THEN WINDOW_MIN(COUNT(Number of Records)) END with Compute Using on the aggregated date would get you the Minimum Count for each date rollup. Similar calcs would work for the other measures.
 The smaller scaffold would be a cross product of the 5 date rollups and every date at the finest level of granularity that you need, then you'd have 5 connections to your data source (one for each rollup), and use a combination of calculated fields and data blending to get you the results.
Hope this helps!
Jonathan

2. Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jason Scarlett Aug 29, 2013 8:50 AM (in response to Jonathan Drummey)Thanks.
I've attached my attempt using the Tableau Superstore data to show the issue I run into and how I used your suggestion for an adequate solution. Now that v8 has worksheet level filters I will be able to make this work in a dashboard once our server is upgraded from 7 to 8.
I still get confused with window calcs. The trick I was missing was the FIRST() = 0 bit. I always forget that an individual calculation can "filter" data out without removing the rest of the data from the view.
As for the scaffold methods, I suspected that there may be some data wrangling needed prior to Tableau but was trying to avoid it. Is there a way to use the data calculated in Tableau as a Data Source for Tableau? I could generate the results in the format you mentioned, and then feed it back to Tableau to do the min/max on. Maybe in version 9
Jason

minmaxbydategroupings.twbx.zip 947.3 KB


3. Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jonathan Drummey Aug 29, 2013 7:19 PM (in response to Jason Scarlett)1 of 1 people found this helpfulThe problem here is that you're trying to do an aggregate (the window min/max/avg) of an aggregate (the sum of number of records) at several different levels of detail, it's really asking Tableau (or any BI tool) to do a lot in a single view.
Though you can nest table calculations, Tableau does not let you use a worksheet as a data source (there's an Idea for that). A workaround is to manually copy and paste data back in, or export data to an Access .mdb and bring that back into Tableau.
I looked at your calc and I don't think it's giving accurate results. You're doing a datediff of the min and max order date per month, which is going to give a number that is 1 less than the actual number of days involved (datediff counts the difference, not the days, see http://betterexplained.com/articles/learninghowtocountavoidingthefencepostproblem/ for a description). That is not necessarily the number of days there were actually orders in the month, nor is it the number of days actually in the month. There are solutions for both, which of those are you trying to do?
Jonathan

4. Re: Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jason Scarlett Aug 30, 2013 7:48 AM (in response to Jonathan Drummey)The number of days in the month, is what I am after.
There are four complications that I can see beside the fencepost issue you caught:
 Dates with no orders in the middle of date window. A distinct count on order date would highlight this (maybe I would call this the kicked in fenceboard problem ).
 Dates missing at end of a complete month  my original min/max approach (with a +1 fencepost fix) would fail here because of the max failing to reach the last day in the month.
 Dates missing at end of an incomplete month (current month)  my original min/max approach (with a +1 fencepost fix) would work here.
 Different counts across multiple years  Feb 2012 and Feb 2013 have 29/28 days respectively. If the year is not shown how would Tableau know to divide by 28 in one case and 29 is the other? My original solution would use 29 days for Feb in both years (wrong).
As you can see in the attached sheet, I've added some various ways to calculate the number of days in the month. None of them seem to work properly. I suspect using the "datetrunc" function with the a "dateadd" funciton with a condition liek "current period" would get close, but still doesn't solve #4.
Jason

minmaxbydategroupings2.twbx.zip 935.5 KB

5. Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jonathan Drummey Aug 30, 2013 8:16 AM (in response to Jason Scarlett)1 of 1 people found this helpfulI’m confused w/r/t complication #1: do you want dates with no orders to be counted for the average or not? Either way, the Min/Max solution won’t work because of the complications you mentioned, plus the additional complication that it will miss days in any month where the first order comes in after the first of the month.
If you want to always count every day in the month, these two calculated fields would work, and they work fine for 2012 and other leap years:
Days in Month (make this into a Dimension):
DAY(DATEADD(‘month’,1,DATETRUNC(‘month’,[Order Date]))1)
Adjusted Days in Month:
IF DATETRUNC(‘month’,[Order Date]) == DATETRUNC(‘month’,TODAY()) THEN
DAY(TODAY())
ELSE
END
I set this up in the attached.
If you want to only count the days (or hours, or weeks) that there were orders, then it gets a wee bit more complicated because we have to know what the level of detail will be in the view (in other words, what are the dimension pills and their granularity). For example, looking at the workbook you’d posted, the reason why COUNTD(DATETRUNC(‘day’,[Order Date])) is returning 1 is because DAY(Order Date) is a dimension in the view, in other words, there really is only one day for each order date. If you take DAY(Order Date) out of the view, you’ll see that the COUNTD() works. However, I’m not clear on whether your final view would need the DAY(Order Date) in it in order to actually work. If you did need DAY(Order Date), then I’d use TOTAL(COUNTD(DATETRUNC(‘day’,[Order Date]))) with an Advanced Compute Using on just the Day of Order Date, that would provide an accurate count that you could use in other calcs.
Jonathan

6. Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jason Scarlett Aug 30, 2013 12:21 PM (in response to Jonathan Drummey)#1 was a red herring.
I have fought with this one all day and can't seem to get the AVG and MAX/MIN to work on the same worksheet.
The Min/Max are easy enough since I don't need to know the number of days in a month to find the min/max daily count.
if first() = 0 then window_max(count([Number of Records])) end
// use compute by ORDER_DT and add level of detail as YEAR,MONTH,DAY
The AVG however I have issues with. You'll see my attempts on the last worksheet attached.
Jason
ps. I tried to clean out all the other rubbish I tried.

7. Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jason Scarlett Sep 3, 2013 9:42 AM (in response to Jonathan Drummey)Attached seems to work but the Level of Detail pill is used for the rollup level. It seems to be a simpler solution (for my brain anyways) than nesting table calculations.
It does not work for a partial rollup level (i.e. if I want a monthly average and the current partial month is included it will weigh the current month the same as the others). I can deal with this case by excluding the current period.
Thanks
Jason

8. Re: Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jonathan Drummey Sep 3, 2013 10:37 AM (in response to Jason Scarlett)Hi Jason, I'm pretty sure why you are seeing different results for the MIN(Adjusted Days in Month) calc is that the MY(Order Date) filter has September 2012 and September 2013, so the current partial month is (as of 3 Sep 2013) showing 30 for Sep 2012 and 3 for Sep 2013.
I think you can get what you want in Tableau, however I'm still not clear on whether you want dates with no orders to be counted for the average or not?
Jonathan

9. Re: Re: Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jason Scarlett Sep 3, 2013 12:18 PM (in response to Jonathan Drummey)Dates in the past with no order should be considered as a zero.
Dates in the future with or without an order should be removed (as per previous solution using adjusted days in months).
I've filtered out 2012 Sep 05 to demonstrate the missing order date.
Count: 2012 Sep 0130 (minus Sept 05) = 316
Count: 2012 Sep 0103 (to current date) = 57
Daily average = 11.30 (= 373/33, not 373/32=11.66)
You're right about the partial month (Sept 1st3rd).
... After a long discussion with a colleague we are inclined to think that month 'counts' are less than ideal because of the differing days in a month (applies to quarters too). We instead are going to go with a daily average that is multiplied by 30.4 (=365/12) to remove any bias. I know one of my pet peeves are the monthly control charts used in my organization that don't correct for this very issue.
Back to the missing dates ... I can't seem to figure out how to mix the window calculation for the numerator with the denominator. I may be overcomplicating things?
Jason

10. Re: Re: Re: Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jonathan Drummey Sep 3, 2013 3:11 PM (in response to Jason Scarlett)See the attached. What I did was use the Adjusted # of Days calc that I'd previously created that counts the number of days in each month. Since you'd said you want to count the days that don't exist in a month as 0 (and based on the calculations you put in your last post), I presumed that meant you'd want to be including those in the denominator.
Because the Adjusted # of Days calculation computes a result for every day, I set up a table calculation to only compute a result for the first day in each month. That's the Total # of Days for Month calc, and it has an advanced Compute Using of MDY(Order Date) so it partitions on each Month. That is nested within the Avg # of Orders calc, which sums up all the orders and the Total # of Days for Month calc, and the result is the 11.30.
Is this what you are looking for?
Jonathan

11. Re: Re: Re: Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jonathan Drummey Sep 3, 2013 3:13 PM (in response to Jonathan Drummey)One more thought, since it's the day after Labor Day and Labor Day was on the 2nd of September this year…if the beginning of the month has *no* orders, then you won't see any rows for September in the view. If that is a situation that could happen in your data, and you need to show 0, you'd have to do some padding of your data.
Jonathan

12. Re: Re: Re: Re: Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jason Scarlett Sep 4, 2013 6:06 AM (in response to Jonathan Drummey)This does what I want, but I have two questions:
 For the Total # of Days for Month calc: [IF FIRST()==0 THEN MIN([Adjusted Days in Month]) END], you have set the partition by to year/month only. I understand this is so the FIRST() only returns data on the first day of the partition which happens to be the first of the month. So far so good. But then in the Avg # of Orders calc the partition appears to be on both the year/month and the year/month/day. How does Tableau know to use only year/month for the "WINDOW_SUM([Total # of Days for Month])" denominator part (which has its own FIRST()) and to use both partitions for the FIRST()" part of the main calc? Are there two different partitions being applied?
 When I move the date pills to the details shelf I still get the right results. But, if I change the Order By on the Avg # of Orders calc to "Order Date" it gives the wrong answer. If I then manually change the calculate using back to what was previously there (edit_table calc >> compute using >> advanced >> month/day/year AND month/year on the partition) I get the wrong results. No matter how I change it now, I get the wrong results.I have to delete it and readd it to the worksheet to get it to work ... which makes me think the true partitioning levels (as above) are not bieng shown.
Null start of the month orders is an acceptable 'exception'.
Jason

minmaxbydategroupings6.twbx.zip 976.1 KB

13. Re: Re: Re: Re: Re: Re: Re: Min/Avg/Med/Max/CoV in a single table with different date rollups
Jonathan Drummey Sep 4, 2013 11:22 AM (in response to Jason Scarlett)If I'm reading you correctly, #1 and #2 are both facets of the same feature. Tableau can nest table calculations, each table calculation in the nesting can have its own independent addressing, sorting, and partitioning.
Total # of Days for Month has an Advanced Compute Using (addressing) of the MDY(Order Date):
This calc is included in the Avg # of Orders calc, so when you are editing using the Edit Table Calculation dialog gets an extra dropdown to choose each calc so you can separately set the addressing & partitioning:
Alternative explanation: In the Avg # of Orders calc, the FIRST(), TOTAL(), and WINDOW_SUM() table calcs all receive the addressing setting of the Avg # of Orders calc, while the inner [Total # of Days for Month] is a separate table calculation that can have its own compute using.
So, to answer your question #1, the answer is yes, there are separate addressing and partitioning for each calc.
For #2, I assume that you mean changing the "Compute Using" for the Avg # of Orders calc and not "Order By", there's not an Order By currently in the interface that I know of. This is where it gets a little tricky... If we use the context menu for the Avg # of Orders pill, we see this:
Nothing is selected. There are two possible reasons for this: Either there is a nested table calculation which has a different addressing than the Avg # of Orders (which is the case here), or the Avg # of Orders is using an Advanced Compute Using. If you choose Order Date from this context menu, that sets the Compute Using for *both* the Avg # of Orders and the nested Total # of Days for Month calc. This causes the Total # of Days for Month calc to feed incorrect results into the Avg # of Orders Calc and your results go haywire.
Does that answer your questions?
Also, I've attached a revised workbook that uses a table calc filter (for nonNull values) and Measure Names/Values to generate the view.
Jonathan

minmaxbydategroupings7.twbx.zip 992.9 KB
