
1. Re: Cohort Analysis  How to sum and average only valid values in table?
Keshia Rose Jul 9, 2014 11:50 AM (in response to Patrick Negri)Hi Patrick,
Could you attach a sample packaged workbook? That way it will be easier for people to be able to help.
Take care,
Keshia

2. Re: Cohort Analysis  How to sum and average only valid values in table?
Patrick Negri Jul 11, 2014 6:46 AM (in response to Keshia Rose)I am trying to generate sample data based on our data so we can work on a packaged workbook example.

3. Re: Cohort Analysis  How to sum and average only valid values in table?
Patrick Negri Jul 11, 2014 1:05 PM (in response to Patrick Negri)Ok. Created data for a sample worksheet.
Can any1 take a look?
Sample Cohort Workbook:
https://dl.dropboxusercontent.com/u/49868369/Tableau%20Sample%20Data/Sample%20Cohort%20Workbook.twb
Sample Data (Excel):
https://dl.dropboxusercontent.com/u/49868369/Tableau%20Sample%20Data/Subscribers%20Sample%20Data.xls

4. Re: Cohort Analysis  How to sum and average only valid values in table?
Patrick Negri Jul 13, 2014 5:30 PM (in response to Patrick Negri)Any tips guys?

6. Re: Cohort Analysis  How to sum and average only valid values in table?
Jim Wahl Jul 17, 2014 7:33 AM (in response to Patrick Negri)Hi Patrick,
Tableau calculates sub totals and grand total by removing dimension "pills" to the right of the dimension you're sub totaling. For grand total, all dimension pills on the rows or column shelf are removed.
In your worksheet, for example, column grand total is calculated by removing the discrete, blue MONTH(Date of Signup) pill from the rows shelf.
This of course causes problems for your cohort calculation, which has a monthbased denominator.
The solution, which is not at all obvious, is to add back the detail by putting a copy of the Date of Signup on the Detail shelf / button. This doesn't affect your normal calculations at all, since you haven't increased the level of detail for all of the regular calculations, but when Tableau calculates the grand total and removes the Date of Signup pill on the rows shelf, the Date of Signup (Copy) on the Detail shelf keeps the months in the calc.
Here are the basic steps.
1. Duplicate the dimension Date of Signup. Just rightclick the dimension in the left data pane > Duplicate.
2. Drag this new Date of Signup (Copy) to the Detail button. Set it to Month, same as the pill on the Rows shelf.
Again, this won't affect the regular rows, but you can now see the grand total row has multiple values, one for each month. (I've hidden ApriltoDec below.)
Since MONTH(Date of Signup) is on the Detail button and not on the Rows/Columns shelf, Tableau "stacks" the extra marks, which is why you see multiple values in the GT row. And why you see extra space on the other rows.
I'm going to walk through this stepbystep, although you could probably combine a couple of these.
3. Create calculated field for Start Val
One good way to learn the table calc formulas is to dissect the Tableaugenerated quick table calcs. For example, you can see how Tableau created % of Total in your worksheet by clicking the pill > Edit Table Calculation > Customize where you'll see:
COUNTD([Client ID]) / TOTAL(COUNTD([Client ID]))
I copied this last part TOTAL(COUNTD(...)) to a new calculated field called Start Val. This is a table calculation, because of TOTAL(), which allows you to specify a different level of aggregation for COUNTD(...).
Doubleclick this measure to add it to the worksheet. By default Tableau aggregates table calculations by "Table Across", which in this case is retention period. While that is what we want, I'll often make the partitioning/addressing explicit by clicking the pill > Compute Using > Retention. This helps me keep things straight and also allows me to move the pills around without affecting the addressing.
In the GT row, you can see we're getting closer. The total row gives you the numerator and the Start Val row has the denominator.
4. Sum values on GT row
The final part is to sum all of the values. That's done with another table calc function WINDOW_SUM(). Click the Start Val measure in the left Data pane > Create Calculated Field. And wrap Start Val in a WINDOW_SUM().
Start Val Total = WINDOW_SUM([Start Val])
Double click the measure to add it to the worksheet. Again, Tableau aggregates this as Table Across, which adds all the start vals over the retention period for each monthso you get your start val multiplied by the number of retention periods.
This is of course not what we want, so click the pill > Edit Table Calculation.
Nested Table Calc. In the Table Calc dialog box, you'll see a pulldown menu for Calculated Field. Because this calculated field refers to another field, Start Val, which has table calc functions, Tableau allows you to set the partitioning / addressing (aka compute using) independently for each calculated field with a table calc. This is also called a "nested table calc." Here we do want to set them differently. Select Start Val and set Compute Using > Retention. Select Start Val Total and set Compute Using > Date of Signup (copy)
Note that we're using the (Copy) version, because grand total removes Date of Signup from the view and the multiple values we're seeing are a result of (Copy), so to we want to WINDOW_SUM() over all the (Copy) row.
Also note that if we didn't use the Table Calc dial box, but set the Compute Using by right clicking the pill in the worksheet and using the Compute Using selection, Tableau would apply that addressing across all table calcs. In general, I like setting the addressing / partitioning explicitly and I like using the Table Calc dialog box.
Now we're getting the right value, but multiple, duplicate copies of it. One for each row in the (Copy) partition. Sometimes, such as when your graphing, you want this. In a table, you can remove the duplicates with another table calc. Click Start Val Total > Create Calculated Field >
Start Val Total First = IF FIRST() == 0 THEN [Start Val Total] END
Doubleclick to add to the view. Click the pill > Edit Table Calculation. We now have three table calcs: Start Val, Start Val Total, Start Val Total First, and each one can have its own addressing. Doublecheck that Start Val and Start Val Total are set as above to Retention and Date of Signup (Copy), respectively. Then set the Start Val Total First to Date of Signup (Copy) as well, since again we want just the first value in the GT box.
Now, finally, we have our denominator.
4. Calculate the numerator
On the Total row in the grand total pane, you have your numerator values, we just need to sum them over the Date of Signup (Copy) dimension. Create a new calculated field
Total v2 =
IF FIRST() == 0 THEN
WINDOW_SUM(COUNTD([Client ID]))
END
This is the same logic as above. I combined the FIRST and WINDOW_SUM() into one, since both are computed over Date of Signup (Copy).
Doubleclick to add this to the view. Set compute using to Date of Signup (copy).
Now you should have an accurate total for each period (Total V2) and a total starting value.
5. Calculate the % Retention
Relative V2 = [Total V2] / [Start Val Total First]
Since this uses table calcs that are already in the view, the partitioning/addressing will be inherited. But it's still a good practice to click the pill after adding it > Edit Table calculation and verify everything is setup properly.
6. Cleanup.
To get back to your original view. Just remove all but Total V2 and Relative V2 from the view.
To remove the white space, click from the top menu Analysis > Stack Marks > Off.
Edit the aliases to get clean Total and Relative names. I also hide the dimension column names (I don't need a column head called Month to tell men Jan, Feb, March are months, ...). Click the Header > Hide Field Labels for Rows.
See attached workbook. Let me know if you have any questions. Neat little problem. Thanks for posting.
Jim

Sample Cohort Workbook_jimw.twbx 161.9 KB


7. Re: Cohort Analysis  How to sum and average only valid values in table?
Jonathan Drummey Jul 17, 2014 2:30 AM (in response to Jim Wahl)Nice description!

8. Re: Cohort Analysis  How to sum and average only valid values in table?
Matt Lutton Jul 17, 2014 4:58 AM (in response to Jim Wahl)Amazing work, in my eyes.

9. Re: Cohort Analysis  How to sum and average only valid values in table?
Patrick Negri Jul 17, 2014 6:23 AM (in response to Jim Wahl)Awesome work. Thanks.

10. Re: Cohort Analysis  How to sum and average only valid values in table?
Jonathan Drummey Jul 17, 2014 6:26 AM (in response to Jonathan Drummey)Hi Jim,
I just reread this, and one thing jumped out at me. You'd written: "Tableau calculates sub totals and grand total by removing dimension "pills" to the left of what you're totaling." Isn't it it dimension pills to the right?
For example, if I have Department, Category, and Item (in that order) on Rows, and I put a subtotal on Category, then that subtotal is effectively removing Item.
Jonathan

11. Re: Cohort Analysis  How to sum and average only valid values in table?
Jim Wahl Jul 17, 2014 7:25 AM (in response to Jonathan Drummey)Yes, you're right. I'll edit the above for posterity.
I was looking the rows shelf and thinking, "the dimension pills to the left of the measures." But dimensions are always to the left of measures, so that doesn't really need to be said.

12. Re: Cohort Analysis  How to sum and average only valid values in table?
Jonathan Drummey Jul 21, 2014 3:39 AM (in response to Jim Wahl)Hi Jim,
I acknowledge that I'm getting very precise about language here, and I'm trying to reduce & prevent confusion. You wrote, "But dimensions are always to the left of measures, so that doesn't really need to be said." This is not the case, here's a demo using Superstore Sales:
On Rows and Columns, blue pills (discrete fields) are always to the left of green pills (continuous fields). In the above screenshot, the discrete SUM(Sales) measure is to the left of the green Sales dimension, so measures can be to the left of dimensions. This is kind of a tangent from the original topic, it's worth noting that Tableau won't generate a row Grand Total when there is a discrete measure on Rows, and won't generate a column Grand Total when there is a discrete measure on Columns.
Also, Tableau can generate totals when a continuous dimension is used, here's an example from Superstore Sales:
So, on further reflection, a more accurate statement would be "Tableau calculates subtotals for a given Shelf (Rows or Columns) by effectively removing discrete dimension pills to the right of the discrete dimension that you are subtotaling. Grand totals are calculated by effectively removing all discrete dimensions from that Shelf."
Does that make sense?
Jonathan

13. Re: Cohort Analysis  How to sum and average only valid values in table?
Keith Conner Jul 20, 2015 7:18 AM (in response to Jim Wahl)Jim, This is awesome.
Thank you!
Keith Conner