
1. Re: The age old Percentile question
Jonathan Drummey Oct 22, 2012 8:24 AM (in response to .George Prevelige)Hi George,
Since Tableau doesn't have a builtin PERCENTILE function, we have to use workarounds. Each workaround has different tradeoffs which may be unacceptable depending on the problem at hand.
For example, a simple blocking into four quantiles for a heat map/highlight table can be done using and INDEX() function with a stepped color, like this:
Or you can view a spread with circle, shape, or square marks and reference lines for quartiles or other 'tiles:
However, what you are looking to do is generate a percentile result for each row, which currently requires a table calculation. Table calculations are wonderfully powerful, and have some limitations in terms of what kinds of aggregations are easy to set up and which ones take more effort. Partitioning results by a table calculation like a percentile calc to get the total records in each of the top/bottom 5 percentiles falls into the more effort category, but it's doable. A prerequisite is being familiar with table calculations, particularly setting Advanced and nested Compute Using's.
Richard Leeke has done a huge amount of work on quantiles, I took a formula for calculating percentiles from this post: http://community.tableau.com/message/189200#189200. Here's the formula:
// The quantile rank gives the proportion of values the same as or less than the value in question // Note that the term subtracting half the number of records for the current value is effectively // returning the average rank in the case of multiple occurences of the same value. // See http://http://en.wikipedia.org/wiki/Percentile_rank. // Calc taken from http://community.tableau.com/message/189200#189200
IF (ATTR([Score]) = LOOKUP(ATTR([Score]), 1)) THEN PREVIOUS_VALUE(0) ELSE INT(100 * (RUNNING_SUM(SUM([Number of Records]))  ((SUM([Number of Records])  1)/ 2)) / TOTAL(SUM([Number of Records]))) ENDThe calculation needs an Advanced Compute Using set to the ID, sorted on SUM([Score]) Ascending. This gives you a result like this:
Then from that, we need to create set of measures that will return the number of records falling into each quantile range. Since we can't partition by table calcs, we need to generate our own total. Here's the Top 5th %ile Feeder calc:
IF [Percentile] >= 96 THEN PREVIOUS_VALUE(0)+1 ELSE PREVIOUS_VALUE(0) END
This calculation will have it's Compute Using set to ID, while the nested Compute Using for Percentile remains the same as above. This calc essentially does a running sum to find the number of records (ID rows):
Now, we only want the total of the top 5 percentiles, so the next step is to get the largest values of the feeder calc. And, since we want this ultimately to display in a text table, a surrounding calc is needed to prevent overlapping text. Here's the formula for the Top 5th %ile:
IF FIRST()==0 THEN WINDOW_MAX( [Top 5th %ile Feeder] ) END
This calc only performs the evaluation for the first row in the partition. The Compute Using for this calc is also set to ID, with the nested Compute Usings for Top 5th %ile Feeder and Percentile remaining as above:
The Top 5th calcs can be duplicated and edited to create the additional calcs, so the workout worksheet looks like this:
Then, duplicate the worksheet, move ID from the Rows Shelf down to the level of detail, swap Type & Measure Names on Columns & Rows, and you wind up with this:
Hopefully that meets your needs! The workbook is attached.
Jonathan
Note that as far as I can tell, these totals for Old are correct, I'm not sure what you were using in your Hoped For Result worksheet.

Percentiles jtd edit.twbx.zip 85.0 KB


2. Re: The age old Percentile question
Jonathan Drummey Oct 23, 2012 4:48 AM (in response to Jonathan Drummey)Just to add one more thing, I dove into this problem without really thinking about the desired result, which is a count of the number of results (rows) in each percentile bracket. Creating some measures like for both the Top and Bottom 5th Percentile = SUM([Number of Records]/20, Top 10th Percentile = SUM([Number of Records])/10, will return results that are mostly accurate, with some rounding issues.
Thanks to Richard Leeke for pointing this out to me!

3. Re: The age old Percentile question
.George Prevelige Oct 23, 2012 7:36 AM (in response to Jonathan Drummey)Holy smokes, thanks so much!
I was hoping for a hint or perhaps a table calc snippet, but you provided a complete and elegant solution.
I really own one.
Thanks again,
George