6 Replies Latest reply on Jul 22, 2011 10:40 AM by Joe Mako

    Rank calculation problem

    Andy Cotgreave


      I'm trying to produce tables of survey data, ranking Colleges within each question asked. My table shows Colleges along the columns, and the question(s) on the rows. Students answered whether they were satisfied or not to each question. Instead of showing the percentage satisfaction, I want to show each college's RANK compared to other colleges for that question.


      I have pretty much got the calc fields correct, but one question doesn't work properly. If you look at the attached workbook, when you select just the "Welcome/pick up at airport" question, the ranking is fine - you can see this on the "Just ranks" and "Measure Values" sheets. However, if you change the filter to show All values, things go wrong - the "Welcome" question's ranks are incorrect. This problem only occurs with this question; the other questions work on single-select or (All) filter choices.


      There are two index calcs. "Index (TCRL)" is taken from the TCRL forum section. "Index (Joe modified)" is a little tweak recommended by Joe Mako. Both return different and incorrect results when (All) is chosen. Their partitioning is slightly different, but is basically computed along College (sorted descending by % satisfied) for each Short question.


      Can anyone work out why the ranking breaks on just one question depending on the filter choice? BTW - I've only extracted three questions; there are many more in my real data, and the problem seems only to occur on this one question.




        • 1. Re: Rank calculation problem
          Joe Mako

          Andy, I took another glance at it, and I believe the source of the frustration is the AGG() pill you have on your filter shelf.


          To get the results you want, and the filter you want, you will need a different calculation.


          Instead of what I recommend earlier, just change the formula for "% satisfied" from:



          sum([Answer (1 or 0)])/count([Answer (1 or 0)])





          IF COUNTD([Person Number])>=10 THEN
           SUM([Answer (1 or 0)])/COUNT([Answer (1 or 0)])


          Note: (sorry for the poor advice earlier, what I recommended before: "-(SIZE()-TOTAL(COUNTD([College])))" should not be in your rank/Index calculation)

          • 2. Re: Rank calculation problem
            Andy Cotgreave

            Great stuff Joe, that works perfectly. This technique is really powerful. I've added a parameter to the workbook such that the user can switch between % satisfaction or rank as they wish. This kind of thing could be applied to any survey data where you've got multiple questions and a bunch of respondent types.


            Regarding why the calc field needs the countd check: I assume that's because the Table Calculation is done before the filtering? Therefore, the ranks are calculated and then the values with fewer than 10 respondents are removed.



            • 3. Re: Rank calculation problem
              Joe Mako

              The reason why the check for "COUNTD([Person Number])>=1" needs to be done in the calculated field instead of on the Filter shelf with an AGG() pill is because of sorting. Sorting for the table calculation is done before the AGG() filter is applied. So for that "Welcome" question, where there a bunch of colleges that did not meet that filter requirement, it had a bunch of null values introduced because of that filter, but the sort order did not change, so the ranking calculation saw a null value as the previous for colleges and the logic says if the previous is null, then return "1".


              This is why my question back on http://www.tableausoftware.com/blog/six-education-day#comments is important. If you do not know the order of operations, then you can get into situations like this where it is not clear what is going on and why you are not getting the results that you expect. I do not believe the answer provided in that comment thread to my question is complete. I do not know what the full answer is, but I know some pairs, like sorting for table calcs happens before AGGG() filters are applied.


              I guess this leads to a question for Tableau developers: should table calc sorting happen before AGG() filters are applied?

              • 4. Re: Rank calculation problem
                Michael Nealey

                This is great stuff.


                I wonder is there a way that the table (or chart - as I have used these calculations to display in bar format) will update automatically?

                When I change between questions, the dimension order stays the same and does not update to reflect the order of who is ranked 1, 2 etc

                • 5. Re: Rank calculation problem
                  Michael Nealey

                  It's OK - I figured it out using computed sorts!

                  • 6. Re: Rank calculation problem
                    Joe Mako

                    Be aware that the method I demonstrated above is not the best way in my current opinion to accomplish this. See Re: How do I calculate a field for each day (that is repeated for each item)? for another example.


                    In short, if you use create a Set of the dimensions in use, and place this Set pill on the level of detail shelf, it removes the need for all the extra table calc stuff in this formula:



                    IF NOT ISNULL([% satisfied]) THEN
                     IIF([% satisfied] == LOOKUP([% satisfied], -1),PREVIOUS_VALUE(1), INDEX(), 1)


                    and you can go back to using just





                    (both use an advanced compute using)


                    The Set is removing the padding that Tableau adds to the data before evaluating the table calcs. sometimes this data padding is desirable, eg forecasting, and other times, like this ranking situation, it is not desired.


                    Also you can use table calcs to visually sort the display of data, and if you want a table calc sorted by the result of a table calc, eg a raking of an aggregate of table calc results, one option is to use a self-joined data set.


                    attached is an example.