7 Replies Latest reply on Jan 19, 2018 8:28 AM by Joe Oppelt

    Rank Percentile within a window of certain size

    Tien Phung

      Is there a way to use rank percentile for just a certain number of entries. For example, the percentile rank of a certain cell it's rank within itself and 26 rows before it. The window would move as we go down a column.

        • 1. Re: Rank Percentile within a window of certain size
          Joe Oppelt

          It would really help to have a sample workbook that simulates what you are doing. 


          Tableau keeps an internal value called INDEX() on the cells in the table, and you can tell RANK (or any table calc) to act on specific range of index values.  Some let you specify offsets or start- and end- ranges.  RANK will need to have "IF" logic within the parentheses to specify what to include in the ranking operation.

          1 of 1 people found this helpful
          • 2. Re: Rank Percentile within a window of certain size
            Tien Phung

            Can you give me a sample of what would be in the calculated field? I have an excel file attached. Suppose I wanted to find the percentile rank of the value of -0.855618503 within the frame of it and 26 weeks before May3rd. For the value -0.00585069, I want to find it's percentile rank within the window of 26 weeks before April 26th.

            • 3. Re: Rank Percentile within a window of certain size
              Joe Oppelt

              Work up a sample sheet for what you are looking to do.


              I ask you to do that because any time I guess as what the other person wants, I generally get the response, "That's not what I had in mind..."  Further, if you start the ball rolling, I will work in the Tableau version you are using, rather than doing it in a version you find you can't open.

              • 4. Re: Rank Percentile within a window of certain size
                Tien Phung

                Sorry I'm new to this but does this help? My explanation is above.

                • 5. Re: Rank Percentile within a window of certain size
                  Joe Oppelt

                  Moving forward, you'll want to do:


                  FILE -> Export Packaged Workbook


                  ... to create a .twbX file.  That will include the data source in the workbook file.  When you just upload a .twb file, it still points to the data source on your computer.  In this case it was OK because you're just using an excel file, and I downloaded yours and told the workbook to look for it on my computer instead.  But some day you'll have an extract or other data source that can't swap in so easily.


                  For now we're fine.


                  So in the attached (which I saved as a .twbX file) we have your original sheet.  I made a copy (sheet 2) and added some stuff.  First I added a calc called [index].  It's just using the internal index() function, but I added it to the sheet so we can see what is in Tableau's mind.


                  I also created a parameter so you can pick which week you want to make the rank relative to.  Using the parameter, I created a calc to determine the index value for the week in question.  see [Start index].  But that only puts the value on one of the rows.  We need that value on every row so that the RANK calc can use it to determine which rows to rank.  So I wrapped the code from [Start index] in the WINDOW_MAX() function.  That tells Tableau to take the max value of what's inside the parentheses, and put it on every row.  In this case only one row will have the [Start index] value, so that's the max value, and now that shows up on every row.  you see that on Sheet 2 as well.


                  Go to Sheet 3.  I moved some of the fields off the sheet and added some others.  Look at [Rank 26 weeks].  This uses the captured index value to compare row-by-row to the respective index value for that row.  if it's within the 26 week range, include it in the rank operation.  I have displayed that on the sheet.  Change the date in the Pica-a-date parameter, and you will see that the 26 rows that get ranked move on the sheet based on the selected value.  (Pick something that has fewer than 26 weeks before it, and it will only rank as many rows as there are in the data.)


                  But you really don't want to see all 26 values.  You just want to know what the rank value for the selected week is.  So I created one more calc:  [Just get the value for the week in question].  That just displays the value for the specified week.


                  Go to Sheet 4.  I moved everything off the sheet except for the [Just get the value...] calc.

                  • 6. Re: Rank Percentile within a window of certain size
                    Tien Phung

                    Thank you so much!


                    As a follow up, how would the calculations be different if I wanted to do this for all dates, but still within the 26 week frame?

                    • 7. Re: Rank Percentile within a window of certain size
                      Joe Oppelt

                      So essentially you want to see a value of 23 on the May 3 row, and a value of 17 on the April 26 row, etc,, all the way up the chart.  Correct?


                      I'm not sure I can make that happen.


                      RANK is a table calc.  It gets evaluated for all rows in the table at the same time.  (That's why, for example, you see [Start Index Value] get propagated on all rows on Sheet 3.)  What you'd essentially need is for each of these table calcs to get evaluated simultaneously on the table for each row's index value, which is not how Tableau is built.


                      The one thing I can think of is if we could have an additional dimension in the data that would propagate x-many rows for each existing data row, where X is the number of weeks in your data.  (Maybe we would only need 26 instances of each row.)  Then I might be able to force 26 instances of the table calc, one along each iteration of the new dimension.  If your data set isn't very large, this might work.  If you have tons of rows in the data, this idea would cause your data source to multiply in size.  It might also end up being expensive for performance.


                      But conceptually I can see how it might work.  Doing it for real is another question.