7 Replies Latest reply on Jan 8, 2017 1:31 PM by Christina Gremore

    Return records within the 0.25 percentile?

    Benjamin Cole

      Hi Everyone,

       

      I've been creating box and whisker plots and want to filter out the lower 0.25 percentile of the data.

       

      How do I use the function PERCENTILE([Sales], 0.25) for example to return all employees whose sales were in the 0.25 percentile?

       

      Thanks!

      Ben

        • 1. Re: Return records within the 0.25 percentile?
          Christina Gremore

          Hi Benjamin,

           

          Depending on how your data is structured, you'd probably want one of the following:

           

          SUM(Sales) >= SUM(PERCENTILE(Sales,.25))

           

          or

           

          [Sales] >= PERCENTILE([Sales],.25)

           

          I think the pre-aggregated version (the first one) is more likely to give you an accurate outcome, but I'm not 100% certain. Maybe you'll need to do some stuff with an LOD to "fix" it at the 'employee' dimension? Without a packaged workbook and mockup of your data, tough to say for sure.

           

          Then you can put this field on the filter shelf and keep True (to exclude those who fall within the 1st quartile).

           

          If this resolves your question, please mark my answer as correct. Thanks!

           

          -Christina

          • 2. Re: Return records within the 0.25 percentile?
            Benjamin Cole

            Hi Christina,

             

            I'm with you, I'm actually interested in creating a boolean returns true if the record is in the 0.25 percentile.

             

            1. Im confused what Percentile([Sales], 0.25) will actually return

            2. not sure why SUM(Sales) >= or [Sales] >= is a meaningful boolean.

             

            This is what my data looks like

            Screen Shot 2017-01-05 at 2.14.36 PM.png

             

            Thanks again!

            • 3. Re: Return records within the 0.25 percentile?
              Christina Gremore

              Hm, ok. So to be clear - the structure of your data is that there is only one row per employee, that shows a summation of their sales?

              • 5. Re: Return records within the 0.25 percentile?
                Christina Gremore

                OK, in this case, you would want the following calculation:

                 

                [Sales] <=  { PERCENTILE(Sales,.25) }

                 

                Definitely pay attention to the curly brackets there! What they are doing is fixing that percentile equation across the entire data set. If you're still not sure what it's doing, I recommend creating a separate calc with just { PERCENTILE(Sales,.25) }  in it and seeing what it outputs.

                 

                In the sample data set you provided, the 25th percentile of sales is 150. Let's look at how this calc evaluates each row of data.

                 

                Employee A: Their sales are 200. Is 200 less than or equal to 150? No, it's not. So this is FALSE.

                Employee B: Their sales are 300. Is 300 less than or equal to 150? No, it's not. So this is FALSE.

                Employee C: Their sales are 100. Is 100 less than or equal to 150? Yes, it is. So this is TRUE. Employee C is in the bottom quartile.

                 

                That's what makes the calculation above a meaningful boolean - because it evaluates each row of data (to wit, one employee) against the 25th percentile for the whole domain of values.

                 

                Whether you want this statement to say 'true' if they're in the bottom quartile or 'false' is just a matter of which way you point the "less than" sign. When you flip it from > to < you'll get a complementary set of results, so it's not necessarily important here.

                • 6. Re: Return records within the 0.25 percentile?
                  Benjamin Cole

                  Thanks so much for that entire explanation!

                   

                   

                  Let me see if I got this right,

                   

                  Percentile([Sales], 0.25)  will return the bottom 0.25 percentile of whatever the view is aggregated to?

                  { PERCENTILE(Sales, 0.25) } is an LOD calc? that isn't fixed to a specific dimension and thus fixed to the level of the entire data set?

                   

                  Example

                  { FIXED [Region] : SUM([Sales]) }  -- sum of sales per region irrespective of the view?

                  {SUM([Sales])} - sum of sale per entire data set irrespective of the view?

                   

                  THANKS!

                  Ben