1 Reply Latest reply on Jun 13, 2018 5:53 PM by Yuriy Fal

    Filtering Underlying Data Based on Percentiles

    Jeffrey Alexander

      I have a situation where I need to filter the underlying data based on percentiles within that underlying data


      So, for example, in my sample workbook, I want to do a filter that only includes players who are above or below the 50th percentile of at bats (please don't focus on 50%, the production version has more categories, but I want to keep it simple).


      Sheet name is PctByTeam


      I want the percentile to be based on all items in the database, even those filtered out


      The following solution would work for me: (ABPercentile on the attached sheet)



      ELSE "TOP"



      However, it doesn't work with my data source (bummer), so I can't use percentile


      So I was trying to do something like



      else  "TOP"



      But this gives an error "all fields must be aggregate or constant when using table calculation fields or fields from multiple data sources"


      If I do something like (ABPercentile-Rank on the attached sheet)


      IF RANK_PERCENTILE(max([AB]), 'asc') <.50 THEN "BOTTOM"

      else  "TOP"



      Then it filters based on aggregate data, not the individual line items


      How do I make an equivalent to the percentile function to filter based on the entire data set?



        • 1. Re: Filtering Underlying Data Based on Percentiles
          Yuriy Fal

          Hi Jeffrey,


          You wrote

          However, it doesn't work with my data source (bummer), so I can't use percentile

          Q: what kind of datasource (connection) are you using?


          You may have options:


          1) If you'd be able to make an extract from it (or maybe export to CSV / JSON),

          then the PERCENTILE() aggregate function would become available (sure you do know).


          2) When connecting live to any SQL-2003 compliant RDBMS (they are many)

          you may have an option to call your DBMS variant of a NTILE window function

          via RAWSQL() code inside Tableau calculation (could be a Row-Level one).



          Besides, there may be other datasources (connections)

          where making an extract -- or a native call via RAWSQL() --

          is not an option (i'm talking cube datasources).

          If this is the case, you're less lucky.