1 Reply Latest reply on Nov 7, 2018 7:01 AM by venkatram

    Distinct count calculation where parameters reference a range of values over multiple rows

    Gordon C

      Hello,

       

      I am new to both tableau and this community, but have looked across the forums and cannot seem to find a similar query to the one I have here.

       

      I have had to use generic terms to describe my data, but the principles are the same as with the real dataset I am working with.

       

      At it's most basic, what I am trying to achieve is to calculate performance for tasks; whilst tasks have a unique reference, they can have multiple sub-tasks (rows in the data) associated with them. Whether or not the unique task has 'fulfilled' the criteria is based on the content of the sub tasks. If the number of sub tasks were 1:1 with unique tasks, or the values within were merely either/or data, I think this would be achievable but my difficulty stems from there being multiple sub tasks per task. The values attributed to sub tasks are from one of 20 pre-defined text-based options, as well as the potential of null. 15 of these values could be considered as 'fulfilled' criteria, and 5 as 'unfulfilled'.

       

      For my performance, I want any fulfilled sub-task to render the overall unique task as fulfilled even if the task also has additional sub-tasks which have not met the criteria. I would also need to count null returns, i.e. if  no sub tasks were created - this event would result in an 'unfulfilled' unique task.

       

      I need the subsequent calculation/ count to be able to distinctly count the number of unique tasks, and represent how many of these unique tasks either did or did not meet the criteria. But with the range of sub tasks potentially including both fulfilled and unfulfilled criteria in the same unique task, I can't find a way to calculate which doesn't produce a result that counts the unique task as both fulfilled and unfulfilled; this effectively over-counts the number of unique tasks attended, and miscalculates the subsequent performance.

       

      Due to the data I am working with I can't upload an extract - but I do realise my terminology probably isn't the most scientific! So I have attached an example of how the data would be categorised and how I would it need to be counted (the 'time' column is not required for the actual calculation, but offers an example of how I would plan to arrange the performance data against other types of category once I complete the initial calculation)

       

       

      Many thanks for any assistance!

       

        Data

       

      Unique Task RefTimeSub Task RefSub Task Grade
      1234Morning564684864531534A
      8888Afternoon<NULL><NULL>
      7744Afternoon1231346C
      1199Night77689A
      1199Night665614B
      2244Night4564764C
      2244Night1321564D
      5588Afternoon878749A
      5588Afternoon656812325646D
      5588Afternoon68984541241C

       

       

      Rules and Reasoning for calculating Performance

      sub task grades A, B meet criteria. C, D, null do not meet criteria

       

      Unique TaskNo. of sub tasksMeets CriteriaReason
      12341Yessub task conforms to A/B rating
      88880Nono sub tasks created
      77441Nosub task conforms to C/D Rating
      11992Yesall sub tasks conform to A/B rating
      22442Noall sub tasks conform to C/D Rating
      55883Yesat least one sub task conforms to A/B Rating even though others have C/D rating

       

      Row based Data:

      10 rows for tasks, 6 of which have unique task references. 9 unique sub task references, 1 null. 4 sub tasks meet criteria, 6 do not meet criteria.

       

      Performance Calculation would show:

      6 unique tasks attended, 3 meet criteria, 3 do not meet criteria, 50% criteria met.