1 Reply Latest reply on Mar 27, 2014 12:50 PM by LyNea Tomas

    blended data with count distinct

    Sarah Aiello

      I've run into a problem that I was hoping someone would have a creative (or obvious) solution for.  I duplicated my data source and I want to blend the data (so that I can run aggregates in one column and filtered values in a second).  I have a number of filters that I would like to apply to both columns, however I don't want to include this data in table itself.  If I understand correctly, this means that I can only use count() on the non-primary source, not countd().  Unfortunately, this gives me duplicate rows (and so the wrong count distinct) because a user can exist in multiple time periods, and any range of time periods can be selected.

       

      In case it would help to have a more concrete explanation:

      There is a pet info company split into three work groups (feline, canine, reptile).  Each group has pamphlets related to their specific group that they can distribute to customers.   In the database, every time an employee distributes a pamphlet, this is recorded as a new line.  If they don't distribute a pamphlet, they are recorded as null

       

      In the dashboard, I want to see what % of users in a given month that could have given a pamphlet actually gave a pamphlet (subject to date, region, and employee type filters).  Pamphlet should also be filterable, however this should only affect the count of employees who gave a pamphlet (not the count of employees overall)

       

      I duplicated the data and pulled a count distinct of "Employee  who gave pamphlet" from the original joined to a count of "employee" (anyone who could have given a pamphlet).  The problem is that I must use count for employee (not count distinct) and since employees have at least one line for each row, they will be double counted if they were working for the company for more than one of the months chosen.

       

      Does anyone have any ideas of how I could make this work?  More specifically

      1) Are there any creative ideas of how to get a count distinct when there are duplicate rows across dates?

      2) Is it possible to keep a row (such as "null") and not give the end user the ability to filter away that row?  If I could do this, I could probably also create this dashboard, but it doesn't seem possible.

       

      I'm thinking it is not possible at this point, but if anyone can find a solution I'm sure it will be this community!