0 Replies Latest reply on Mar 15, 2012 7:53 AM by chris.vandenberg

    Sumif of distinct Values

      Hi Everybody


      I have a table that looks similiar as the following:


      Item, Group, Keyword, Date


      Where every item belongs to a group. An item can have multiple keywords, which means that it occurs multiple times in the table:


      ItemID     GroupID     Keyword     Date

      A          1          "Apple"         "1-1-1"

      A          1          "Banana"      "1-1-1"

      B          1          "Kiwi"            "2-1-2"

      C          2          "apple"          "1-1-1"


      What I want is a graph that displays the amount of items under selection over time. If an item is selected this means everything in the group(s) of the selected item(s) is selected, eg: if banana is selected, the amount of items displayed will be two.


      At the moment I've solved this by copying every keyword in a group on every item in a group, creating a table as following:


      ItemID     GroupID     Keyword     Date

      A          1          "Apple"         "1-1-1"

      A          1          "Banana"      "1-1-1"

      A          1           "Kiwi"           "1-1-1"

      B          1          "Apple"         "1-1-1"

      B          1          "Banana"      "1-1-1"

      B          1           "Kiwi"           "1-1-1"

      C          2          "apple"          "1-1-1"


      And then perform countd on the Item column to count the number of selected documents.


      The problem with this method is that it creates a really large table, and I think it would be faster to create two tables, one with the GroupId's and the amount of items in those groups, and one the first one original table. Then make a list of the distinct GroupID's under selection and take sum of the amount of items in that group.


      This is similiar as using the sumif function in Excel.


      I have two questions about this:


      A) Is this possible, and if so: how?


      B) I noticed that tableau already does something intelligent with the data (as the twbx is much smaller then the original csv file), perhaps it already does something similiar internally, making the whole exercise unnecessary, as my main goal is to create a smaller and faster report.