1 Reply Latest reply on Feb 17, 2012 6:16 AM by Alex Kerin

    Joins cause incorrect values with aggregated measures (Can I specify aggregation before I calculate a sum?)

    Steven Hayhurst

      GOAL

      I'm having difficulty showing the correct sum of contract amounts for each vendor when my data source consists of a contracts table joined to a change orders table.

       

      I'd like to achieve this with a single data source so that I can deploy a regular filter (not a worksheet that looks like a filter to get around the multiple data source issue) that would work across the following worksheets deployed on the same dashboard:

      • Vendor summary worksheet (e.g. number of contracts and sum of value of contracts for each vendor)
      • Contract worksheet (e.g. contract attributes as well as sum of change order amounts associated with the contract)
      • Change order details worksheet (e.g. change order attributes and values)

       

      NOTE ABOUT DATA SET

      In a single data source in Tableau, I've joined a contracts table and a change orders table on ContractUID. The join looks like this: Contracts LEFT JOIN ChangeOrders on ContractUID.

       

      There are often many change orders associated with a single ContractUID.

       

      For each ContractUID there is only one Project and one Vendor associated with it.

       

      PROBLEM

      Because there are in many cases multiple change orders per contract, joining the contracts table to the change orders tables causes the sum of contract amount to be higher than expected.

       

      ATTACHED FILES

      I've attached workbook with sample data illustrating the issue I've come across.

       

      CONCEPTUAL DESCRIPTION OF SOLUTION

      First, find AVG(ContractAmount) for all records with the same ContractUID (we know there may be several records for each ContractUID because of the join with the change order tables as noted above) then take the sum, with Vendor as the dimension.

       

      I could use also the MAX, MIN, or ATTR functions instead of AVG in this case; it doesn't matter because my data set will have only one distinct value for ContractAmount for each ContractUID.

       

      HOW THIS IS SOLVED IN QLIKVIEW

      In the case you happen to be familiar with QlikView, there are functions that allow you to specify aggregation before you calculate a sum. In this case, in order to get the correct sum of ContractAmount for a vendor, you would first aggregate your values for ConractUID and then calculate the sum, with Vendor as the dimension. The syntax in QlikView would be SUM(AGGR(AVG(ContractAmount), ContractUID)).

       

      ARE TABLE CALCS WHAT I NEED?

      I'm not sure where to go from here in Tableau... table calculations perhaps? Any help would be greatly appreciated!

        • 1. Re: Joins cause incorrect values with aggregated measures (Can I specify aggregation before I calculate a sum?)
          Alex Kerin

          This can be done with Table Calcs - I've got something working but I'm not sure it's optimal - I'm on 7 so here are the calcs:

          To remove the dupes - [Contract Sum Remove Dupes]:

           

          ((if index()>1 then null else attr([ContractAmount]) end))

           

          And to sum the remaining:

           

          window_sum([Contract Sum Remove Dupes])

           

          We also need a set of ContractUID and Change Order ID

           

          You can then add this, and edit the table calcs, using the dropdown to set the calc to remove the dupes to Change Order ID (for each vendor, only look at the first line of any change order), and then sum this across all of the ContractUID and ChangeOrders (the set). Others may have better approaches to this.

           

          window.png