5 Replies Latest reply on Sep 4, 2019 3:23 PM by garreth gomilla

    how to do a simple SUM DISTINCT in tableau?

    Al Reynolds

      Hi Guys

       

      I think i might be losing my mind. Trying to do a very simple sum of distinct values in my tableau model. In a nutshell, I am comparing Shipments, Containers and Volume. There are many containers per shipment and volume is specific to shipment, hence when i extract the data from source it creates a duplicate volume entries for each shipment / container combination where there are many containers per shipment.

       

      A sum on volume is producing a distorted number.

       

      In SQL i could do a SUM(DISTINCT(VOLUME) and that would do it. Cant seem to find a easy way in tableau.

       

      Reconstructed a simple model attached.

       

      Thanks in advance for the insight.

       

      Al

       

           

      Container IdContainer NumberShipment IdShipment NumberVolume
      4484FSCU4940492607810617592611.7
      4484FSCU4940492607910628361228.91
      4485CAIU3479938608010636008525.7
      4493YMLU828126960921059325515.67
      4493YMLU828126960931065236904.08
      4493YMLU8281269609410652711624.62
      4496MAGU51150966095106631420222.4
      4497TCNU87101126095106631420222.4
      4498TEMU70080356095106631420222.4
      4493YMLU82812696095106631420222.4
      4494GESU596229660961066408664.81
      4494GESU5962296609710667069060.83
      4495YMLU8204977609710667069060.83
      4495YMLU8204977609810670406547.56
      4499SEGU457353060991067426617.22
      4499SEGU4573530610010674283457.22
      4495YMLU820497761011067429156.4
      4494GESU596229661021067522629.18
      4493YMLU828126961031068884827.56
        • 1. Re: how to do a simple SUM DISTINCT in tableau?
          Zach Pipkin

          Hey Al,

           

          I don't think you're losing your mind. What you are trying to do is something that is intrinsically quite difficult to do in Tableau, and is better handled on the data source level (Either through custom SQL, or setting a flag/identifier field) I mocked up a few sample solutions that might be useful for you, and here's a quick walk through on each of them:

           

          Example 1:

           

          If you weren't concerned about the actual level of detail on the crosstab, you could remove the dimensions until only your unique numbers are left (Shipment Number). Then you can set the aggregation of the Volume to MIN or MAX or AVG.

           

          Example 2:

           

          Using Custom SQL, I performed an INNER JOIN on the data, and counted across the dimension that each volume is associated with:

           

          SELECT [Sheet1$].[Container Id] AS [Container Id],

            [Sheet1$].[Container Number] AS [Container Number],

            [Sheet1$].[Shipment Id] AS [Shipment Id],

            [Sheet1$].[Shipment Number] AS [Shipment Number],

            [Sheet1$].[Volume] AS [Volume],

            [CountRow].[CountD]

          FROM [Sheet1$]

            INNER JOIN

              (SELECT

                  [Sheet1$].[Shipment Number] AS [Shipment Number],

                  count(1) AS [CountD]

              FROM [Sheet1$]

              GROUP BY [Shipment Number]) [CountRow]

              ON [Sheet1$].[Shipment Number] = [CountRow].[Shipment Number]

           

           

          I'm not sure if this is the most efficient method, but after doing so, I was able to get 1 for each unique value, 2 if there were 2 duplicates, 4 if there were 4 duplicates, etc.

           

          I then performed a SUM(Volume) / SUM(CountD) so that we could sum the [Adjusted Volume] to get the actual sum of distinct volumes.

           

          Example 3:

           

          Following on the work of Example 2, I created a custom Grand Total following Jonathan Drummey's technique, and placed the WINDOW_SUM of the [Adjusted Volume] in the Grand Totals Pane, and the regular [Volume] in the worksheet.

           

          Really, it's up to you to determine the best way to do this. If possible, I would try to address this on the database side, that way these unnecessary complications aren't necessary.

           

          Cheers

          1 of 1 people found this helpful
          • 2. Re: how to do a simple SUM DISTINCT in tableau?
            Al Reynolds

            Zach , you are a legend. This works. I actually edited the query before i looked at this and got the results to work. I will go through your examples too as i think they have some great tips on data editing.

             

            Thanks again.

            • 3. Re: how to do a simple SUM DISTINCT in tableau?
              Jonathan Drummey

              I've also been troubled by this problem, and awhile back did my own post asking question at How to calculate sum of distinct values. I did just figure out a solution using the new Level of Detail calculations in version 9. The following LOD calc will return the Volume per Shipment Number:

               

              {FIXED [Shipment Number] : MIN([Volume])}

               

              Then if we apply the SUM() aggregation in a view, the SUM() will be of the Volume per Shipment, not of the Volume for each record. Here it is the Shipment level:

               

               

              Screen Shot 2015-03-11 at 10.32.23 AM.PNG


              And at the container level:

               

              Screen Shot 2015-03-11 at 10.32.28 AM.PNG

               

              The nice part about this is not having to muck about with Custom SQL or having to use table calcs to get the right grand total. It just works! v9 workbook is attached.

               

              Jonathan

              8 of 8 people found this helpful
              • 4. Re: how to do a simple SUM DISTINCT in tableau?
                Mariah Pettapiece-Phillips

                THANK YOU Jonathan!

                I have been struggling with this for an hour, and your solution worked like a charm.

                • 5. Re: how to do a simple SUM DISTINCT in tableau?
                  garreth gomilla

                  I have another question about this detail that is similar.

                   

                  I have (Container ID) that I COUNTD(Shipment Number) and then get the following example

                   

                  Container IdCOUNTD(Shipment Number)
                  44842
                  44851
                  44935
                  44961

                   

                  I want to make a graph based on Total number of container ID per month as the column and then the row will have the total number or COUNTD(ContainerID) with the Individual CountdD(Shipment Number) inside the total number of CountID Bar