4 Replies Latest reply on Sep 30, 2016 2:15 AM by Charlie Giersh

    De-duplicate overlapping records based on type and price

    Charlie Giersh

      I've got a series of data where there are duplicate (overlapping records), but they have a different Price Type and Price.

       

      I'm trying to count the number of records by price type based on the lowest price over the sale weeks, however I do not want to double count records with a different type.  Below is an example of my data (I've highlighted the records that should be counted in bold font).

       

      Sale WeekItemPrice TypePrice
      Week 1ChairSale Price$8
      Week 1ChairOnline Price$5
      Week 1TableFull Price$20
      Week 1TableSale Price$15
      Week 2ChairFull Price$10
      Week 2ChairSale Price$8
      Week 2TableSale Price$20
      Week 3ChairFull Price$10
      Week 3TableSale Price$15
      Week 3TableOnline Price$10
      Week 4ChairFull Price$10
      Week 4ChairOnline Price$5

       

       

      The results I'm trying to see would be:

       

      ItemCount Online Price
      Count Sale PriceCount Full Price
      Chair211
      Table120

       

      I can find the item with the lowest price in any given week with a FIXED function e.g. "{ FIXED [Sale Week], [Item] : MIN([Price] }" but cannot work out how to determine the Price type for the that specific record and then count them.

       

      I've also looked at trying to filter out the overlapping records first, based on the criteria, but again cannot seem to get the logic to work.

       

      Any help most welcome.

        • 1. Re: De-duplicate overlapping records based on type and price
          Siraj Alimohamed

          Hi Charlie,

           

          I think you were in the right direction, just one more of calculated field for getting the Count, and you would have cracked it. Here are the calculations I created:

           

          Lowest Price

          {FIXED [Item], [Sale Week]: MIN(Price)}

           

          Count of Products Sold

          IF {FIXED [Sale Week], [Item], [Price Type]: MIN([Price])} = [Lowest Price]

          THEN 1

          ELSE 0

          END

           

          Those two together, got me the result you were looking for.

           

           

          Can you check if this helps please?

           

          Cheers

          Siraj

          2 of 2 people found this helpful
          • 2. Re: De-duplicate overlapping records based on type and price
            Jonathan Drummey

            Here's how I'd go about this, it's a little different from what @Siraj did and checks for a few of potential problems with LOD expressions.

             

            1) Create a Lowest Price calc that is {FIXED [Item], [Sale Week] : MIN([Price])}. This is the same as what Siraj did.

             

            2) Create a Count of Products calc with the formula:

            IF ROUND([Price],2) = ROUND([Lowest Price],2) THEN

                1

            ELSE

                0

            END

             

            This is different from Siraj's calc in two ways:

             

            a) Given this the data set {FIXED [Sale Week], [Item], [Price Type]...} is returning a record-level result, so that LOD is actually not needed and we can just use the record-level price. This will make the calc faster.

             

            b) ROUND() is used to guarantee that we are comparing the same values. Tableau does not guarantee that an LOD on a floating point number (such as a currency amount with decimals) will be the exact same value as a record-level or regular aggregate result, and I've verified that this is the case in real world data.

             

            There's one more possible issue that can arise and that's if there are ties where multiple Price Types all have the same (lowest) price value. In that case the above calculation would double-count products for the price type.

             

            If you don't want that to happen then we need to somehow resolve the tie in favor of a given Price Type. I set up an example Count of Product (ties) calc that embeds another FIXED LOD to get the first Price Type alphanumerically and only use that for the count:

             

            IF {FIXED [Item], [Sale Week] :

                MIN(IF ROUND([Price],2) = ROUND([Lowest Price],2) THEN [Price Type] END)}

                = [Price Type] THEN 1 ELSE 0 END

             

            Here it is in a view:

             

            Screen Shot 2016-09-29 at 1.59.07 PM.png

             

            I've attached a v9.3 workbook.

             

            Jonathan

             

            PS: @Charlie, when posting sample data to the forums please attach CSV or Excel data, that makes it easier for us to work with your data, thanks!

            2 of 2 people found this helpful
            • 3. Re: De-duplicate overlapping records based on type and price
              Siraj Alimohamed

              You nailed it! Jonathan Drummey! I definitely overlooked that the second query was bringing a record level dataset anyway.

              • 4. Re: De-duplicate overlapping records based on type and price
                Charlie Giersh

                Thanks to both of you for your quick answers, this is a big help.

                 

                I've followed Jonathan's suggestion, as my data includes data with ties where multiple Price Types all have the same lowest price.  I also have records with the same price type and same lowest price, so I added calc so they aren't double counted.

                 

                @Jonathan thanks also for pointing out the LOD rounding issue, I'll bear this in mind in future.

                 

                Cheers,
                Charlie.