4 Replies Latest reply on Sep 30, 2016 2:15 AM by Charlie Giersh

De-duplicate overlapping records based on type and price

I've got a series of data where there are duplicate (overlapping records), but they have a different Price Type and Price.

I'm trying to count the number of records by price type based on the lowest price over the sale weeks, however I do not want to double count records with a different type.  Below is an example of my data (I've highlighted the records that should be counted in bold font).

Sale WeekItemPrice TypePrice
Week 1ChairSale Price\$8
Week 1ChairOnline Price\$5
Week 1TableFull Price\$20
Week 1TableSale Price\$15
Week 2ChairFull Price\$10
Week 2ChairSale Price\$8
Week 2TableSale Price\$20
Week 3ChairFull Price\$10
Week 3TableSale Price\$15
Week 3TableOnline Price\$10
Week 4ChairFull Price\$10
Week 4ChairOnline Price\$5

The results I'm trying to see would be:

ItemCount Online Price
Count Sale PriceCount Full Price
Chair211
Table120

I can find the item with the lowest price in any given week with a FIXED function e.g. "{ FIXED [Sale Week], [Item] : MIN([Price] }" but cannot work out how to determine the Price type for the that specific record and then count them.

I've also looked at trying to filter out the overlapping records first, based on the criteria, but again cannot seem to get the logic to work.

Any help most welcome.

• 1. Re: De-duplicate overlapping records based on type and price

Hi Charlie,

I think you were in the right direction, just one more of calculated field for getting the Count, and you would have cracked it. Here are the calculations I created:

Lowest Price

{FIXED [Item], [Sale Week]: MIN(Price)}

Count of Products Sold

IF {FIXED [Sale Week], [Item], [Price Type]: MIN([Price])} = [Lowest Price]

THEN 1

ELSE 0

END

Those two together, got me the result you were looking for.

Can you check if this helps please?

Cheers

Siraj

2 of 2 people found this helpful
• 2. Re: De-duplicate overlapping records based on type and price

Here's how I'd go about this, it's a little different from what @Siraj did and checks for a few of potential problems with LOD expressions.

1) Create a Lowest Price calc that is {FIXED [Item], [Sale Week] : MIN([Price])}. This is the same as what Siraj did.

2) Create a Count of Products calc with the formula:

IF ROUND([Price],2) = ROUND([Lowest Price],2) THEN

1

ELSE

0

END

This is different from Siraj's calc in two ways:

a) Given this the data set {FIXED [Sale Week], [Item], [Price Type]...} is returning a record-level result, so that LOD is actually not needed and we can just use the record-level price. This will make the calc faster.

b) ROUND() is used to guarantee that we are comparing the same values. Tableau does not guarantee that an LOD on a floating point number (such as a currency amount with decimals) will be the exact same value as a record-level or regular aggregate result, and I've verified that this is the case in real world data.

There's one more possible issue that can arise and that's if there are ties where multiple Price Types all have the same (lowest) price value. In that case the above calculation would double-count products for the price type.

If you don't want that to happen then we need to somehow resolve the tie in favor of a given Price Type. I set up an example Count of Product (ties) calc that embeds another FIXED LOD to get the first Price Type alphanumerically and only use that for the count:

IF {FIXED [Item], [Sale Week] :

MIN(IF ROUND([Price],2) = ROUND([Lowest Price],2) THEN [Price Type] END)}

= [Price Type] THEN 1 ELSE 0 END

Here it is in a view:

I've attached a v9.3 workbook.

Jonathan

PS: @Charlie, when posting sample data to the forums please attach CSV or Excel data, that makes it easier for us to work with your data, thanks!

2 of 2 people found this helpful
• 3. Re: De-duplicate overlapping records based on type and price

You nailed it! Jonathan Drummey! I definitely overlooked that the second query was bringing a record level dataset anyway.

• 4. Re: De-duplicate overlapping records based on type and price

Thanks to both of you for your quick answers, this is a big help.

I've followed Jonathan's suggestion, as my data includes data with ties where multiple Price Types all have the same lowest price.  I also have records with the same price type and same lowest price, so I added calc so they aren't double counted.

@Jonathan thanks also for pointing out the LOD rounding issue, I'll bear this in mind in future.

Cheers,
Charlie.