-
1. Re: How to remove duplicate records from extracted data
Hari Ankem Aug 13, 2019 2:28 PM (in response to jesh d)
How about providing the sample data as an Excel file? Then let us know which columns you are checking to decide whether a record is a duplicate.
-
2. Re: How to remove duplicate records from extracted data
Chris McClellan Aug 13, 2019 3:22 PM (in response to jesh d)
What is the real source of the data? It might be a lot simpler and easier to fix there.
Do you use SQL, Alteryx or any other tool to manipulate the data before you access it in Tableau?
-
3. Re: How to remove duplicate records from extracted data
Don Wise Aug 13, 2019 3:26 PM (in response to jesh d)
Hello Jesh,
Generally you can accomplish this with a FIXED LOD expression, or you could use Tableau Prep Builder to ETL your data before bringing it into Tableau Desktop/Creator: https://kb.tableau.com/articles/howto/removing-duplicate-data-in-tableau-prep
The following Tableau Desktop KB article addresses the LOD approach: https://kb.tableau.com/articles/howto/removing-duplicate-data-with-lod-calculations
So you could use something like:
{FIXED [Unique Dimension]: MIN([Measure])}
to suppress the duplicate rows.
Best, Don
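For readers who want to see what the FIXED expression above computes, here is a minimal Python sketch of the same per-group idea: one MIN value per distinct value of the dimension. It is only an illustration outside Tableau, not how Tableau evaluates LOD expressions internally. The rows, the "Amount" column and the helper name fixed_min are assumptions made up for the example.

```python
# Hypothetical rows: Record ID is unique per row, the other fields repeat.
rows = [
    {"Record ID": 1, "Employee Number": "E01", "Amount": 100},
    {"Record ID": 2, "Employee Number": "E01", "Amount": 100},
    {"Record ID": 3, "Employee Number": "E02", "Amount": 250},
]


def fixed_min(rows, dimension, measure):
    """Return {dimension value: smallest measure value}, one entry per group.

    Rough analogue of {FIXED [dimension]: MIN([measure])}.
    """
    result = {}
    for row in rows:
        key = row[dimension]
        value = row[measure]
        if key not in result or value < result[key]:
            result[key] = value
    return result


per_employee = fixed_min(rows, "Employee Number", "Amount")
print(per_employee)
```

Because the two E01 rows carry the same amount, the group collapses to a single value instead of being counted twice.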
-
4. Re: How to remove duplicate records from extracted data
jesh d Aug 14, 2019 6:24 AM (in response to Hari Ankem)
I have shared the sample data. The real data has the same structure as the sample. For duplicate records, only the Record ID differs; the rest of the column values are the same. I cannot share the actual data. Please advise.
-
5. Re: How to remove duplicate records from extracted data
jesh d Aug 14, 2019 6:26 AM (in response to Chris McClellan)
I have 13 different CSV files which I union in Tableau. I have MS Access but I didn't use it. I can remove the duplicates by opening each file in Excel, but I think there might be an easier way to do this in Tableau.
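If fixing the data before it reaches Tableau is an option, the union-then-deduplicate step described above can be sketched in plain Python. This is a hedged illustration, not a Tableau feature: the function name union_and_dedupe, the "Amount" column and the in-memory sample files are assumptions, and it treats two rows as duplicates when every column except Record ID matches.

```python
import csv
import io


def union_and_dedupe(sources, id_column="Record ID"):
    """Union CSV sources and keep the first row for each distinct
    combination of all columns other than id_column."""
    seen = set()
    unique_rows = []
    for source in sources:
        for row in csv.DictReader(source):
            # Build the comparison key from every column except the ID.
            key = tuple(sorted((k, v) for k, v in row.items() if k != id_column))
            if key not in seen:
                seen.add(key)
                unique_rows.append(row)
    return unique_rows


# Two small in-memory stand-ins for the 13 CSV files (made-up data).
file_a = io.StringIO("Record ID,Employee Number,Amount\n1,E01,100\n2,E01,100\n")
file_b = io.StringIO("Record ID,Employee Number,Amount\n3,E01,100\n4,E02,250\n")

deduped = union_and_dedupe([file_a, file_b])
print([row["Record ID"] for row in deduped])
```

With real files, each source would be opened with open(path, newline="") and the result written back out with csv.DictWriter before connecting Tableau to the cleaned file.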
-
6. Re: How to remove duplicate records from extracted data
jesh d Aug 14, 2019 6:39 AM (in response to Don Wise)
Hi Don Wise, regarding the expression below:
{FIXED [Dimension 1], [Dimension 2], [Dimension 3]: MIN([Measure])}
In my case, should I list all the dimension/measure names from the sample data? I am not sure what to put in "MIN([?????])". Is it the Record ID?
-
7. Re: How to remove duplicate records from extracted data
Don Wise Aug 14, 2019 6:58 AM (in response to jesh d)
Hi Jesh,
Yes, I'd use as unique an identifier as possible. But no, not Record ID; that's what's causing the duplication. Each Record ID is different for each row; too unique. So, a MIN on Employee Number or Product Number. Hopefully there aren't any impacts from related data involving dates (or amounts or other distinguishing values), and what you're presenting as 'sample' doesn't have any additional columns/fields with unique values.
There's a reason you're getting a unique Record ID for each row, so if I were you, the first thing I'd ask the producer of the data is why. I'd also ask whether there are additional data elements that you're not aware of which may play into your de-duplication efforts. For example, if there are dates involved and each date is different, then you have fine-grained transactional data, and that will mean a different level of review before implementing a MIN on any field.
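A MIN is essentially saying to Tableau, 'Give me the smallest value of this field within each group', which on sorted data behaves like 'Give me the FIRST row record'. A MAX is the opposite of that: 'Give me the largest value', i.e. the LAST row record found.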
Remember, you're not changing anything in the underlying data with Tableau, just reading the data, so a bit of experimentation isn't going to hurt so long as you're checking your outcomes for each step or change that you make.
Best, Don
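One way to do the outcome checking suggested above is to test, before applying a MIN, whether the rows that share a key really are identical apart from Record ID. The Python sketch below is a rough illustration under assumed column names ("Date", "Amount") and a made-up helper, inspect_groups; it separates keys whose rows are true duplicates from keys whose rows differ in some other field, such as a date.

```python
from collections import defaultdict


def inspect_groups(rows, key_column, id_column="Record ID"):
    """Return (safe, risky) lists of key values.

    safe:  keys with several rows that are identical apart from id_column.
    risky: keys whose rows differ in at least one other column.
    """
    variants = defaultdict(set)
    counts = defaultdict(int)
    for row in rows:
        rest = tuple(sorted((k, str(v)) for k, v in row.items()
                            if k not in (key_column, id_column)))
        variants[row[key_column]].add(rest)
        counts[row[key_column]] += 1
    safe = sorted(k for k in variants if counts[k] > 1 and len(variants[k]) == 1)
    risky = sorted(k for k in variants if len(variants[k]) > 1)
    return safe, risky


# Made-up rows: E01 is a true duplicate, E02 has two different dates.
rows = [
    {"Record ID": 1, "Employee Number": "E01", "Date": "2019-08-01", "Amount": 100},
    {"Record ID": 2, "Employee Number": "E01", "Date": "2019-08-01", "Amount": 100},
    {"Record ID": 3, "Employee Number": "E02", "Date": "2019-08-01", "Amount": 250},
    {"Record ID": 4, "Employee Number": "E02", "Date": "2019-08-02", "Amount": 250},
]

safe, risky = inspect_groups(rows, "Employee Number")
print(safe, risky)
```

Keys reported as risky are the ones where collapsing to a single row would discard real transactional detail, so they deserve a closer look first.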
-
8. Re: How to remove duplicate records from extracted data
jesh d Aug 14, 2019 8:18 AM (in response to Don Wise)
Thanks. I had already asked the makers of the data these questions and confirmed that these records are duplicates.
I am able to extract the duplicate records by using the FIXED LOD expression. I will also explore Tableau Prep to see if I can remove all the duplicates at the source. Thanks for helping.
-
9. Re: How to remove duplicate records from extracted data
Don Wise Aug 14, 2019 8:19 AM (in response to jesh d)
Great. If there's an answer that resolved your issue, please mark that response as correct to close the thread, so that others will find it useful in the future. Best, Don