Sorry to hear you are running into this issue. Would you be able to send me the data for just the field you are grouping, or screenshots, either by email or by filing a support case, so I can try to reproduce the issue?
This sounds like a bug and we'd love to be able to investigate.
Thanks,
Senior Product Manager, Tableau Prep
Thanks for getting back to me. I might have accidentally stumbled on what the problem is whilst preparing the data you asked for. If I’m correct then it would definitely seem like a bug.
It looks like the issue is related to the use of XLSX files. When I did a quick export of the “supplier” field as a CSV so I could try to recreate the problem for you, I found that the issue went away. I tested this further in my original flow and found that using CSVs as the source files instead of XLSX seems to be alright (although I haven’t finished working through it yet).
I’ve attached a flow that directly compares cleaning the field using both a CSV and an XLSX as the source file. For the CSV I was able to use the group and replace function (manual) across the whole field (83 replacements) without affecting the number of rows. However, doing only a fraction of the grouping on the XLSX has increased the row count from 19,831 to 19,887 – that’s 56 rows added by replacing 21 values with 16 values in group and replace.
I’ve attached the test flow and the field data as both a CSV and XLSX so you can test this further. Please note that my flow contains a number of joins that are purely there to check the exact number of rows of data in the field before and after the function was applied. Not sure if there is an easier way to check the row totals!
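One possibly easier way to check the row totals, outside the flow, is to export the field before and after the group and replace step and count the rows with a short script. The following is only a minimal Python sketch, assuming both exports are CSVs with a header row. The function names and the sample data are made up for illustration and are not part of Tableau Prep.

```python
import csv
import io
from collections import Counter


def value_counts(lines, field):
    """Return (row_count, Counter of values) for one column of CSV input."""
    reader = csv.DictReader(lines)
    counts = Counter(row[field] for row in reader)
    return sum(counts.values()), counts


def compare_exports(before_lines, after_lines, field):
    """Return (change in row count, per-value counts after the step)."""
    before_total, _ = value_counts(before_lines, field)
    after_total, after_counts = value_counts(after_lines, field)
    return after_total - before_total, after_counts


# Made-up sample data standing in for the two exports (hypothetical values).
before = io.StringIO("supplier\nAcme\nAcme Ltd\nBolt\n")
after = io.StringIO("supplier\nAcme\nAcme\nBolt\n")
delta, counts = compare_exports(before, after, "supplier")
print("rows added:", delta)
print(counts.most_common())
```

For real files, pass open file objects instead of the in-memory samples, for example `open("before.csv", newline="")` (the file name is a placeholder). A non-zero difference would mean the step changed the number of rows, and the per-value counts show which grouped values gained them.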
Incidentally, I've found that this issue has now gone away so I guess it was a bug in an older version.
I am still experiencing this issue even though I am using the most current version of Tableau Prep. Is this issue still under investigation? My workaround is to remove duplicates via Excel after my output is created, but that is obviously not ideal.
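For anyone using the same workaround, the de-duplication could be scripted rather than done by hand in Excel. This is a rough Python sketch, assuming the output is written as a CSV and that the extra rows are exact duplicates of existing ones. The function name and sample data are hypothetical, and it would also remove any legitimately repeated rows, so it is not a fix for the underlying issue.

```python
import csv
import io


def drop_duplicate_rows(rows):
    """Return rows with exact duplicates removed, keeping first occurrences in order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # lists are not hashable, so compare rows as tuples
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up sample standing in for a Prep output exported as CSV.
sample = io.StringIO("supplier,amount\nAcme,10\nAcme,10\nBolt,5\n")
header, *data = list(csv.reader(sample))
cleaned = drop_duplicate_rows(data)
print(header, cleaned)
```

Comparing `len(data)` with `len(cleaned)` gives the number of rows removed, which can be checked against the row count of the source file.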