8 Replies Latest reply on Jan 22, 2016 7:16 AM by Kent Sloan

    Identifying Duplicate Entries

    Natalie Vu

      Hello,

       

      I am having trouble identifying duplicate records in my data. The duplicates are for a combination of two fields.

       

      For example, I have a date (dimension) and an amount (measure). If there are two or more similar entries, I would like to identify them. What would be the best way to create the data needed?

        • 1. Re: Identifying Duplicate Entries
          Rajeev Pandey

          Dear Natalie ,

           

          Please find the below approach and let me know your thoughts on this.I have taken a sample data for this scenario.

          You can drag this field to filter shelf and select Duplicate.

           

           

          You can also use Lookup function for solving this issue.

          if it solves the issue , Kindly mark the answer as correct answer

          • 2. Re: Identifying Duplicate Entries
            Kent Sloan

            Hi Natalie,

             

            Rajeev's solution will work if you are looking for duplicates between columns. If you are looking for duplicate records such as the same date and amount on more than one row the solution below using ID and Date should work.

             

            1. Combine the 2 fields using the string function STR() to wrap around the date and measure.

            2. Add the combined field and to the rows shelf followed by the original fields with the date set to MDY.

             

            3. Add number of records to the columns shelf, set marks type to bar, display the labels, and sort descending. The duplicate records will be displayed at the top.

             

             

            Thanks,

            Kent

            InterWorks

            • 3. Re: Identifying Duplicate Entries
              Natalie Vu

              Thank you for the help. I am actually trying to identify the duplicates in

              various rows. Please see the example below.

               

              I would like to identify duplicates in date and amount. In the data below,

              it should only pull the line item for 10/7/2015 and 173.24 (since there are

              two (multiple entries).

               

               

               

              On Thu, Jan 21, 2016 at 10:13 AM, Rajeev Pandey <

              • 4. Re: Identifying Duplicate Entries
                Kent Sloan

                Hi Natalie,

                 

                Replace ID with Amount in USD in my example above and you will find the duplicates. You could also add Sum of Number of records to the filter and set it to at least 2 to show only the duplicate records.

                 

                Thanks,

                Kent

                • 5. Re: Identifying Duplicate Entries
                  Rajeev Pandey

                  Kent Sloan:i could not able to understand the purpose of your Ist step.Would u please explain why you opted this String approach . 

                  i am not able to understand how it is showing the duplicate value when there is no duplicate row is there

                  • 6. Re: Identifying Duplicate Entries
                    Kent Sloan

                    Hi Rajeev,

                     

                    So in my example if I only wanted to know if there is more than one record for a given ID I could have used just ID. But what if in my data set I have records over time, I might expect multiple records for an ID just not more than one record per day. By concatenating these values I can create a view with a single row for each combination. By sorting this by number of records the duplicates move to the top.

                     

                    In this image showing Date by ID we have no idea if there are 1 or 10 records for a given ID on a day.

                     

                     

                    I can just add in number of records to the view but I have to scroll through the list to find the duplicates. By creating a field that combines these fields I get unique rows that I can sort.

                     

                    Thanks,

                    Kent

                    • 7. Re: Identifying Duplicate Entries
                      Rajeev Pandey

                      That make sense Kent Sloan. Thanks for the solution.

                      The only thing which is not clear how we will get the duplicate records across multiple columns 

                      As per your example , user will able to see the duplicate number of Records which is absolutely fine but what i m interested in "how to find the duplicate values across multiple columns. Would it be great if you throw some light on this

                       

                      • 8. Re: Identifying Duplicate Entries
                        Kent Sloan

                        Hi Raveej,

                         

                        Duplicate rows is an issue I see on a regular basis. I have never had to deal with duplicate cases across columns. If there are duplicate records across multiple columns why would you not just use one column and ignore the other? If this was an issue that I faced I would look at restructuring the data to combine the two. If there was a case with duplicates across the columns in a row you could use an IF statement or T/F to check for it.

                         

                        Thanks,

                        Kent