1 Reply Latest reply on May 2, 2017 9:59 AM by Matthew Risley

    I would like to create a comparison visualization in order to flag entries with potential data accuracy errors (spelling, incorrect address, etc)

    Anita Miller Sackman

      I need assistance in creating a report where I can compare addresses and spelling side by side. My sample data set includes misspelled addresses and incorrect zip codes or states. I would like to easily identify the files that I should review for quality assurance. I have attached the packaged workbook.

      Please let me know if you need any other details. Does anyone know how to go about creating a Tableau Viz that would visually point out potential data accuracy errors? Many Thanks, Anita

        • 1. Re: I would like to create a comparison visualization in order to flag entries with potential data accuracy errors (spelling, incorrect address, etc)
          Matthew Risley

          Anita,

           

          I'm not saying this is impossible; it probably can be done! But I'm going to make a bold statement and say that Tableau is not good at this type of analysis... yet (maybe Project Maestro from Tableau Software has something for it).

           

          Personally, I would create a process outside of Tableau in an ETL tool or programming language (Python is really good at this task and is my favorite). Inside that process I would create a "Reason Column" that is brief and concise. Then you can use that new column, along with the record, and visualize it that way.

           

          Below gets into code. It's outside of Tableau. Don't let it intimidate you if you are not a programmer/coder. It's just to help you attack the problem from another angle.

           

          I've done this with the following basic scenario: Are all records in a certain column numbers only?

           

          Python code:

           

          Below is what is called a LIST in Python: simply a way to store data

          x = [9,8,7,6,5,4,3,2,1, 'Not a Number']

           

          Below is what is called a loop. It checks every element (each "thing separated by a comma") in our LIST (called x) to see whether it's a digit. If it's not, it prints "Record Failed" followed by the record itself.

           

          for stuff in x:

              # str() converts the value to a string; isdigit() checks whether every character is a digit.

              if not str(stuff).isdigit():

                  print('Record Failed -', stuff)

           

          the output will be:

          "Record Failed - Not a Number"
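
The same idea can be stretched to the "Reason Column" mentioned earlier. This is only a rough sketch, not a finished solution: the sample records, the field names (id, state, zip), the function name find_reason and both validation rules are assumptions made up for illustration, so swap in whatever rules fit your data.

```python
import re

# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {'id': 1, 'state': 'TX', 'zip': '75001'},
    {'id': 2, 'state': 'Texas', 'zip': '7500'},
    {'id': 3, 'state': 'CA', 'zip': '94105-1234'},
]

# Assumed rule: a US zip code is 5 digits, optionally followed by a dash and 4 digits.
ZIP_PATTERN = re.compile(r'\d{5}(-\d{4})?')


def find_reason(record):
    """Return a short reason string, or '' if the record passes every check."""
    reasons = []
    if not ZIP_PATTERN.fullmatch(record['zip']):
        reasons.append('Bad zip')
    # Assumed rule: state should be a two-letter uppercase abbreviation.
    state = record['state']
    if not (len(state) == 2 and state.isalpha() and state.isupper()):
        reasons.append('Bad state')
    return '; '.join(reasons)


# Add the new "reason" column to every record and show it next to the record id.
for record in records:
    record['reason'] = find_reason(record)
    print(record['id'], record['reason'])
```

Once every record carries a reason, you could write the records back out (to a CSV file, for example) and use the reason field in Tableau as a filter or color to highlight the rows that need review.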

           

          I hope this helps you understand how to tackle this outside of Tableau.