1 Reply Latest reply on Aug 25, 2017 7:17 AM by John Croft

    How to clean large data (excel) in a proper way?

    Chris Giga

      If you look at the sample data sets from the FBI, provided as Excel files:

       

      https://ucr.fbi.gov/crime-in-the-u.s/2013/crime-in-the-u.s.-2013/tables/1tabledatadecoverviewpdf/table_1_crime_in_the_un… 

       

      It looks very badly structured. In this case you could easily clean it up manually. But how do you deal with large data?

       

      Can Tableau automatically detect such plain-text sentences in the Excel file? Or split the tables if there are many tables stacked like in this case?

       

       

        • 1. Re: How to clean large data (excel) in a proper way?
          John Croft

          For the example provided, if you connect to the Excel file as a data source and check the Data Interpreter option ('Use Data Interpreter' in the Data Source pane), you get pretty good options.

           

          1) One large table that it tried to create. This is not exactly what you want, because the notes and sentences between the two tables end up in it.

          2) Two separate sub-tables that match the two sub-tables in the file exactly, with no notes or sentences that I saw. It looks like you can union these together if needed.

           

           

          I think handling this in Tableau for 'larger data sets' depends on more than just whether Tableau can identify sub-tables nicely stacked in Excel. If the Data Interpreter did not work, my first attempt would be a Python script that parses what you need into separate files, or separate tabs in the same file, and then to load that file into Tableau.
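
          As a rough illustration of that kind of script, here is a minimal sketch. It rests on an assumption that will not hold for every file: a "table row" has at least two filled cells, while titles, notes and blank lines have at most one, so they act as separators between the stacked tables. The function names (split_stacked_tables, split_workbook) and the sheet names (table_1, table_2, ...) are made up for this example. The pandas calls also need an Excel engine such as openpyxl installed.

```python
def is_blank(value):
    """True for None, empty/whitespace strings, and NaN (what pandas yields for empty cells)."""
    if value is None:
        return True
    if isinstance(value, float) and value != value:  # NaN is the only float not equal to itself
        return True
    return isinstance(value, str) and not value.strip()


def split_stacked_tables(rows, min_cells=2):
    """Group consecutive rows with >= min_cells filled cells into tables.

    Assumption: rows with fewer filled cells (blank lines, titles, notes)
    separate one stacked table from the next and are dropped.
    """
    tables, current = [], []
    for row in rows:
        filled = sum(not is_blank(v) for v in row)
        if filled >= min_cells:
            current.append(list(row))
        elif current:
            tables.append(current)
            current = []
    if current:
        tables.append(current)
    return tables


def split_workbook(src_path, dst_path):
    """Read the first sheet of src_path and write each detected table to its own tab."""
    import pandas as pd  # imported here so the splitting logic above has no dependencies

    raw = pd.read_excel(src_path, header=None)  # header=None keeps every row as data
    tables = split_stacked_tables(raw.values.tolist())
    if not tables:
        return 0  # nothing to write; avoids creating a workbook with no sheets

    with pd.ExcelWriter(dst_path) as writer:
        for i, table in enumerate(tables, start=1):
            # Assumption: the first row of each block is its header row.
            header = [
                str(h) if not is_blank(h) else f"col_{j}"
                for j, h in enumerate(table[0])
            ]
            frame = pd.DataFrame(table[1:], columns=header)
            frame.to_excel(writer, sheet_name=f"table_{i}", index=False)
    return len(tables)
```

          You would call something like split_workbook("input.xls", "split.xlsx") (both file names are placeholders) and then connect Tableau to the output file. Real files with multi-row headers or sparse rows would need a smarter rule than the two-filled-cells heuristic.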