2 Replies Latest reply on Jun 26, 2018 4:34 AM by Timo Karjalainen

    CSV import format documentation

    Timo Karjalainen



      I'm looking for documentation on how exactly Tableau parses CSV import files. Is a technical reference for this available somewhere?


      We have a client who uses Tableau for data analysis and I need to produce CSV files for them to import into Tableau. It works "somewhat" but on some rows, Tableau gets confused about columns and the result is garbage in some columns, with the rest of the columns on that row empty. Analysis results will obviously not be correct.


      The challenge is related to the free-text fields that we have in the data. People will input "quotation marks", 'single quotes', semicolons, pipes ('|'), commas, tabs, and anything else imaginable, and even some things that aren't.


      So I'm looking for a specification on how to escape/encapsulate the data so that Tableau will import it as intended.



        • 1. Re: CSV import format documentation

          Hello Timo,


          Have you tried the Data Interpreter option?


          This will help you review the imported data.



          • 2. Re: CSV import format documentation
            Timo Karjalainen



            Thanks for your response, and sorry it took me a while to return to this; I was distracted elsewhere.


            It seems the Data Interpreter does not help here: it says it is not available when there are more than 3000 rows in the data, and my CSV dump is currently at 87500 rows. (Also, I'm a bit baffled by the idea that the Data Interpreter chooses to be unavailable if Tableau *itself* thinks it doesn't need help with the data...)


            But back to the original question. How do I format the data to make sure that Tableau sees the column breaks as intended? Can I encapsulate non-numeric (i.e. text) columns in "quotes" for example, and maybe backslash-escape the quote character in the data? Do I then need to protect other characters? It would be very useful to find actual technical documentation on this, not Excel screenshots.
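
            To make the question concrete, here is a minimal sketch of the quoting scheme I have in mind, using Python's standard `csv` module. It follows the RFC 4180 convention: text fields are wrapped in double quotes, and a quote character inside a field is doubled rather than backslash-escaped. Whether Tableau's parser follows this convention is an assumption that still needs to be verified against a test import; the function name `to_csv` and the sample data are purely illustrative.

```python
import csv
import io


def to_csv(header, rows):
    """Serialize rows to CSV text with RFC 4180 style quoting.

    Assumption (not confirmed for Tableau): the consumer accepts
    comma-separated fields, double-quoted text, and doubled quotes ("")
    for a literal quote character inside a field.
    """
    buf = io.StringIO(newline="")
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar='"',
        quoting=csv.QUOTE_NONNUMERIC,  # quote every non-numeric field
        doublequote=True,              # write " inside a field as ""
        lineterminator="\r\n",
    )
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def from_csv(text):
    """Parse the text back with the same conventions (all values as strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    sample = [[1, 'He said "hi"; a|b, c\td'], [2, "two\nlines"]]
    text = to_csv(["id", "comment"], sample)
    print(text)
    print(from_csv(text))
```

            With this scheme, commas, semicolons, pipes, tabs, and even line breaks need no special protection as long as they sit inside a quoted field; only the quote character itself is escaped. Reading the file back with the same settings is a cheap sanity check before handing it to the client.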


            Alternatively, we could switch the data dump format to JSON: it has an exact specification of how tokens are interpreted, while CSV is whatever anyone calls CSV.
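
            For comparison, a rough sketch of what the JSON variant of the dump could look like, again using only Python's standard library. The record layout (a flat array of objects) and the helper name `to_json` are assumptions for illustration; what structure Tableau expects from a JSON source would have to be checked separately.

```python
import json


def to_json(header, rows):
    """Serialize rows as a JSON array of objects keyed by column name.

    The JSON encoder handles quotes, backslashes, tabs, and newlines in
    free-text values, so no hand-written escaping rules are needed.
    """
    records = [dict(zip(header, row)) for row in rows]
    return json.dumps(records, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    sample = [[1, 'He said "hi"; a|b, c\td'], [2, "two\nlines"]]
    print(to_json(["id", "comment"], sample))
```

            The trade-off is file size: the column names are repeated in every record, which adds up over tens of thousands of rows.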