5 Replies Latest reply on Aug 23, 2013 10:36 AM by Allan Walker

    What is the best way to handle large Excel files?

    Saurabh Bharadwaj

      Are there any best practices for managing large files? I have a 200 MB file that I work with. It's pretty straightforward, no joins. Is it best to import it into a MySQL database for improved performance on Desktop and on the server?

        • 1. Re: What is the best way to handle large Excel files?
          Matt Lutton

          I will preface this by saying I don't really know the answer to your question--but, I would think creating an Extract file in Tableau would be all that is necessary. If you have Tableau Server, you can then set up regular or incremental refreshes.


          Here is some info you might find helpful: Extracting Large Text and Excel Files


          I would also make note of the Excel data reshaping tool, as it could come in handy: Installing the Tableau Add-In for Reshaping Data in Excel | Tableau Software

          • 2. Re: What is the best way to handle large Excel files?
            Saurabh Bharadwaj

             That is what I am doing now. However, it takes some time to load the file into an extract every time you make a change. I'd like to know if anyone has experience working with an extract from Excel vs. a direct connection to a SQL database.

            • 3. Re: What is the best way to handle large Excel files?
              Pedro Machado

              I haven't done any "scientific" benchmarks, but my experience is that if you have a lot of data and a good database server, you are better off connecting directly to the database.


              Remember that Tableau sends SQL to the database server; the server does most of the computation (grouping, sorting, calculations) and sends only the summarized data back to Tableau for display. I find that the time it takes to send all the detail data is usually a lot longer than the time the database server takes to calculate the result set.
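              To make that concrete, here's a small sketch of the idea (using SQLite in Python purely for illustration; Tableau would issue similar SQL against your own database server, and the table and column names here are made up):

```python
import sqlite3

# In-memory database standing in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("East", 250.0), ("West", 75.0), ("West", 125.0)],
)

# A live connection sends an aggregate query like this one. The grouping
# and summing happen on the server, and only the summarized rows (one per
# region) travel back to the client -- not the full detail table.
summary = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(summary)  # [('East', 350.0), ('West', 200.0)]
```

              With millions of detail rows, the difference between shipping two summary rows and shipping the whole table is what makes the live connection fast.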


              The only exception I have found to this is when you are joining large tables. A good option is to create a table on the database server that contains the result of the join. Some database engines support "materialized views" that serve the same purpose and are easier to maintain.


              If you have limited (SELECT only) access to the server, then a Tableau extract is a good option.


              Pedro

              • 4. Re: What is the best way to handle large Excel files?
                Toby Erkson

                Excel is NOT A DATABASE!  Since Excel is NOT A DATABASE, don't treat it like one.  Yes, put your data into a real database.  The order of precedence is as follows (starting with the least desirable option):

                text file < Excel < Access < anything else

                One needs to remember that the JET connector used to get Excel and Access data is no longer that robust, at least for larger data sets.  One is MUCH better off connecting to a real database.

                • 5. Re: What is the best way to handle large Excel files?
                  Allan Walker

                  +1 Toby.


                  I've been getting very useful performance results using PostgreSQL 64-bit, and I use Navicat 64-bit for inserting tables; the table wizard is really fast and easy to use.  If you aren't a SQL wizard, the GUI is good for visually showing you the joins.


                  The other benefit, of course, is that PostgreSQL is what underpins Tableau Server, and it has a spatially aware extension, PostGIS.


                  Best Regards,


                  Allan