9 Replies Latest reply on Mar 26, 2012 10:12 PM by Kelly Martin

    Extract getting slower on each refresh?

    Kelly Martin

      Hi there,


      I am currently using Desktop (no Tableau Server), running against a SQL view, with a simple where statement:

      WHERE work.WORK_DT between '2011-11-01' and '2012-02-14'.



      Running this script in sql only takes about 4 minutes and returns approximately 65 columns and 85k rows each week.  Initially, the extract only took approximately 10-15 minutes to run in Tableau.  It seems to be taking longer each week (up to 40 minutes now).  I don't have the option of incremental updates - need to refresh the entire 3 months + current month to date. 


      Is it possible that the addition of calculated fields and parameters increases the refresh time? 

      Does having other data sources in the workbook increase the refresh? 

      Should the date range not be part of the SQL - but part of the extract filter?


      Any help with optimizing this extract would be greatly appreciated.  Thanks!

        • 1. Re: Extract getting slower on each refresh?
          Logan Riggs

          Hi Kelly, I recommend sending your logs and workbook in to Support.  They should be able to find out where Tableau is spending it's time. 


          If you are feeling adventurous you could look into log.txt, tabprotosrv.txt and tdeserver.txt and use the timestamps in the files to search for any large gaps.  Support has tools to do this automatically and they could help decipher the log statements to see what Tableau is doing.


          You might also try creating a new workbook with the same connection/SQL and see how long that extract takes.


          How many calculated fields do you have?  If you had hundreds it might affect the extract speed, otherwise it shouldn't be a problem.  If you are using complicated if/then or case statements with string columns, for example to do formatting, these may take a long time to optimize when doing the extract.


          I think that using the where clause in your custom SQL is the right way to go.  I wouldn't expect a filter on the extract to be significantly faster or slower.


          The extract is done per datasource so I don't think that should affect things.

          1 of 1 people found this helpful
          • 2. Re: Extract getting slower on each refresh?
            Kelly Martin

            Thank you so much Logan.  I don't have hundreds of calculated fields; maybe 20 and same for parameters.  I'll try creating the new workbook and if there's still an issue, I'll contact support.  I appreciate your feedback as I hate to go to support first when the problem is likely my lack of expertise.



            • 3. Re: Extract getting slower on each refresh?
              Russell Christopher

              Hey Kelly --


              Where possible, Tableau actually pre-calculates and "saves" the results of your new calculations inside the extract (we call this "optimization") . It sometimes is more efficient to have those values pre-calculated and ready to consume vs. "thinking" in real time as a report is being rendered.


              Ever noticed the "Optimize" option in the extract menu? "Optimize" means "Tableau, please go ahead and calculate the results for all my calculations and save them in the extract just like the 'real' fields from my data source".


              If I had to guess, as you add new Calculated Expressions to your workbook (and extract), we're dutifully adding columns of results to the extract itself....so they're getting bigger. Optimization does take extra time, but it's hard to say whether it's completely responsible for your longer extract time. It'd be interesting to test your extract time by removing 30% / 60% / 100% of your calculated expressions and see how perf improves.


              Let us know what happens!

              1 of 1 people found this helpful
              • 4. Re: Extract getting slower on each refresh?
                Kelly Martin

                Thanks for your help Russell - there we a number of unused calculated fields that once removed did improve things considerably.   I have been using the Optimize option regularly as it seems I am constantly making new calculations (investigating a brand new data source).   I did contact the Tableau support folks and they gave me a number of recommendations as well.  I removed a lot of unused calculations, a set, a bunch of unused action sets, and am down to 20 minutes which I think is realistic given the strength of our server.


                However...  every time I change the WHERE statement in my sql and select ok, the query runs.  Then when I refresh extract it runs again.  Is it possible that I could put in a filter for the extract other than in my custom sql?


                Again, thanks so much for your help!

                • 5. Re: Extract getting slower on each refresh?
                  Russell Christopher

                  Sure. If you're confident users will only consume data via the (already filtered) extract, then your approach makes sense if you don't like your expensive query getting fired multiple times...You'll just need to be aware that if someone downloads your workbook and turns off the extract, they'll be executing a monster (unfiltered) query against SQL.

                  • 6. Re: Extract getting slower on each refresh?
                    Kelly Martin

                    Grrrr.  That won't work at all and I'm sure I'd be run out of town by pitch fork carrying IT folk.  I guess this is as good as it gets until we purchase Server and start managing our data sources and extracts more efficiently.  Thanks again for all your help on this.

                    • 7. Re: Extract getting slower on each refresh?
                      Russell Christopher

                      Buy Server! Daddy needs a new pair of shoes!


                      I think keeping your WHERE clause on the Custom SQL is probably your best bet for the time being - it'll cost you a bit more time when you create the exact, but it'll be safer in terms of not inadvertently issuing a "killer query".


                      Now one thing that you COULD do is this:


                      1. Create your filtered extract using the technique you mentioned earlier.
                      2. Save a copy of the workbook - the copy will be the version your users utilize. I'll call this workbook "Users" from now on.
                      3. In a different instance of Tableau, create a new data source using the extract you created in step 1. Make sure you give the data source the exact same name as the data source that exists the "Users" workbook.
                      4. Open the "Users" workbook then drag-and-drop the data source you created in step 3 into it.
                      5. Tableau will detect that the new data source has the same name as the existing data source and ask if you want to replace it.
                      6. Replace it!


                      Your "Users" workbook now ONLY can use the extract to do it's work. Totally safe and disconnected from your data server. The drawback, of course, is that YOU will need to repeat steps 1-6 every time you want to update data for your users.

                      • 8. Re: Extract getting slower on each refresh?
                        Kelly Martin

                        Thanks again for your suggestions Russell. The powers that be have said we will be getting Server in the Spring -ish.  I sure hope so, because with every new report comes the request for more and more. 


                        I should mention that this Interworks analyzer has proven to be a great help in directing me to the views that have been causing problems with calculated fields/parameters.  https://www.interworks.com/services/business-intelligence/tableau-performance-analyzer



                        • 9. Re: Extract getting slower on each refresh?
                          Kelly Martin

                          Hey Russell!  I came across one of your posts today that solved my extract slowness problem!  http://tableaulove.tumblr.com/post/18945358848/how-to-publish-an-unpopulated-tableau-extract


                          I use two parameters - start and end date and then a calculated field that keeps only data between that range.


                          Now the extract is filtered based on the calculated field. 



                          My refresh time has reduced considerably!  Thank you for all your help.