Hi Kelly, I recommend sending your logs and workbook in to Support. They should be able to find out where Tableau is spending its time.
If you are feeling adventurous you could look into log.txt, tabprotosrv.txt and tdeserver.txt and use the timestamps in the files to search for any large gaps. Support has tools to do this automatically and they could help decipher the log statements to see what Tableau is doing.
You might also try creating a new workbook with the same connection/SQL and see how long that extract takes.
How many calculated fields do you have? If you had hundreds, it might affect the extract speed; otherwise it shouldn't be a problem. Complicated if/then or case statements on string columns (for example, to do formatting) may take a long time to optimize when the extract is created.
I think that using the where clause in your custom SQL is the right way to go. I wouldn't expect a filter on the extract to be significantly faster or slower.
The extract is done per datasource so I don't think that should affect things.
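For what it's worth, the difference between the two approaches can be sketched outside Tableau. This is just a toy Python/sqlite illustration (the table and column names are made up): filtering in the custom SQL means only matching rows come back from the source, while an extract-style filter pulls everything first and trims afterwards. Either way the final rows are the same; only where the work happens differs.

```python
import sqlite3

# Toy stand-in for the real data source (table/column names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("West", 250.0), ("East", 75.0)])

# Option 1: filter in the custom SQL -- only matching rows are transferred.
filtered_at_source = conn.execute(
    "SELECT region, amount FROM sales WHERE region = 'East'").fetchall()

# Option 2: pull everything, then filter afterwards (an extract-filter analogue).
all_rows = conn.execute("SELECT region, amount FROM sales").fetchall()
filtered_after = [row for row in all_rows if row[0] == "East"]

assert filtered_at_source == filtered_after  # same answer, different cost
print(len(all_rows), len(filtered_at_source))  # 3 rows pulled vs. 2
```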
Thank you so much Logan. I don't have hundreds of calculated fields; maybe 20 and same for parameters. I'll try creating the new workbook and if there's still an issue, I'll contact support. I appreciate your feedback as I hate to go to support first when the problem is likely my lack of expertise.
Hey Kelly --
Where possible, Tableau actually pre-calculates and "saves" the results of your new calculations inside the extract (we call this "optimization"). It is sometimes more efficient to have those values pre-calculated and ready to consume vs. "thinking" in real time as a report is being rendered.
Ever noticed the "Optimize" option in the extract menu? "Optimize" means "Tableau, please go ahead and calculate the results for all my calculations and save them in the extract just like the 'real' fields from my data source".
If I had to guess, as you add new Calculated Expressions to your workbook (and extract), we're dutifully adding columns of results to the extract itself....so they're getting bigger. Optimization does take extra time, but it's hard to say whether it's completely responsible for your longer extract time. It'd be interesting to test your extract time by removing 30% / 60% / 100% of your calculated expressions and see how perf improves.
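To make the "extra columns" point concrete, here's a toy sketch of what materialization looks like (nothing Tableau-specific, and the field names are invented): each calculated field becomes one more pre-computed column per row, so the work and size of the extract grow with the number of calculations.

```python
# Toy model: 'optimizing' an extract pre-computes each calculated field
# and stores the results as an extra column on every row.
rows = [{"sales": s} for s in range(1000)]

def materialize(rows, calcs):
    """Pre-compute every calculated field for every row."""
    out = []
    for row in rows:
        enriched = dict(row)
        for name, fn in calcs.items():
            enriched[name] = fn(row)  # one stored result per calc, per row
        out.append(enriched)
    return out

# 20 hypothetical calculated fields; real ones might be IF/CASE string logic.
calcs = {f"calc_{i}": (lambda r, i=i: r["sales"] * i) for i in range(20)}

extract = materialize(rows, calcs)
# Each row now carries 1 original column plus 20 materialized ones.
print(len(extract[0]))  # 21
```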
Let us know what happens!
Thanks for your help Russell - there were a number of unused calculated fields that, once removed, improved things considerably. I have been using the Optimize option regularly as it seems I am constantly making new calculations (investigating a brand new data source). I did contact the Tableau support folks and they gave me a number of recommendations as well. I removed a lot of unused calculations, a set, and a bunch of unused action sets, and am down to 20 minutes, which I think is realistic given the strength of our server.
However... every time I change the WHERE statement in my SQL and select OK, the query runs. Then when I refresh the extract, it runs again. Is it possible to put a filter on the extract somewhere other than in my custom SQL?
Again, thanks so much for your help!
Sure. If you're confident users will only consume data via the (already filtered) extract, then your approach makes sense if you don't like your expensive query getting fired multiple times... You'll just need to be aware that if someone downloads your workbook and turns off the extract, they'll be executing a monster (unfiltered) query against SQL.
Grrrr. That won't work at all and I'm sure I'd be run out of town by pitchfork-carrying IT folk. I guess this is as good as it gets until we purchase Server and start managing our data sources and extracts more efficiently. Thanks again for all your help on this.
Buy Server! Daddy needs a new pair of shoes!
I think keeping your WHERE clause in the Custom SQL is probably your best bet for the time being - it'll cost you a bit more time when you create the extract, but it'll be safer in terms of not inadvertently issuing a "killer query".
Now one thing that you COULD do is this:
- Create your filtered extract using the technique you mentioned earlier.
- Save a copy of the workbook - the copy will be the version your users utilize. I'll call this workbook "Users" from now on.
- In a different instance of Tableau, create a new data source using the extract you created in step 1. Make sure you give the data source the exact same name as the data source that exists in the "Users" workbook.
- Open the "Users" workbook then drag-and-drop the data source you created in step 3 into it.
- Tableau will detect that the new data source has the same name as the existing data source and ask if you want to replace it.
- Replace it!
Your "Users" workbook can now ONLY use the extract to do its work. Totally safe and disconnected from your data server. The drawback, of course, is that YOU will need to repeat steps 1-6 every time you want to update data for your users.
Thanks again for your suggestions Russell. The powers that be have said we will be getting Server in the Spring-ish. I sure hope so, because with every new report comes the request for more and more.
I should mention that this Interworks analyzer has proven to be a great help in directing me to the views that have been causing problems with calculated fields/parameters. https://www.interworks.com/services/business-intelligence/tableau-performance-analyzer
Hey Russell! I came across one of your posts today that solved my extract slowness problem! http://tableaulove.tumblr.com/post/18945358848/how-to-publish-an-unpopulated-tableau-extract
I use two parameters - start and end date - and then a calculated field that keeps only data within that range.
Now the extract is filtered based on the calculated field.
My refresh time has reduced considerably! Thank you for all your help.
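For anyone finding this thread later: the calculated field I'm describing boils down to a boolean per-row test between the two date parameters. A rough sketch of that logic (parameter values and field names here are just made-up examples):

```python
from datetime import date

# Stand-ins for the [Start Date] / [End Date] parameters.
start_date = date(2013, 1, 1)
end_date = date(2013, 3, 31)

def in_range(order_date):
    # Tableau-calc equivalent:
    # [Order Date] >= [Start Date] AND [Order Date] <= [End Date]
    return start_date <= order_date <= end_date

# Rows outside the range are dropped before the extract is built.
rows = [date(2012, 12, 15), date(2013, 2, 1), date(2013, 4, 2)]
kept = [d for d in rows if in_range(d)]
print(kept)  # only the 2013-02-01 row survives
```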