When you have completely finished creating your Tableau workbooks, try creating an extract with the 'Hide Unused Fields' option enabled.
This should improve the performance of your workbooks.
Putting SQL Server on the Tableau Server machine to test performance may not provide the results you are expecting. SQL Server can have a very detrimental effect on the performance of Tableau Server, so you may have solved one problem while simultaneously introducing another. A test that better isolates the issue would be to use Tableau Desktop on the computer running Tableau Server and time the refresh speed there. This removes Tableau Server from the equation and focuses only on the computer refreshing the data. If the problem disappears, the issue is likely within the Tableau Server application (such as a configuration issue). However, if the extract still takes a very long time to refresh in Tableau Desktop, the issue can likely be isolated to the environment where Tableau is running (computer, network, etc.). This is likely one of the first tests Tableau Technical Support will have you perform as well. If the issue is isolated to Tableau, a Performance Recording may help further isolate where the extra time is being spent: Create a Performance Recording. You can open a case with Tableau Technical Support here for further assistance: http://www.tableau.com/support/request
Thanks and hope this helps!
I'm not sure whether it's related, but there is an issue related to slowness listed in the known issues link. Maybe 10.1.2 will be out soon and will alleviate any doubt; just a guess....
A few more pieces of information I would like to add here:
1. Tableau and SQL Server were put on the same machine to test whether the shared memory protocol improves performance, and we saw an improvement.
2. Regarding removing unused columns: the extract keeps only the required columns, and this data source will be used by multiple dashboards.
3. We checked at the infrastructure level and didn't find any such issue; we still tried increasing CPU and replacing the disks with SSDs.
4. The data size will be approximately 20 GB, which we have to put in the extract. From SQL Server we can manage it in under 4 minutes, so the extract shouldn't take more than 8 minutes.
You seem to be under the impression (or hoping) that Tableau should be able to create (process and write) an extract at the same speed one can move data OUT (read) of SQL Server. This is incorrect. There is a big difference between dumping raw data into a temp table and turning it into a compressed, columnar database - much more work is going on in the latter case.
Based on what you said, your extract of 25M rows is taking 1100 seconds, or ~18 minutes. An old (not super-accurate but good enough) rule of thumb is that if the data is fairly close by and you are extracting a reasonable number of columns, you will see something in the realm of 1M rows processed a minute. That's about what you're seeing here (a little bit faster).
So this doesn't seem slow in terms of what I'd expect to see. Why is 18 minutes "too slow" for you? Is there a specific reason you need it to be faster, or do you want it to be faster "just because"?
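As a sanity check, the rule-of-thumb arithmetic above can be worked through in a few lines. The 25M rows and 1100 seconds come from the thread; the 1M rows/minute rate is the rule of thumb mentioned:

```python
# Rough sanity check of the ~1M rows/minute rule of thumb.
rows = 25_000_000    # rows in the extract (figure from the thread)
seconds = 1_100      # observed extract time (figure from the thread)

minutes = seconds / 60               # ~18.3 minutes
rows_per_minute = rows / minutes     # ~1.36M rows/minute, slightly above 1M

print(f"{minutes:.1f} min, {rows_per_minute / 1e6:.2f}M rows/min")
```

This matches the observation that the extract is running a little faster than the 1M rows/minute rule of thumb.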
The reason I need it to be faster is that this is just 1/12th of the total data that will be going into this extract.
The refresh needs to finish within one hour, as we can afford at most a 2-hour refresh delay for our report site, which many clients use.
We are ready to change hardware to achieve this as well, but that doesn't seem to be working.
I'm guessing you are not generating 300M (12 * 25M) new rows every two hours. You should consider doing an incremental refresh which merely grabs "new" rows and adds them to what you already have.
Your hardware is probably good enough (although more RAM might speed up the sorting a bit). As I mentioned previously, 1M rows/minute is a basic rule of thumb in terms of what to expect. You want to cut that by more than half, which is a bit of a stretch.
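To make the incremental-refresh idea concrete, here is a minimal sketch in plain Python. The lists stand in for the real extract and the SQL Server source table, and the column name `id` and the function name are illustrative, not Tableau APIs:

```python
# Sketch of an incremental refresh: only rows whose key is greater than the
# highest key already in the extract are fetched and appended, instead of
# rebuilding the whole extract from scratch.

def incremental_refresh(extract, source, key="id"):
    """Append only rows newer than anything already in the extract."""
    high_water = max((row[key] for row in extract), default=-1)
    new_rows = [row for row in source if row[key] > high_water]
    extract.extend(new_rows)
    return len(new_rows)

extract = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
source = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"},
          {"id": 3, "v": "c"}, {"id": 4, "v": "d"}]

added = incremental_refresh(extract, source)
```

The refresh touches only the two new rows rather than all four, which is why an incremental schedule scales so much better than repeated full extracts.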
Thanks Russell for your time, I really appreciate your willingness to help.
Actually, there is a row-level security implementation, so a filter is applied on the data source based on the logged-in user.
So we need all users in our extract, and incremental extracts work fine for that.
The problem comes when we have to add a new user or update old data; in those cases we don't have any option other than running a full extract.
RAM utilization does not go above 50%.
CPU utilization, however, goes up to 80% during the extract.
Is there an option to change the data architecture to accommodate row level security? We previously ran into a similar challenge and then came up with an alternate solution to avoid having to perform a full refresh of the primary data.
Primary data: Fact data, has region id
Secondary data, blended on the region id: key of region id plus account manager. It also has a calculation named "user filter" that compares the account manager with the logged-in user ID; if they match, 1 is assigned: "IIF(CONTAINS(LOWER([Account_manager]), LOWER(username())),1,0)"
All dashboards that require row-level security apply a low-overhead blend filtered on user filter = 1
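To illustrate the logic of that calculated field outside Tableau, here is a small Python equivalent. The function name and sample data are illustrative; in Tableau the comparison is the `IIF(CONTAINS(...))` formula quoted above:

```python
def user_filter(account_manager: str, username: str) -> int:
    """Mimics IIF(CONTAINS(LOWER([Account_manager]), LOWER(username())), 1, 0):
    returns 1 when the logged-in username appears (case-insensitively) inside
    the account-manager field, else 0."""
    return 1 if username.lower() in account_manager.lower() else 0

# A dashboard filtered on user filter = 1 only shows rows where it returns 1.
rows = [("east", "Alice.Smith"), ("west", "Bob.Jones")]
visible = [region for region, mgr in rows if user_filter(mgr, "alice.smith")]
```

Because the filter lives in the small blended lookup table rather than the fact data, adding a new user only requires refreshing the lookup, not the large extract.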
Hi Jeff, thanks for the suggestion. Yes, I can change the architecture, and I like the idea of using data blending for row-level security; at least with that implementation we can reduce the need for full refreshes.
Apart from row-level security, we may also update older data, which is a real possibility, and we can't predict which fields might change in the future. So we still need the ability to run a full extract in less time.
yeah, full refreshes are still probable on occasion, but at least the need to do it on a daily basis can be eliminated...
Try the Extract SDK (the Extract API to create extracts outside Tableau, plus the Server API to publish the extracts to the server) to speed it up. In our testing the Java version of the Extract API handled parallelism better than the Python version, so the Java Extract API is your better option.
1M rows/min is the benchmark I use as well. My understanding is that Tableau extract processing has some kind of 'package size limit' (not the right term, I know) which constrains its speed. When you use the Extract API, you have better control of the extract process through your own code, and you can run your Extract API code on much more powerful machines that do not need a Tableau Server core license.
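The parallelism benefit can be sketched with the standard library alone. This is not Extract API code: `transform_chunk` is a stand-in for the per-row preparation that would feed the real Extract API insert calls (the tableausdk package is not assumed here), and the same chunk-then-parallelize pattern applies whether the worker code is Python or Java:

```python
# Sketch of the chunked-parallel pattern an external extract builder enables:
# split the source rows into chunks, process them concurrently, then write
# the results out sequentially in order.
from concurrent.futures import ThreadPoolExecutor

def transform_chunk(chunk):
    # Stand-in for per-row preparation (type conversion, encoding, etc.)
    # that would precede the actual extract insert calls.
    return [row * 2 for row in chunk]

def parallel_extract(rows, chunk_size=4, workers=4):
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves chunk order, so the final write stays sequential.
        results = list(pool.map(transform_chunk, chunks))
    out = []
    for processed in results:
        out.extend(processed)  # in real code: one insert call per row here
    return out

processed = parallel_extract(list(range(10)))
```

The single ordered write at the end matters because extract files are written by one writer; the win comes from parallelizing the row preparation, not the file write itself.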
Hi Mark. Are you able to share an example of your Java code? I have some Python code that I tested a while back, but it didn't really speed anything up. I'm willing to do some deeper testing with Java, though.