Alternatively, ask those few extract owners to review their extracts and make them more efficient. A few areas to look at:
1. Switch to a combination of incremental and full extracts. For example, if an extract runs as a full refresh daily today, change it to a daily incremental refresh plus a weekly full refresh. If the incremental refresh pulls 'updated' records in as duplicates, use a calculation to remove the duplicates. The weekly full refresh will clear out all unnecessary records. Set a higher priority (like 30) for all incremental extracts to give owners an incentive to use incremental refreshes.
2. Move extracts outside of Tableau by using the SDK, which has an Extract API to build extracts from data sources and a Server API to send extracts to Tableau Server. Either Python or Java works, but the Java SDK is more effective at processing 20M+ rows of data thanks to its better parallel processing. This is based on my test on 9.2; I am not sure whether v10 changed anything.
3. Hide unused columns: If this option is selected, Tableau will determine which columns are used in your workbook at extract-publication time and hide all the others.
4. Aggregate to the highest visible level of your views: If your sales chart shows monthly sales by region, then select this option to sum up sales activity by month and region.
5. Filter your data: When you’re doing a year over year comparison, for example, filter out any historical data that won’t be used in the view.
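The duplicate removal mentioned in item 1 can be sketched outside Tableau to show the logic. This is only an illustration in Python, not Tableau's calculation syntax; the field names `id` and `updated_at` and the helper `latest_per_key` are hypothetical. It keeps the most recently updated copy of each record.

```python
def latest_per_key(rows, key="id", version="updated_at"):
    """Keep only the newest copy of each record (field names are assumptions)."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # Replace the stored row when this one is at least as recent.
        if current is None or row[version] >= current[version]:
            latest[row[key]] = row
    return list(latest.values())


# An incremental pull that re-delivers record 1 after it was updated.
rows = [
    {"id": 1, "updated_at": "2016-01-01", "amount": 10},
    {"id": 2, "updated_at": "2016-01-01", "amount": 20},
    {"id": 1, "updated_at": "2016-01-02", "amount": 15},
]
deduped = latest_per_key(rows)
print(deduped)
```

The ISO date strings compare correctly as plain text, which is why no date parsing is needed in this sketch.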
Here's an idea to optimize extract refreshes by manipulating background task priorities:
Extract refreshes and subscription schedules are executed in the following order:
- All tasks currently in process will complete first.
- Tasks with the highest priority (lowest number) will be taken next, regardless of how long they have been waiting. For example, a task with a priority of 49 will be executed before a task with a priority of 50, even if the task with a priority of 50 has been waiting longer.
- If all tasks have the same priority, tasks will be executed in the order they were queued; the task scheduled with the earliest time stamp will be executed first.
- When multiple tasks with the same priority are scheduled to run at the same time, they will be executed in the following order:
  - All extract refreshes in the order that they were created or enabled.
  - All email subscriptions in the order they were created or enabled.
- Tableau Server can only run as many tasks concurrently as there are backgrounder processes configured in that Tableau Server environment.
- Separate extract refreshes for the same data cannot run simultaneously.
Note: This list only covers extract refreshes and subscription schedules, and does not consider other tasks, such as reap extracts.
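The ordering rules above can be approximated with a small sort. This is a simplified sketch, not how Tableau Server is implemented: the `Task` class, its fields, and `run_order` are hypothetical names, and it ignores in-process tasks, backgrounder counts, and the same-data restriction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    priority: int           # lower number = higher priority
    queued_at: int          # hypothetical: minutes since midnight when queued
    kind: str = "extract"   # "extract" or "subscription"


def run_order(tasks):
    """Order tasks by priority, then queue time, then extracts before subscriptions."""
    kind_rank = {"extract": 0, "subscription": 1}
    # sorted() is stable, so full ties keep their list order,
    # which stands in for "order created or enabled".
    return sorted(tasks, key=lambda t: (t.priority, t.queued_at, kind_rank[t.kind]))


queue = [
    Task("daily_sales", 50, 0),
    Task("weekly_email", 50, 0, "subscription"),
    Task("hourly_orders", 49, 30),
]
order = [t.name for t in run_order(queue)]
print(order)
```

Here `hourly_orders` runs first despite being queued last, because priority 49 beats priority 50.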
Imagine every refresh has the default priority of 50, so the third rule above applies and tasks run in the order they were queued. Here’s what the queue may look like:
Hourly refresh job1 priority 50
Hourly refresh job2 priority 50
Daily refresh job3 priority 50
Daily refresh job4 priority 50
Weekly refresh job5 priority 50
Daily refresh job6 priority 50
Daily refresh job7 priority 50
Monthly refresh job8 priority 50
Daily refresh job9 priority 50
Daily refresh job10 priority 50
Hourly refresh job11 priority 50
So it may be more than an hour before the backgrounder gets from hourly refresh job2 to hourly refresh job11.
Generally it's better to assign the highest priority to the most frequently refreshed extracts, e.g. 15-minute refresh = priority 1, hourly = priority 5, daily = priority 10, etc.
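To see the effect, here is a minimal sketch that applies frequency-based priorities to a queue modeled on the example above. The 15-minute, hourly, and daily values come from the suggestion above; the weekly and monthly values, the job names, and the `reprioritize` helper are assumptions for illustration only.

```python
# Lower number = higher priority. Weekly and monthly values are assumed.
PRIORITY_BY_FREQUENCY = {"15min": 1, "hourly": 5, "daily": 10, "weekly": 20, "monthly": 30}

# (job name, refresh frequency) in the order the jobs were queued.
queue = [
    ("job1", "hourly"), ("job2", "hourly"), ("job3", "daily"),
    ("job4", "daily"), ("job5", "weekly"), ("job6", "daily"),
    ("job7", "daily"), ("job8", "monthly"), ("job9", "daily"),
    ("job10", "daily"), ("job11", "hourly"),
]


def reprioritize(jobs):
    """Sort by frequency-based priority; stable sort keeps queue order within a priority."""
    return sorted(jobs, key=lambda job: PRIORITY_BY_FREQUENCY[job[1]])


new_order = [name for name, _ in reprioritize(queue)]
print(new_order)
```

With these priorities the three hourly jobs move to the front of the queue, and the weekly and monthly jobs fall to the end.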