As extracts refresh are unevenely distributed during the day (everybody wants their reports updated just before people arrive at the office in the morning ), there's unacceptably long queue time for refreshes during rush hours. Another problem is that we have users putting their reports on schedule refreshing extracts every 15 minutes (might be justified during launches) but then never remove them from the schedule. We had reports refreshing every 15 minutes that had not been opened in almost a year.
So I started to see what were the tools Tableau provided that could help me analyze and manage refresh schedules utilization.
- It seems impossible to accurately join the tables 'background_jobs', 'tasks' and 'schedules' to build a report that would show the number of tasks/schedule. When I say that I cannot do an 'accurate' join, an example of that would be that I have 1 workbook refreshing on a single schedule, yet in shows under 5 different 'correlation_id' 's in the table 'background_jobs'. Yet, only 1 of these 5 'correlation_id' 's exists in the table 'tasks'.
- I can simply look at the start times of extracts from the table background jobs and see at what time of the day we have too many extracts refresh running. After I can simply go in the browser and see which schedules are generating all these refreshes.
- It couldn't find a way to restrict refresh schedules access (i.e. letting only a group of users being able to see and use the schedule.) It seems that if I make a schedule with high priority for an important project, just anyone can use it.
- Correct me if I'm wrong, but it doesn't seem tabcmd allows me to remove a workbook/datasource from a refresh schedule
- Same for Tableau REST API. It seems to do everything except allowing me to remove a workbook/data source from a refresh schedule and add it to another one
- It seems there's nothing that would allow me to automate the management of refresh schedules and that the only way to manage them is through the browser.
After this assessment, it seems that the only way for me to get anything done is to spend almost a full day every now and then looking to look at background jobs and send emails to users to negociate whether their refreshes can be done daily instead of hourly (based on observe usage of the report) and ask them if their extract can can be refreshes at 9am instead of 6am.
How do you do it? Is there any 'intelligent' and efficient way of managing refresh schedules?
Any help would be appreciated, thanks!