Is your goal to point the workbook to the extract published on the server so that the workbook gets refreshed? Did I get that right?
I'm assuming that your local extract and the one on the server contain the same fields?
- I would connect to the source, define any calculations if needed, create an extract, and publish the extract to the server.
- Now create the dashboard with the local copy.
- Connect to the data source that was published on the server.
- Go to Data > Replace Data Source and point to the one on the server.
- Close the local copy from the workbook. Now the workbook is connected to the server datasource.
- Publish the workbook
- Schedule the datasource on the server.
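The last few steps can also be scripted with tabcmd. A minimal sketch, assuming tabcmd is installed and using a hypothetical server URL, project, and content names (the recurring refresh schedule itself is assigned in the Server UI, not through tabcmd):

```shell
# Hypothetical server URL, credentials file, and content names -- adjust to your site.
SERVER="https://tableau.example.com"
PROJECT="Finance"

if command -v tabcmd >/dev/null 2>&1; then
    # Sign in once; subsequent commands reuse the session.
    tabcmd login -s "$SERVER" -u admin --password-file ~/.tabcmd-pass

    # Publish the extract-based data source, then the workbook that points at it.
    tabcmd publish "sales.tdsx" --name "Sales Extract" --project "$PROJECT" --overwrite
    tabcmd publish "sales_dashboard.twb" --name "Sales Dashboard" --project "$PROJECT" --overwrite

    # Trigger a server-side refresh of the published data source.
    tabcmd refreshextracts --datasource "Sales Extract" --project "$PROJECT"
else
    echo "tabcmd not installed; commands shown for illustration only" >&2
fi
```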
Hope this helps.
The extract has to be generated somewhere. There are a couple of different ways I can think to improve things:
1. Generate the extract on the same machine you're running Tableau Desktop on. But rather than using a laptop, run Tableau Desktop from a terminal server running in the datacenter, with a decent amount of horsepower behind it. You can make your changes, let the extract refresh, then publish up to Server from there with no downtime to users who rely on the data, and no dependency on your laptop.
2. Build a new Server instance as a staging environment (this may qualify as non-Production for licensing purposes--but that's no promise!). Use your current method to publish to this with the empty extract. Refresh the extract on the staging instance. Download it again, and republish to the Production Tableau Server instance. No downtime, and extract refreshing is still offloaded. Alternatively, you can write tabcmd scripts that auto-publish the datasource / workbook from Staging to Production, or find an application to do that for you.
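The Staging-to-Production hand-off in option 2 can be sketched with tabcmd. This is a hedged illustration, assuming tabcmd is available and using hypothetical hostnames, credentials, and content names:

```shell
# Hypothetical hostnames and names -- adjust for your environment.
STAGING="https://tableau-staging.example.com"
PROD="https://tableau.example.com"
DS_NAME="Sales Extract"

if command -v tabcmd >/dev/null 2>&1; then
    # Download the refreshed (populated) extract from the Staging instance...
    tabcmd login -s "$STAGING" -u admin --password-file ~/.tabcmd-pass
    tabcmd get "/datasources/Sales_Extract.tdsx" -f "/tmp/sales_extract.tdsx"

    # ...and republish it to Production, overwriting the stale copy.
    tabcmd login -s "$PROD" -u admin --password-file ~/.tabcmd-pass
    tabcmd publish "/tmp/sales_extract.tdsx" --name "$DS_NAME" --overwrite
else
    echo "tabcmd not installed; commands shown for illustration only" >&2
fi
```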
Also, I forgot to mention: if you don't want to create a large extract on your local machine, you can use this nice little trick by Russell Christopher.
Thanks all for your time and your replies.
Vikram: I cannot apply the first step of your solution; I don't want to generate the extract locally and then transfer it.
The second solution you propose is the same one I already implement; the only difference is that my empty extract is generated based on a different boolean flag, not one related to the now() date.
The problem is: the unpopulated extract you suggest creating then has to be populated, and meanwhile users will not see the data.
Matthew: your solution (1) looks interesting; I will have to ask my admin for more permissions (so far I am only a Site admin: I cannot administer the server, and I suspect I cannot run Desktop on the remote machine; I have to check).
This seems a rather complicated approach. Assuming that the Server is already refreshing the Extract correctly -- and assuming the Extract is a Published Data Source from which a Workbook is built -- why not just have the Server refresh the Extract and then after that is done, have the Server refresh the Workbook?
Is your approach based on a desire to fire off the refresh dynamically / on demand? Or are there manual edits that have to take place in the Workbook in order to accommodate updated data?
Sorry for all the questions -- it helps to understand the objective and the challenges, when trying to devise a solution.
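The refresh-the-extract-then-refresh-the-workbook sequence described above can be driven from a script. A minimal sketch, assuming tabcmd with a hypothetical server and content names; `--synchronous` makes each refresh run in the foreground so the second only starts after the first completes:

```shell
# Hypothetical server and content names -- adjust to your site.
SERVER="https://tableau.example.com"

if command -v tabcmd >/dev/null 2>&1; then
    tabcmd login -s "$SERVER" -u admin --password-file ~/.tabcmd-pass

    # Refresh the published data source first, waiting for it to finish...
    tabcmd refreshextracts --datasource "Sales Extract" --synchronous

    # ...then refresh the workbook that is built on it.
    tabcmd refreshextracts --workbook "Sales Dashboard" --synchronous
else
    echo "tabcmd not installed; commands shown for illustration only" >&2
fi
```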
Ken Patton: it all descends from the link Vikram posted. I'm following that approach, publishing an empty extract and then filling it on the server via a scheduled task. The server refreshes the extract correctly every night; the only issue is when I have to change something in the data source (e.g. a calculated field) and then have to republish.
If I want to avoid publishing 1 GB of data, I choose this approach of the unpopulated extract, but then it creates the empty-dashboard issue. It was fine while I was working in the dev environment, but now that I've gone live I can't do this anymore :-(
Maybe I did not explain this clearly, so I'll say it once again: it is a transitory issue; it lasts just the time needed for the server to refresh the extract.
So another question arises: when I ask to refresh an extract that is used by the workbook, how does Tableau Server work internally? Does it create a new temporary extract and, when it's done, replace the old one with the new one? (I would expect that behaviour in this situation...) Or does it immediately delete the old extract before generating the new one?
Yes, the Backgrounder writes out a temporary file, then "cuts over" to the completed version when the Refresh succeeds.
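That write-then-swap pattern is the standard way to avoid downtime. A minimal shell illustration of the same idea (not Tableau's actual code; the file names are made up):

```shell
# Build the new version in a temporary file, then atomically replace the old
# one, so readers always see either the complete old file or the complete new one.
EXTRACT="/tmp/demo_extract.hyper"
echo "old data" > "$EXTRACT"

TMP="$(mktemp "${EXTRACT}.XXXXXX")"
echo "new data" > "$TMP"        # the long-running refresh writes here

mv -f "$TMP" "$EXTRACT"         # atomic cut-over on the same filesystem

cat "$EXTRACT"                  # -> new data
```

Because `mv` within one filesystem is an atomic rename, a reader never observes a half-written extract, which matches the near-zero downtime users see during a Backgrounder refresh.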
The empty-extract hack is extremely clever, as are many of Russell Christopher's ideas. But to some extent, it really is a hack, not necessarily designed to scale up to arbitrarily huge Extracts.
(Speaking of that: my experience is that it can be difficult to produce high-performance workbooks from data sets that large, but that's a whole different discussion.)
On my production environment, we have extracts in excess of 1.5GB, with plenty of workbooks pointing to them, that Refresh at nearly zero "downtime" to the users. But we don't make many changes to the meta-data (the workbook that defines the Published Data Source), and when we do, we suck it up and re-Publish, which is also a near-zero disruption to the users.
Ok Ken Patton, so I should simply stick to the idea of republishing the whole extract when I need to.
Sorry, but I'm still not sure what's best in my situation, given the several combinations of ways to achieve the same result.
For example, I have another set of dashboards, with a smaller extract (20 MB), that I easily publish every time as a whole bundle (.twbx file). To be clear: when I open it in Desktop, I see the two-cylinder symbol, so I'm not connected to a remote server extract but to the bundled extract...and then I refresh the whole workbook+extract with a scheduled task.
I would like to avoid the same approach for my other, bigger workbook.
Sorry if this discussion has become so long.