Well, I'll tell you what we did.
- Tableau Server (duh)
- A backup Tableau Server instance
- Shareable Data Sources for Tableau Server (plus some extra PostgreSQL data from the Repository database)
- Interworks' Enterprise Deployment Tool
- Tableau Server Client (Python)
I was sipping my morning coffee on the last day of the quarter. It was about 8am, and the day was starting out fairly nicely (we have good coffee here at Tableau). All hands were on deck just in case any issues should spring up. But I didn't expect any to. Boring is good, sometimes.
Suddenly, as I was walking down the hallway after getting cup #2 of delicious, wonderful coffee, a VIP several layers above me caught me in the hallway in a panic: "So I think I just accidentally deleted a project from Tableau Server". So much for boring! We walked back to my office and watched the count of workbooks and data sources tick down from several hundred, with tens disappearing every second.
"Is there any way to stop it?", the VIP asked.
I replied, "Nope. And if we shut down Server right now or try to kill processes, the best we'll do is probably cause some database corruption". Clearly, we weren't going to fix anything by watching content disappear, so we set about making contingency plans until we figured out the best way to restore the content.
We quickly decided that we couldn't take our production server down for multiple hours to restore the backup from the night before: it was end of quarter, and doing so would heavily impact our sales and operations teams. Fortunately, at Tableau, we run a backup server, and we restore each nightly production backup to it as soon as the backup finishes. This server is kept on all the time, with the extracts refreshing just as often as the Production server's do. We maintain this server primarily because we upgrade our production server so frequently, and being a 24/7 business, we don't want to interrupt anyone's work if we can help it. For this issue, this backup server was very useful to us in three ways.
The first is that it gave us a place to *immediately* point users who normally access content that lived in the deleted project. Because the backup server is built from a restored backup, the URLs for all their workbooks and data sources are exactly the same apart from the server name. So instead of https://server/#/views/someworkbook/someview, the user simply needed to substitute the backup server name: https://server-b/#/views/someworkbook/someview. Pretty painless. So we used TS Events, a data source we built from the Tableau Server repository database, to identify anyone who'd accessed the now-missing content within the last 30 days, and emailed those users to let them know how to use that workaround to get to it.
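If you wanted to put ready-made links in that email rather than explain the swap, the rewrite is a one-liner per URL. This is only a sketch: the function name `to_backup_url` is made up for illustration, and `server` / `server-b` are the placeholder host names from the example above, not real servers.

```python
from urllib.parse import urlsplit

def to_backup_url(url, backup_host):
    """Return the same Tableau Server URL, pointed at the backup host.

    Only the host changes; the path and the '#/views/...' fragment
    are carried over untouched.
    """
    return urlsplit(url)._replace(netloc=backup_host).geturl()

# Placeholder host names taken from the example URLs in the post.
print(to_backup_url("https://server/#/views/someworkbook/someview", "server-b"))
```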
The second way that the backup server came in handy was that, since the content was still on it, we could quickly download and re-publish that content to the production server without having to wait to restore a backup to some other server we spun up. But with hundreds of workbooks, and a good many published data sources, it would take forever to do manually. Sure, tabcmd could do the job of automatically moving all that stuff over, but it would take manual scripting, we'd lose permissions, and we'd end up publishing *all* views in a workbook to the production server, where not all of them should be published. Enter the Interworks Enterprise Deployment Tool. This tool made mincemeat of the job: it first transferred all our published data sources from our backup server back to production, then our workbooks, and it brought over permissions perfectly without a bit of fuss.
The final way the backup server came in handy is that it retained all the metadata lost in the deletion of the production server content. The Interworks EDT tool brought the content over with permissions, but was not able to bring over subscriptions, extract refresh schedules, or customized views. What's more, all the content migrated back to production did not retain its original owner: the service account we used to do the transfer was now the owner. All that metadata, however, was still present on our backup server. We used another data source constructed from the Tableau Server repository database, called TS Content, to manually re-schedule extracts and to work out who the correct owners of the items should be.
For the ownership changes, we were able to use a script using the Tableau Server Client (a Python-based wrapper for the Server REST API) to make the corrections automatically. Took a little figuring out, and my Python is pretty hacky in this script (attached), but it worked. We could not restore Subscriptions automatically, but we were able to pull information on who had subscribed to what view / workbook on what schedule from the Repository, and at least inform users of what they needed to reconstruct (we don't have a pre-built data source for this, sorry). We did not restore customized views or report on them, as that wasn't deemed as critical and would have been more effort than we wanted to invest in at that point.
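For anyone who wants the general shape of the ownership fix without opening the zip, here is a minimal sketch. To be clear, this is not the attached script. It assumes that workbooks can be matched between the two servers by project name plus workbook name, and that user IDs are identical on both servers because the backup server is a restore of production. The names `plan_owner_fixes` and `restore_workbook_owners` are made up for illustration.

```python
from collections import namedtuple

# A production item paired with the owner ID it should have.
OwnerFix = namedtuple("OwnerFix", ["item", "owner_id"])

def plan_owner_fixes(backup_items, prod_items):
    """List production items whose owner differs from the backup server's.

    Items are matched on (project_name, name). That is an assumption:
    nested projects that share a leaf name would need a stricter key.
    """
    backup_owner = {(i.project_name, i.name): i.owner_id for i in backup_items}
    fixes = []
    for item in prod_items:
        wanted = backup_owner.get((item.project_name, item.name))
        if wanted is not None and wanted != item.owner_id:
            fixes.append(OwnerFix(item, wanted))
    return fixes

def restore_workbook_owners(backup, prod, dry_run=True):
    """Apply the planned fixes; backup and prod are signed-in TSC.Server objects."""
    import tableauserverclient as TSC  # imported here so the planning step has no dependency

    fixes = plan_owner_fixes(list(TSC.Pager(backup.workbooks)),
                             list(TSC.Pager(prod.workbooks)))
    for fix in fixes:
        print(f"{fix.item.project_name}/{fix.item.name}: "
              f"{fix.item.owner_id} -> {fix.owner_id}")
        if not dry_run:
            fix.item.owner_id = fix.owner_id
            prod.workbooks.update(fix.item)  # pushes the new owner to the server
    return fixes
```

Running with `dry_run=True` first prints the planned changes without touching production, which seems prudent on a day that has already had one accidental deletion. Published data sources could presumably be handled the same way with `server.datasources` in place of `server.workbooks`.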
As for our VIP, who felt really bad about the whole thing and shall remain anonymous, I revoked their admin privileges (at their hearty request, actually), but couldn't resist future-proofing their workstation for them.
PS: Actually, no. Not quite the end. The transfer of content with the EDT tool and the ownership changes did require some finagling in some cases where credentials were embedded in connections. In those cases, users had to update them by editing the connections, and in a few rare cases, had to re-publish their content.
PPS: If you're wondering why this person was an admin to begin with--they actually weren't supposed to be. We found a bug in a script we use to revoke admin rights based on Active Directory groups that left their rights intact.
Attachment: restore_ownership.py.zip (1.1 KB)