Hey Mark. In addition to the normal backup / restore, there is also exportsite and importsite (see "Import or Export a Site"), which is fully supported by Tableau, though I'm not sure that is what you're after. There are probably unsupported techniques, such as using robocopy or xcopy to take what you want out of the Tableau install directory, but I'm not sure what you would then do with those files when it comes to restoring. Also, I came across a command in another forum post that leaves the extracts out of the backup. I don't know whether it still works in the most recent releases, but here it is: tabadmin set service.backup.pgonly
The service.backup.pgonly setting is interesting... it sounds like you would need to do a service stop/start every time you wanted to switch from a "full" backup to database-only, though? I'm also curious how the restore would work with this option.
Note that that's only for uninstalls / upgrades, not standard backups. But we've used it at Tableau for some time now--it's been very helpful, since we upgrade Server every few weeks or so. Saves a lot of time! But know that this will be a feature in the 10.0 upgrade process, so there won't be much need for it in the future!
I don't know of any tricks to speed up a backup via hacky means, but Jeff's export site idea is worth considering. There are other approaches, though:
- Decrease the amount of data stored in extracts. You've probably already done this, but every week I find people refreshing extracts hourly when they only view the workbook every day or three, uploading test workbooks with 500 MB extracts, and not hiding unused fields. It's a lot to stay on top of, but it's really helped us scale.
- Establish a cleanup process. Find content that is old and unused, and remove it. It won't stop the continuing increase in size, but it helps slow the rate somewhat.
- Decrease extract refresh frequency. If you are refreshing extracts frequently, especially faster than every hour, it forces Server to wait longer during the backup process, since reaping of old extracts is disabled while backups run.
- I've heard but have not confirmed that running a Filestore process on the Primary host can speed up backup and restore times. I suppose this makes some sense, since the Primary doesn't need to copy as many files over to itself for zipping--it should possess all the necessary files locally (except perhaps the PostgreSQL primary, you'd still need to dump that and copy it over...but that's far less data). Also, if you're running a weak VM primary, using a physical one with better disks can help. Of course, if you're running a "headless" weak VM primary on a core license, you'd start consuming more of your license to move Filestore / DE on there...so that's a pretty big consideration.
- Consolidate workbook extracts into datasources backed by extracts. Promote them. Start pushing people to use them instead of generating their own extracts. Long, messy process, but worth it if you haven't.
- Consider storing your data in an analytics-oriented columnar datastore such as Redshift. This is lowest on the list for a reason, as it's probably the most work of all. But the fact is, extracts simply aren't good solutions for everything. They sure are convenient, and often function as a "go faster" button, but they can be siloed, preventing re-use, and cost a lot of processing power to keep fresh. Too many extracts of the same data may mean you need to invest in a faster data platform.
Edit: added another point or two
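To make the "establish a cleanup process" point a bit more concrete, here is a minimal Python sketch of the kind of check such a process might run. It is only an illustration: the record fields (`name`, `created`, `last_accessed`), the sample data, and the 90-day threshold are all assumptions, and it does not read anything from Tableau Server itself. You would feed it whatever content inventory you already have.

```python
from datetime import date, timedelta


def find_stale_content(items, today, max_idle_days=90):
    """Return the names of items that look old and unused.

    Each item is a dict with hypothetical keys:
      name          - display name of the workbook or datasource
      created       - date the item was published
      last_accessed - date of the most recent view, or None if never viewed
    """
    cutoff = today - timedelta(days=max_idle_days)
    stale = []
    for item in items:
        # Skip anything published after the cutoff: it has not had time
        # to be adopted yet, so it should not count as unused.
        if item["created"] > cutoff:
            continue
        last = item["last_accessed"]
        if last is None or last < cutoff:
            stale.append(item["name"])
    return stale


# Made-up sample inventory, for illustration only.
sample_items = [
    {"name": "Old test workbook", "created": date(2015, 1, 10), "last_accessed": date(2015, 2, 1)},
    {"name": "Daily sales", "created": date(2015, 3, 1), "last_accessed": date(2016, 7, 30)},
    {"name": "Never opened", "created": date(2015, 6, 1), "last_accessed": None},
    {"name": "Brand new", "created": date(2016, 7, 25), "last_accessed": None},
]

if __name__ == "__main__":
    for name in find_stale_content(sample_items, today=date(2016, 8, 1)):
        print(name)
```

The list it produces is a set of candidates for review, not something to delete automatically.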
Yep, we've used the pgonly setting for uninstalls / upgrades as well. I thought I'd throw it out there as a potential way of doing a backup that leaves out the data extracts, perhaps with a separate script that does a non-Tableau zip of the dataengine folder. This is all theory and I haven't actually done it myself, but it sounds doable.
I do like your suggestions, which essentially amount to rationalizing all the "clutter" out there.
A backup isn't much of a backup without the extracts, in my opinion. The pgonly option during uninstall was never so much about backing things up faster as it was about skipping the auto-backup during uninstall, because we always took one prior to the outage anyway. It's just a technique to reduce downtime. I would never want to try to take a PostgreSQL dump file and restore it surgically into a Tableau Server instance... I am sure it's possible, but just thinking about it in my admin-ey brain, it feels like one of those nightmares where you suddenly fall off a cliff.
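For what it's worth, the "separate script that does a non-Tableau zip of the dataengine folder" idea could be sketched roughly like this in Python. This is untested against a real Tableau Server and unsupported, just as the post says. The function name and any paths you pass in are hypothetical, and nothing here deals with files changing while Server is running or with how you would restore them.

```python
import shutil
from pathlib import Path


def zip_folder(source_dir, archive_base):
    """Zip the contents of source_dir into '<archive_base>.zip'.

    Returns the path of the archive that was written. Keep archive_base
    outside source_dir so the archive does not end up inside the folder
    it is archiving.
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"{source} is not a directory")
    # shutil.make_archive appends the '.zip' extension itself.
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(source))
```

You would call it with the location of your own dataengine folder and a destination on another drive, for example `zip_folder(dataengine_path, backup_path)`, where both paths depend on your install and are not shown here.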
Matt Coles wrote:
I don't know of any tricks to speed up a backup via hacky means, but Jeff's export site idea is worth considering. There are other approaches, though...
Uhm...freakin' wow (but in a positive way)! So I can see a document dedicated to exactly how you do that especially on the first two points.
If you are using VMs in a distributed environment WITH a primary machine only running the Gateway and Search & Browse processes, I have heard an unconfirmed rumor that you can use vMotion or VM snapshots on the workers only. Using vMotion or snapshots on the primary will break licensing. This is also officially unsupported by Tableau.
It would be nice to have someone test it, as we have Tableau installed on bare metal boxes.
Hey Toby. On extract cleanup / resource reduction:
1. I don't have a published version of the workbook I use for identifying "wasted effort" extracts, not yet at least--it relies on a data blend between the TS Background Tasks that you recently downloaded, and one called TS Content that lists all Server content with aggregated access metrics--which I haven't published publicly yet. Basically, the workbook I use allows you to set an "efficiency level" as a parameter value, which is then compared to (number of accesses over the last 10 days) / (number of minutes spent refreshing the extract over the last 7 days). It filters out brand-new content, as it's likely it wouldn't yet have had time for users to adopt it. My parameter is set to .01, and I routinely find stuff that takes up 1,500 minutes of backgrounder time per week, but only one or two people are actually using. And yes, Dev is hearing about the need for a way to do this in the product!
2. See "A workbook for identifying and removing unused workbooks and datasources". We do this quarterly. It generally removes 10% to 20% of the items on Server (not from a space measurement, just overall item count).
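The "efficiency level" comparison in point 1 can be sketched in a few lines of Python. This is not the workbook itself, just an illustration of the ratio as described: accesses divided by minutes spent refreshing, compared against a parameter (.01 in the post). The field names, the sample records, and the `min_age_days` cutoff used to skip brand-new content are all assumptions.

```python
def extract_efficiency(accesses, refresh_minutes):
    """Accesses per minute of backgrounder time; None if nothing was refreshed."""
    if refresh_minutes <= 0:
        return None
    return accesses / refresh_minutes


def wasted_effort(extracts, threshold=0.01, min_age_days=14):
    """Return (name, ratio) pairs for extracts whose efficiency is below threshold.

    Each extract is a dict with hypothetical keys:
      name            - content name
      age_days        - days since the content was published
      accesses        - access count over the access window
      refresh_minutes - minutes spent refreshing over the refresh window
    """
    flagged = []
    for extract in extracts:
        # Brand-new content is skipped, since users may not have adopted it yet.
        if extract["age_days"] < min_age_days:
            continue
        ratio = extract_efficiency(extract["accesses"], extract["refresh_minutes"])
        if ratio is not None and ratio < threshold:
            flagged.append((extract["name"], ratio))
    return flagged


# Made-up sample records, for illustration only.
sample_extracts = [
    {"name": "Big hourly extract", "age_days": 200, "accesses": 2, "refresh_minutes": 1500},
    {"name": "Popular dashboard", "age_days": 120, "accesses": 400, "refresh_minutes": 300},
    {"name": "Just published", "age_days": 3, "accesses": 0, "refresh_minutes": 900},
]

if __name__ == "__main__":
    for name, ratio in wasted_effort(sample_extracts):
        print(f"{name}: {ratio:.4f}")
```

With these sample numbers, the extract that uses 1,500 minutes of backgrounder time for two accesses falls well below .01 and is the only one flagged.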
1. I like the efficiency level concept and that you're promoting it to Dev. If you have it in the form of an Idea I'll vote it up. This would be a good complement to "Server Disk Space" (I LOVE THIS VIZ, BTW!) in the Server >> Status report. Anyway, I should be doing something like this as well because I have a good number of authors who need guidance.
2. What you've done is basically what I've done using a VizAlert in my Workbook Aging. Same idea, different approaches.
The company I am part of just released a new version of Akiri Anytime which includes a very fast Tableau backup (called “TurboSync”). This is an alternative, though it is a commercial add-on to Tableau.
It is lightweight with very little impact on your Tableau Server, so continuous backups can be run all day, if needed.
In one typical example “tabadmin backup” took 1.5 hours, while TurboSync backup took 7 minutes.
Have you restored a Tableau backup made by TurboSync? Does it work?
Yes, one of the capabilities of Akiri Anytime is to automatically restore each Tableau backup made by TurboSync to an additional disaster recovery environment. This ensures the backup is always valid.
On Wed, Aug 3, 2016 at 9:39 AM, John Kuo <firstname.lastname@example.org>
Tim, does this create an actual Tableau backup (.tsbak), or does it use a completely different methodology?