6 Replies Latest reply on Aug 12, 2019 2:38 PM by Charlaine Blanton

    Best practices for using extracts of published data sources at Project vs Workbook level

    Aaron Cowper

      Hi

       

      I have been trawling through the docs and other forum topics but can't find a definitive set of best practices for using extracts effectively.

       

      Here is my current desired outcome for a particular project:

      - We have several SQL tables hosted in Azure that we save as published data sources for our workbooks

      - Within the project, we have 3 *almost* identical workbooks (same dashboard but different bus unit), for which we use the same data sources but apply a filter at the data source level (i.e Bus Unit = BU A, BU B, BU C)

      - The tables are relatively large (several million records), so we think using extracts is desirable over live connections

      - We are currently in dev/test mode, so are doing a lot of DB updates that require a refresh of dashboards (and hence extracts), both on Desktop (for our analysts who are building) and on Server (for the Users)

       

      I have noticed the following:

      - Once a data source is published, all Calculated Fields seem to be published along with it, so you can only 'Edit a Copy' of a field. Changing the field requires us to create a local copy of the data source, switch to that, edit and republish.

      - Hence I sometimes find working with a 'Live' data source is easier while we are in iteration / test mode. But If I publish a data source as 'Live', then I can never seem to be able to convert it to an 'Extract' at a later point in time. So I need to create a new data source for the same table in Desktop, create an extract, then do a 'Replace Data Source' and republish, which is annoying. Is there a better way?

      - Is it possible to publish a data source as 'Live' at the Project level in Server, but then for each individual workbook to use that same data source but take an extract? If so how is this done?

       

      Another related annoyance is that whenever you replace a data source with another data source with identical fields (eg a local copy of a published datasource), the colours of all attributes change in dashboards legends. Is there a way to stop this happening?

       

      Thanks!

        • 1. Re: Best practices for using extracts of published data sources at Project vs Workbook level
          Chris McClellan

          The short answer is that you're doing everything right and for most of it that's how Tableau works.  If the Azure database is fast enough, use live connections all the time but there's good reasons to use an extract so you're doing that right as well.

           

          To address a few items ....

           

          - changing the calculated fields after it's been published and having to download, change, upload again.  Yes, it feels like a long process but that's the benefit of a published data source.  You won't appreciate it in a dev/test scenario but you should appreciate it more when you productionise.  I've lived through this pain before, I've complained bitterly that it works this way, but not I'm in full support that this is the best way that it functions.

           

          - once a data source is published, you can't swap it (via Server) from live > extract or extract > live.   The only way to change it is to re-publish from Desktop

           

          - having a live data source, but the workbooks use the data as an extract ? I don't quite understand, but no this is not possible unless you have different data sources.

           

          - Replacing data sources and losing settings.  There is an Idea to "get it fixed", but the workaround which works MOST of the time (not guaranteed 100%)   is ....... 

           

          - open the workbook

          - click on every dashboard /sheet / story .  Basically click on every tab along the bottom navigation bar, WAIT for every element of the tab to load before continuing to the next one.  So GO SLOWLY, visually check that everything is loaded on each tab before continuing to the tab.  THEN, replace the data sources and it USUALLY works .... at least it works better than a straight open / replace - you will always have problems with that.

          1 of 1 people found this helpful
          • 2. Re: Best practices for using extracts of published data sources at Project vs Workbook level
            Charlaine Blanton

            Hi Chris -- Thanks for your response to Aaron.  I'm having a similar issue - I want to adjust the filters and fields I used in Desktop to create an extract that I then published to Server.  Problem -- I can't get back to my original data and settings to make edits.  It looks like I may have to create it from scratch each time, or else go through a two-step process of creating the extract with one Tableau Desktop file, then publishing the extract using a different Tableau Desktop file.  You seem to indicate in your response to Aaron that there is a "get it fixed" idea floating around out there.  If so, could you direct me to it?  I'd like to add my vote.  Many thanks.

            • 3. Re: Best practices for using extracts of published data sources at Project vs Workbook level
              Chris McClellan

              i did a search in the Ideas area and found this

               

              FAQ:  Replace Data Source

              1 of 1 people found this helpful
              • 5. Re: Best practices for using extracts of published data sources at Project vs Workbook level
                Michael Gillespie

                I'll add some thoughts to Chris' excellent post.

                 

                1) Separate the management and publishing of data sources from the management and publishing of workbooks.  By that, I mean have a separate workbook that only contains data sources, no vizzes other than those required to do some quick validation of row counts.  NEVER publish that workbook; ONLY publish the data sources that live in that workbook.  More on this later.

                2) Get in the habit of generating a TDS file for every single data source you create, before you publish it.  Create the connection, go to a blank worksheet page, save the workbook.  Then right click on the data source name and select "Add to saved data sources".  That will create a TDS file that contains all your connection information so that you can easily retrieve it later.

                3) When you build workbooks, as much as possible work from PUBLISHED data sources.  If you create a data source independently of your vizzes, and publish that data source before you've created a single viz or calculated field, then all the calculated fields you DO create in your viz workbook will stay in the workbook, and won't be published along with your datasources.  That makes it a lot easier to manage, until you've got to the point where you are happy with the calcs.  THEN you can copy them over to the data source workbook and republish the data source.  The calcs will now be attached to the data source, and when you open your viz workbook, they will replace the ones in that workbook with the published ones.  That keeps it in your control while you're building, but locks it in place for consumption.

                4) All of that works with LIVE or EXTRACT data sources.  I agree with Chris: start with live and see what happens.  But you do have to republish if you change to an extract.  The advantage of the process I've outlined above is that if you DO republish but keep the same name, you will NOT have to do a Replace data source: as far as Desktop is concerned, it's the same data source.

                 

                As for the stuff that breaks, I have found that each release of Tableau does a better job of not breaking things.  In 9/10 versions it was bad; now, it's the exception.

                 

                Charlaine, the TDS tip will help you, I think.  But I'm not completely understanding your issue.  Could you explain it in a little more detail?

                1 of 1 people found this helpful
                • 6. Re: Best practices for using extracts of published data sources at Project vs Workbook level
                  Charlaine Blanton

                  Thanks very much, Michael.  I'm experienced in Tableau Desktop but new to Tableau Server, so that's likely where much of the problem lies.  I'm finding it easier to design worksheets in Tableau Desktop than in Tableau Server.  And, I'm finding performance is better on Tableau Server if I create an extract using Desktop and then publish it (as opposed to creating an extract using Server).  So keeping those two points in mind, I've been creating an extract and worksheets in the same Tableau Desktop workbook, and publishing the extract source and the workbook separately (i.e., not packaged) from that workbook.  Everything works great until I have to tweak the extract.  If I try to do it from my original Desktop workbook, I find I'm connected to the server extract and not my original data source, and I don't see a way to get back to the original so that I may edit my filters and re-create my extract.  Based on your post, it sounds like I'm taking the wrong approach.  Will the TDS file help me get back to my original data source?  Thanks again for your help.