8 Replies Latest reply on Nov 21, 2014 4:11 AM by vincenzo.deconcilio

    Best practice for workbook publication

    vincenzo.deconcilio

      Hi,

      I'm developing a workbook that uses a server-generated extract.

      So far I have followed these steps in order to publish it:

       

      1. Define a Boolean parameter ExtractEnabled, and a CanExtract calculated field based on it

      2. Change the ExtractEnabled parameter to the value "False"

      3. Create the local extract by adding the condition "CanExtract = True"; it will generate an empty set of data

      4. Edit the ExtractEnabled parameter again, this time setting it to True

      5. Publish the local data source to the server

      6. Log on to the server and start a scheduled task to refresh the remote extract

      7. In the workbook, connect to the remote data source

      8. Right-click and replace the data source, so that the workbook uses the remote one instead of the local one

      9. Publish the workbook to the server
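The effect of the flag in steps 1 to 6 can be sketched in a few lines of Python. This is only an illustration, not Tableau code: the rows, `can_extract` and `build_extract` are hypothetical stand-ins, and it assumes the extract filter is evaluated with whatever value the parameter has at the moment the extract is built or refreshed.

```python
# Hypothetical stand-in for the source rows; the real data lives in the database.
rows = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": 75.5},
    {"order_id": 3, "amount": 210.0},
]


def can_extract(extract_enabled: bool) -> bool:
    """Mimics a CanExtract calculated field that simply returns the parameter."""
    return extract_enabled


def build_extract(source_rows, extract_enabled: bool):
    """Keep a row only when the extract filter 'CanExtract = True' holds."""
    return [row for row in source_rows if can_extract(extract_enabled)]


# Steps 2-3: the parameter is False, so the local extract contains no rows.
local_extract = build_extract(rows, extract_enabled=False)

# Steps 4-6: the parameter is True when the server refreshes, so all rows pass.
server_extract = build_extract(rows, extract_enabled=True)

print(len(local_extract), len(server_extract))
```

Between publishing (step 5) and the end of the refresh (step 6), the server only holds the empty result, which is where the blank dashboard comes from.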

       

      My first concern is that this is not a smooth process. I have only 6 months of experience with Tableau, but I think I can deal with a fair number of issues; my worry is what happens if I have to hand this over to a fellow developer on my team: even if I document the process above, it is highly error-prone in my opinion.

       

      But the worst thing is something I discovered just a few weeks ago while going live: after step 5 and until step 6 (extract generation) completes, the remote data source is empty and the dashboard appears blank, so there is downtime for the users!

       

      Is there any way to avoid this downtime by doing things differently?

      Publishing the full packaged source looks infeasible: it is too big and too slow.

      If it were possible to rename the workbooks straight on the server, then I could create a copy (hidden?) then replace it when the remote extract is ready...

      Any alternative?

       

      Thanks

        • 1. Re: Best practice for workbook publication
          vikram bandarupalli

          Hi Vincenzo,

           

          Is your goal to point the workbook to the extract published on the server so that the workbook gets refreshed? Did I get that right?

           

          I'm assuming that your local extract and the one on the server contain the same fields?

           

          - I would connect to the source, define any calculations if needed, create an extract and publish the extract to the server.

          - Now create the dashboard with the local copy.

          - Connect to the data source that was published on the server.

          - Go to Data > Replace Data Source and point to the one on the server.

          - Close the local copy in the workbook. Now the workbook is connected to the server data source.

          - Publish the workbook

          - Schedule the datasource on the server.

           

          Hope this helps.

           

          Thanks,

          Vikram

          • 2. Re: Best practice for workbook publication
            Matt Coles

            The extract has to be generated somewhere. There are a couple of different ways I can think of to improve things:

             

            1. Generate the extract on the same machine you're running Tableau Desktop on. But rather than using a laptop, run Tableau Desktop from a terminal server running in the datacenter, with a decent amount of horsepower behind it. You can make your changes, let the extract refresh, then publish up to Server from there with no downtime to users who rely on the data, and no dependency on your laptop.

             

            2. Build a new Server instance as a staging environment (this may qualify as non-Production for licensing purposes--but that's no promise!). Use your current method to publish to this with the empty extract. Refresh the extract on the staging instance. Download it again, and republish to the Production Tableau Server instance. No downtime, and extract refreshing is still offloaded. Alternatively, you can write tabcmd scripts that auto-publish the datasource / workbook from Staging to Production, or find an application to do that for you.
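The tabcmd route in option 2 could look roughly like the sketch below. It is a minimal sketch under assumptions, not a tested deployment script: the server URLs, credentials, workbook name and project are placeholders, it assumes the workbook is downloaded from Staging as a packaged .twbx, and it only prints the commands unless `dry_run` is switched off. The tabcmd commands used are `login`, `get`, `publish` and `logout`.

```python
import subprocess


def promotion_commands(staging_url, prod_url, user, password, workbook, project):
    """Build the tabcmd calls that copy one workbook from Staging to Production.

    `workbook` is assumed to be the workbook name as it appears in the server URL.
    """
    local_file = f"{workbook}.twbx"
    return [
        ["tabcmd", "login", "--server", staging_url,
         "--username", user, "--password", password],
        # Download the workbook, with its refreshed extract, from Staging.
        ["tabcmd", "get", f"/workbooks/{workbook}.twbx", "--filename", local_file],
        ["tabcmd", "logout"],
        ["tabcmd", "login", "--server", prod_url,
         "--username", user, "--password", password],
        # Overwrite the Production copy with the already populated workbook.
        ["tabcmd", "publish", local_file, "--name", workbook,
         "--project", project, "--overwrite"],
        ["tabcmd", "logout"],
    ]


def printable(cmd):
    """Return the command as text with the password value masked."""
    masked = list(cmd)
    if "--password" in masked:
        masked[masked.index("--password") + 1] = "****"
    return " ".join(masked)


def promote(commands, dry_run=True):
    """Print the commands, or run them and stop at the first failure."""
    for cmd in commands:
        if dry_run:
            print(printable(cmd))
        else:
            subprocess.run(cmd, check=True)


# Placeholder values, for illustration only.
commands = promotion_commands(
    staging_url="https://staging.example.com",
    prod_url="https://tableau.example.com",
    user="publisher",
    password="secret",
    workbook="SalesDashboard",
    project="Default",
)
promote(commands)  # dry run: only prints what would be executed
```

Because the extract is refreshed on Staging before the download, Production only ever receives a populated copy.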

            • 3. Re: Best practice for workbook publication
              vikram bandarupalli

              Also, I forgot to mention: if you don't want to create a large extract on your local machine, you can use this nice little trick by Russell Christopher:

               

              How to publish an unpopulated #Tableau extract. | Tableau Love

               

              Vikram

              • 4. Re: Best practice for workbook publication
                vincenzo.deconcilio

                Thanks all for your time and your replies.

                 

                Vikram: I cannot apply the first step of your solution; I don't want to generate the extract locally and then transfer it.
                The second solution you propose is the same one I already use; the only difference is that my empty extract is generated from a different kind of Boolean flag, not one tied to the now() date.
                The problem is that the unpopulated extract you suggest creating still has to be populated, and in the meantime the users will not see any data.

                 

                Matthew: your solution (1) looks interesting. I will have to ask my admin for more permissions (so far I am only a Site admin; I cannot administer the server, and I guess I cannot run Desktop on the remote machine, but I have to check).

                • 5. Re: Best practice for workbook publication
                  Ken Patton

                  This seems a rather complicated approach. Assuming that the Server is already refreshing the Extract correctly -- and assuming the Extract is a Published Data Source from which a Workbook is built -- why not just have the Server refresh the Extract and then after that is done, have the Server refresh the Workbook?

                   

                  Is your approach based on a desire to fire off the refresh dynamically / on demand?  Or are there manual edits that have to take place in the Workbook in order to accommodate updated data?

                   

                  Sorry for all the questions -- it helps to understand the objective and the challenges, when trying to devise a solution.

                  • 6. Re: Best practice for workbook publication
                    vincenzo.deconcilio

                    Ken Patton, it all stems from the link Vikram posted: I'm following that approach, publishing an empty extract and then filling it on the Server via a scheduled task. The server refreshes the extract correctly every night; the only issue arises when I have to change something in the data source (e.g. a calculated field) and then republish.
                    If I want to avoid publishing 1 GB of data, I use this unpopulated-extract approach, but then it causes the empty dashboard issue. It was fine while I was working in the dev environment, but now that I've gone live I can't do this anymore :-(
                    Maybe I did not explain this clearly, so I will say it once more: it is a transitory issue that lasts only as long as the server needs to refresh the extract.

                     

                    So another question arises: when I ask to refresh an extract that is used by the workbook, how does Tableau Server work internally? Does it create a new temporary extract and, when it is done, replace the old one with the new one? (I would expect this behaviour in such a situation...) Or does it immediately delete the old extract in order to generate the new one?

                     

                    Thanks again

                    • 7. Re: Best practice for workbook publication
                      Ken Patton

                      Yes, the Backgrounder writes out a temporary file, then "cuts over" to the completed version when the Refresh succeeds.
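The "write a temporary file, then cut over" pattern can be illustrated with a small Python sketch. This shows the general technique only and makes no claim about how the Backgrounder is actually implemented; the file name, the row format and the `refresh_extract` helper are hypothetical.

```python
import os
import tempfile


def refresh_extract(path, new_rows):
    """Write the refreshed data to a temp file, then swap it in atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename stays
    # on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            for row in new_rows:
                tmp.write(row + "\n")
        # Readers see either the old file or the new one, never a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        # A failed refresh leaves the previous extract untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_extract(path):
    """Return the rows currently visible to readers."""
    with open(path) as f:
        return f.read().splitlines()


workdir = tempfile.mkdtemp()
extract_path = os.path.join(workdir, "sales.extract")

refresh_extract(extract_path, ["2014-11-19,100"])                    # first load
refresh_extract(extract_path, ["2014-11-19,100", "2014-11-20,140"])  # nightly refresh
current = read_extract(extract_path)
print(current)
```

With this pattern, a refresh of an existing extract never exposes an empty state. The blank dashboard in the original question comes from publishing an empty extract over the populated one, not from the refresh itself.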

                       

                      The empty-extract hack is extremely clever, as are many of Russell Christopher's ideas. But to some extent, it really is a hack, not necessarily designed to scale up to arbitrarily huge Extracts.

                       

                      (Speaking of that: my experience is that it can be difficult to produce high-performance workbooks from data sets that large, but that's a whole different discussion.)

                       

                      In my production environment, we have extracts in excess of 1.5 GB, with plenty of workbooks pointing to them, that Refresh with nearly zero "downtime" for the users. But we don't make many changes to the metadata (the workbook that defines the Published Data Source), and when we do, we suck it up and re-Publish, which also causes near-zero disruption for the users.

                      • 8. Re: Best practice for workbook publication
                        vincenzo.deconcilio

                        OK Ken Patton, so I should simply stick to the idea of republishing the whole extract when I need to.
                        Sorry, but I'm still not sure what is best in my situation, given the several different ways to achieve the same result.
                        For example, I have another set of dashboards, with a smaller extract (20 MB), that I simply publish every time as a whole bundle (.twbx file). To be clear: when I open it in Desktop, I see the two-cylinder symbol, so I'm not connecting to a remote server extract but to the bundled extract... and then I refresh the whole workbook + extract with a scheduled task.

                         

                        I would like to avoid the same approach for my other, bigger workbook.

                         

                        Sorry if this discussion has become so long.