8 Replies Latest reply on Nov 17, 2015 10:39 PM by Alex Katona

    Best Practices for Data Usage (published data sources, etc)?

    Kyle B

      I have been using Tableau for several years now, but sometimes I get the feeling I'm doing something "wrong."

       

      I'm interested in whether there are any defined best practices, or workflows you guys have come up with, for Tableau and Tableau Server data sources. Let me describe a scenario to better illustrate my question.

       

      My typical workflow:

      I create a workbook

      I connect to a data source

      I set up an extract

      I create the views

      I publish the workbook

       

      At some point I get a request to change some minor thing, or I want to exclude a couple of outliers that are skewing the view (for everyone, not just myself). I open the workbook back up on my desktop, but the data is from when I last published it. So I refresh the extracts (could take some time, depending on size). I then re-publish, overwriting the previous workbook.

       

      Do most people operate the same way? Or once you determine you are going to publish a workbook, do you publish the data source, then connect to the published data source?

        • 1. Re: Best Practices for Data Usage (published data sources, etc)?
          Kyle B

          Wow, really? No one has ANY input?

          • 2. Re: Best Practices for Data Usage (published data sources, etc)?
            Allan Walker

            Hi Kyle:

             

            So here's my tuppence worth:

             

            1.  Elicit the user requirements: scope, UI/UX (tables, charts, hybrid, dashboard)

            2.  Derive the system requirements: color, look for best practice (usually Tufte), refine

            3.  Build up a list of workbooks per scope.  Attempt to bring all of these into one looking for a common set of parameters

            4.  Build up the most efficient custom query, usually custom SQL connection, looking for a common unit of analysis (for joining etc)

            5.  Connect and begin to design.  I work off the live connection, leaving pulling the extract until last.

            6.  Work out parameters, filters, calculated fields (CFs), etc.

            7.  Dashboard design.

            8.  Publish, get feedback, revisit 1.  Iteration.

            • 3. Re: Best Practices for Data Usage (published data sources, etc)?
              kettan

              Thank you for creating and pinging this question! I had thought of writing this question myself and am happy that you already did. Unfortunately, I have no satisfying answer.

               

              Difficulties

              I do approximately the same as you describe and therefore probably run into the same time-wasting issues:

              1. It is difficult to maintain data sources
              2. It is difficult to maintain an overview of the "content" of data sources, such as SQL, columns, and tables
              3. It is difficult to maintain an overview of which workbook.worksheet uses which data extract and which columns
              4. It is difficult to maintain 1-2-3 and therefore difficult to execute new requests
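
              One way the overview in point 3 might be partly automated: a .twb workbook file is XML, so a script can list which worksheet depends on which data source and columns. This is only a minimal sketch. The element and attribute names (datasource, worksheet, datasource-dependencies, column) are an assumption about the workbook layout and may differ between Tableau versions, and the sample XML is a hypothetical, trimmed stand-in for a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a .twb file. The element and
# attribute names are an assumption about the workbook XML layout.
SAMPLE_TWB = """
<workbook>
  <datasources>
    <datasource name='sqlproxy.1' caption='Sales (published)' />
  </datasources>
  <worksheets>
    <worksheet name='Revenue by Region'>
      <table>
        <view>
          <datasource-dependencies datasource='sqlproxy.1'>
            <column name='[Region]' datatype='string' />
            <column name='[Revenue]' datatype='real' />
          </datasource-dependencies>
        </view>
      </table>
    </worksheet>
  </worksheets>
</workbook>
"""


def worksheet_usage(twb_xml):
    """Map worksheet name -> {data source label: sorted column names}."""
    root = ET.fromstring(twb_xml)
    # Prefer the human-readable caption; fall back to the internal name.
    captions = {
        ds.get("name"): ds.get("caption") or ds.get("name")
        for ds in root.findall("./datasources/datasource")
    }
    usage = {}
    for sheet in root.findall("./worksheets/worksheet"):
        per_source = {}
        for dep in sheet.iter("datasource-dependencies"):
            ds_name = dep.get("datasource")
            label = captions.get(ds_name, ds_name)
            per_source[label] = sorted(
                col.get("name") for col in dep.findall("column")
            )
        usage[sheet.get("name")] = per_source
    return usage


usage = worksheet_usage(SAMPLE_TWB)
for sheet, sources in usage.items():
    for source, columns in sources.items():
        print(f"{sheet}: {source} -> {', '.join(columns)}")
```

              For a real workbook, the file contents would be read from disk and passed to worksheet_usage in place of SAMPLE_TWB.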


              Situation

              Luckily I am in an early and experimental period with Tableau with few requests implemented.

               

              Next step

              I will take a close look at the new data extract API. My hope is:

                It will be easier to maintain and document the data part, that is #1 and #2 in the list above

                Issue #3 I have to document manually

               

              More inputs, please

              I hope more will share their input. This is definitely something I need to know more about, and I am quite sure it is true for others too.

              • 4. Re: Best Practices for Data Usage (published data sources, etc)?
                Sean Mullane

                I'll add my input as well, for what it's worth. My workflow tends to be the same as Kyle's, and my difficulties the same as Johan's. Right now most of my workbooks use data extracted from Salesforce via Excel. This makes even refreshing the data in workbooks a fairly manual process. This has allowed me to keep track somewhat of the issues brought up here at the expense of my time.

                 

                Since Tableau 8 is out I'm going to move these workbooks over to the salesforce.com connector and try to automate the refreshes. No idea yet what the best practice will be for that. My first thought is to define broadly applicable data sources and publish them to the server, then build individual workbooks off of these connections to minimize the API call usage. I'm a little worried about workbooks breaking if changes are made to our Salesforce environment or fields without my knowing about them. Aside from checking the workbooks myself periodically or having end users point out problems (not good), I don't see a way to keep up with these besides making sure I stay in the loop with our admins.

                 

                It would be great to have a way to tell the server to tell me when a new field is added/removed/changed, or if new data items start coming in to a field that previously had never held those particular data items.
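
                Until the server offers something like this, the check could be approximated outside Tableau by comparing snapshots of a data source's fields between refreshes. The sketch below assumes the snapshots are already available as plain dictionaries of field name to data type; how they are obtained (for example, from a Salesforce describe call or an extract) is left open, and the field names and values are hypothetical.

```python
def diff_schema(previous, current):
    """Compare two {field_name: data_type} snapshots of a data source."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    # A field counts as changed when it exists in both snapshots
    # but its data type differs.
    changed = sorted(
        name
        for name in set(previous) & set(current)
        if previous[name] != current[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


def new_values(seen, incoming):
    """Return values in `incoming` that never appeared in `seen`."""
    return sorted(set(incoming) - set(seen))


# Hypothetical snapshots taken before and after a refresh.
yesterday = {"Account Name": "string", "Amount": "real", "Stage": "string"}
today = {"Account Name": "string", "Amount": "integer", "Region": "string"}

report = diff_schema(yesterday, today)
print(report)

# Hypothetical check for data items a field has never held before.
unseen = new_values({"Open", "Closed Won"}, ["Open", "On Hold"])
print(unseen)
```

                A scheduled job could run a comparison like this after each refresh and send an alert whenever any of the lists is non-empty.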

                • 5. Re: Best Practices for Data Usage (published data sources, etc)?
                  Allan Walker

                  Sean,

                   

                  You bring up a great point - while we (all) share a process, there is a step that I think Tableau could help us all with - version control.  I don't use Server (yet, moving over soon), I use Public.  So configuration management for me is basically a temp table/Excel spreadsheet where I track the status of all my workbooks with extract dates and times etc.

                   

                  I try to have a consistent naming convention - Title, Tableau version, draft/reviewed/accepted/released etc.  I try to keep this in line with ISO 9001 best practice.
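
                  A naming convention like this is easier to keep consistent when names are built and checked by a small helper. The sketch below is one possible encoding, not the exact scheme described above: the "Title_vVERSION_status" layout, the separator, and the function names are all hypothetical.

```python
import re

STATUSES = ("draft", "reviewed", "accepted", "released")

# Hypothetical layout: "<Title>_v<Tableau version>_<status>".
NAME_PATTERN = re.compile(
    r"^(?P<title>.+)_v(?P<version>\d+(?:\.\d+)*)_(?P<status>[a-z]+)$"
)


def build_name(title, tableau_version, status):
    """Build a workbook name, rejecting statuses outside the agreed list."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return f"{title}_v{tableau_version}_{status}"


def parse_name(name):
    """Split a workbook name back into title, version, and status."""
    match = NAME_PATTERN.match(name)
    if match is None or match.group("status") not in STATUSES:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return match.groupdict()


name = build_name("Sales Overview", "8.0", "draft")
print(name)
print(parse_name(name))
```

                  Parsing the names back also makes it possible to fill in the status spreadsheet from a folder listing rather than by hand.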

                   

                  Best Regards,

                   

                  Allan

                  • 6. Re: Best Practices for Data Usage (published data sources, etc)?
                    Sean Mullane

                    Yes, exactly. Native version control would be excellent. Right now we have few workbooks and few users, so I can maintain the workbooks' integrity by being a little iron-fisted and working with them frequently. I'd love to know how some larger installations are being run with regard to version control/data integrity.

                    • 7. Re: Best Practices for Data Usage (published data sources, etc)?
                      William Peterson

                      Here is my basic workflow, since it is so different from the OP's (ignoring most of the complexity of dealing with Dev/Qual/Test/Live databases at different versions and similar workbooks on different servers and sites):

                       

                      1. Develop the datasource in an empty workbook, using a small extract, periodically publishing to the dev server/site.

                      2. In parallel, develop/update a dashboard workbook connecting to the published Tableau Server data source.

                      3. Repeat until happy.   Increase extract size & test incremental refresh.

                      4. Pull datasource back down (make local copy) into a master datasource list workbook that contains no worksheets or dashboards

                      5. Edit datasource to point at the correct server.  Create final extract & incremental extract rules.

                      6. Rename it, Log Out/Log In, Publish to production server.

                      7. In the workbook, make a local copy of the datasource and switch to it, log off, log into the production server, add the production datasource, switch to it, close the local copy, and test.

                      8. Publish Workbook.

                       

                      This whole process should be much, much easier.  In particular, the fact that you can only be logged into one Tableau Server/site at a time is a nightmare.  The server/site scope should be the datasource, not the whole workbook.

                       

                      The key thing here is that you keep your datasource development separate from workbook development, and keep all your datasources in one place.
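
                      The "point at the correct server" part of step 5 could in principle be scripted, since a saved .tds data source file is XML. This is a rough sketch only: it assumes the database host is stored in a server attribute on connection elements, which may not hold for every connector or Tableau version, and the host names and sample XML are hypothetical. Edits like this are best tried on a copy of the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for a .tds file. The `connection` element
# and its `server` attribute are an assumption about the file layout.
SAMPLE_TDS = """
<datasource formatted-name='Sales'>
  <connection class='postgres' server='dev-db.example.com' dbname='sales' />
</datasource>
"""


def repoint_connections(tds_xml, old_server, new_server):
    """Return (updated XML, number of connections changed)."""
    root = ET.fromstring(tds_xml)
    changed = 0
    # iter() also finds connection elements nested deeper in the tree.
    for conn in root.iter("connection"):
        if conn.get("server") == old_server:
            conn.set("server", new_server)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


updated, count = repoint_connections(
    SAMPLE_TDS, "dev-db.example.com", "prod-db.example.com"
)
print(count)
print(updated)
```

                      Returning the number of connections changed gives a quick sanity check: a count of zero means the file did not reference the old server as expected.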

                      • 8. Re: Best Practices for Data Usage (published data sources, etc)?
                        Alex Katona

                        Hi everyone,

                         

                        I developed a method using embedded extracts to more easily edit published data sources that doesn’t require downloading the data source to your local machine from Tableau Server. Here is the link to the tutorial on my blog:

                         

                        http://alexkatona.blogspot.com/2015/11/using-embedded-extracts-to-edit.html

                         

                        Alex