9 Replies Latest reply on Apr 29, 2019 7:54 AM by Jeff Strauss

    Metadata Automation

    Jeff Strauss











      If you found this post, then wonderful!  I hope it helps on your journey to a fully discoverable Tableau Server.  If you know somebody who hasn't found it yet, then please pass it along.



      Before I get started, I want to thank the community leaders who inspired my team to create the solution below.  Their work helped lead us to the approach that I will share today.




      The value proposition goes like this.  Analysts create insightful dashboards and datasources and publish them.  Tableau Server accumulates a mass of dashboards (hopefully neatly organized into projects), and then the need arises to peel back the onion, understand the data lineage collectively, and answer questions like:

      1. Our DW is replacing X with Y, how are all the dashboards affected?

      2. What vizzes have a particular data attribute so that I don't have to start from scratch?

      3. What is the underlying calculation and are calculations consistent across workbooks?

      4. What is the min and max value for a data dimension?

      5. Are there unused fields within published datasources?



      Sounds good.  So how does this solution work?  (open sourced at https://github.com/JeffStrauss18/TableauMetadataExtractor).

      (DISCLAIMER: What is on GitHub is provided AS-IS, with no guarantee.  It works with 10.1; if you want it to work with earlier versions, it will need a few small adjustments.)


      (DISCLAIMER: This project was not developed by Tableau, so don't ask Tableau support about it; it is not officially supported.)


      - Create a Metadata Postgres staging instance and a Metadata Postgres prod instance.  All the definitions are available.  It will hold a copy of core Tableau internal Postgres tables (e.g. workbooks) along with any necessary data enrichments.


      - Log into the internal Tableau Postgres as admin and issue a GRANT on pg_largeobject.  If you don't want to have to do this in the future, then go upvote the idea at https://community.tableau.com/ideas/4667


      - Create a dblink between the newly created Metadata Postgres staging instance and your Tableau internal Postgres.


      - *** Work with the C# Metadata Factory.  You will need to modify app.config to point at your metadata instance, compile it to an .exe, and schedule it at a reasonable frequency for your deployment.  This module is the secret sauce that populates the "fields" and "xref_workbook_datasource" tables.  At a high level, it unzips the XML from pg_largeobject, flattens it, retrieves statistics on cardinality and min and max values, and does a few data enrichments (e.g. flagging unused fields).  The executable drives the pull from the Tableau metadata into the Metadata Postgres staging instance, does the parsing, and then populates the Metadata Postgres prod instance.


      - Once the data is available within the Metadata Postgres prod schema, you can create whatever metadata reports you want.  Two examples are included on GitHub.
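
      To make the "unzip and flatten" step more concrete, here is a minimal sketch in Python.  It is not the Metadata Factory itself (which is C#) and it does not reproduce its logic or table layout.  The sample XML is a heavily trimmed, hypothetical workbook, the function names (make_sample_blob, read_workbook_xml, flatten_fields) are made up for illustration, and the "used" check is a naive assumption that only looks at worksheet dependencies.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Hypothetical, heavily trimmed workbook XML (assumption: real workbooks carry far more).
SAMPLE_TWB = """<workbook>
  <datasources>
    <datasource name='sales_ds' caption='Sales'>
      <column name='[Amount]' datatype='real' role='measure' />
      <column name='[Region]' datatype='string' role='dimension' />
      <column name='[Profit Ratio]' datatype='real' role='measure'>
        <calculation class='tableau' formula='SUM([Profit])/SUM([Amount])' />
      </column>
    </datasource>
  </datasources>
  <worksheets>
    <worksheet name='Overview'>
      <table><view>
        <datasource-dependencies datasource='sales_ds'>
          <column name='[Amount]' datatype='real' role='measure' />
        </datasource-dependencies>
      </view></table>
    </worksheet>
  </worksheets>
</workbook>"""


def make_sample_blob() -> bytes:
    """Zip the sample XML in memory, standing in for bytes read from pg_largeobject."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("sample.twb", SAMPLE_TWB)
    return buf.getvalue()


def read_workbook_xml(blob: bytes) -> str:
    """Return the workbook XML from raw bytes: either a zip archive or plain XML."""
    if zipfile.is_zipfile(io.BytesIO(blob)):
        with zipfile.ZipFile(io.BytesIO(blob)) as zf:
            # Take the first .twb entry found in the archive.
            name = next(n for n in zf.namelist() if n.endswith(".twb"))
            return zf.read(name).decode("utf-8")
    return blob.decode("utf-8")


def flatten_fields(xml_text: str) -> list:
    """Flatten datasource columns into one dict per field, flagging unused ones."""
    root = ET.fromstring(xml_text)
    # Naive assumption: a field counts as used if any worksheet lists it as a dependency.
    used = {
        (dep.get("datasource"), col.get("name"))
        for dep in root.iter("datasource-dependencies")
        for col in dep.findall("column")
    }
    rows = []
    for ds in root.findall("./datasources/datasource"):
        for col in ds.findall("column"):
            calc = col.find("calculation")
            rows.append({
                "datasource": ds.get("name"),
                "field": col.get("name"),
                "datatype": col.get("datatype"),
                "formula": calc.get("formula") if calc is not None else None,
                "used": (ds.get("name"), col.get("name")) in used,
            })
    return rows


if __name__ == "__main__":
    for row in flatten_fields(read_workbook_xml(make_sample_blob())):
        print(row)
```

      Rows shaped like these are the kind of thing that could be loaded into a flat table for reporting.  Questions such as "which fields are unused?" or "which workbooks share this calculation?" then become simple filters over that table.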





      Perfection?  No, probably far from it.  But this can be community driven.  If you have ideas, then let's see if we can work together.


      Will it work with future iterations of Tableau?  Don't know, but as of now we are committed to keeping it working as we upgrade.  Oh, and as I already said, it works with 10.1 and will work with earlier versions with a few small adjustments.


      What if there are errors?  It takes some setup to get it all functional, but once it's going, it's there.  Feel free to respond here or email me if you want at: jstrauss@conversantmedia.com or jeffstrauss18@gmail.com


      Cool?  If so, thanks.  If not, that's ok too.


      Want to see a demo?  See this post: Tableau Administrators Virtual User Group: February 24

        • 1. Re: Metadata Automation
          Trevor Jacobson

          Thank you, Jeff, for this outstanding solution for providing deep insight into the metadata for Tableau Server objects.  It is going to be unbelievably useful for us as we continue to expand our user base and number of published objects.  Since we first implemented Tableau Server about a year and a half ago, I have been struggling with how to document all of the many things that happen within workbook and datasource creation so that we can drive standardization and evaluate the impact of changes to data sources in our EDW.  Something like this seems like a virtual necessity for any enterprise implementation of Tableau Server.  I'm hoping that much of this functionality will become native in future releases, but this will do the trick for now.


          The materials provided via the GitHub link were pretty intuitive, even though I'm a novice when it comes to Postgres and programming.  With just a few very minor modifications (thanks, Jeff!) I was able to get this up and running on version 10.0.5, and I'm guessing the same modifications would get it going for versions back to at least 9.3.


          Well done!

          • 2. Re: Metadata Automation

            I started with this today and ran into an installation issue. I have Visual Studio 2017, and apparently the setup projects are disabled in that version. How do I get around that? Should I install a lower version?

            How do you set up the metadata Postgres instances? I'm trying to run this from a remote machine, as I don't want to mess with the Postgres instances on the Tableau Server box.

            • 3. Re: Metadata Automation
              Jeff Strauss

              In terms of the metadata Postgres instance, you should install it on a non-Tableau Server box, as you don't want to interfere with TS functionality.


              For the setup projects, google "microsoft visual studio installer projects" or try the "Microsoft Visual Studio 2015 Installer Projects" extension on the Visual Studio Marketplace.  I'm not sure if a 2017 version is available yet.



              If you have more questions, feel free to email at jstrauss@conversantmedia.com

              • 4. Re: Metadata Automation

                Thank you, Jeff, for this excellent program. It has proved invaluable for diving into our data sources and making them more efficient. The insights gathered from it have helped us talk to publishers and users and identify best-case performance across the board.


                I want to echo Trevor and wish this were made out-of-the-box functionality in Tableau.

                • 5. Re: Metadata Automation
                  Tony Chan

                  Hi, curious if anyone has used this with 10.5.1 and whether any code changes were required to work with 10.5?  We are interested in getting field lineage and seeing the impact of changing database fields.  Thanks, Tony

                  • 6. Re: Metadata Automation
                    Jeff Strauss

                    We are using it with 10.5.3, and no changes were required when we upgraded from 9.2.6.

                    • 7. Re: Metadata Automation
                      Tony Chan

                      Thanks Jeff

                      • 8. Re: Metadata Automation
                        Shon Thompson

                        Excellent tool, Jeff!  I just got it working on version 2018.3.  The workbooks are a great addition.  I'm curious why you chose to use the Tableau repository directly for the background tasks data source instead of going through the metadata factory?  I guess that data is more valuable if it's real-time, but I was curious if there was any other reason.  Thanks for sharing this!

                        • 9. Re: Metadata Automation
                          Jeff Strauss

                          Hey Shon.  Great to hear that it's working well for you.  For the background tasks, there was no need to use the metadata factory because the data is already easy to digest without having to parse it out.  We could just read it directly.