6 Replies Latest reply on Sep 4, 2018 4:28 AM by Philip Shemella

    Desktop - switch between datasources with identical data motel

    Philip Shemella

      I have multiple clients with whom I share a TWBX file, generated by my local installation of Tableau Desktop. The reports are 100% identical but the data differs based on the client.

       

      For versioning purposes, I want (need?) to have only one TWB file, where any small change can be made to the master file and doesn't need to be replicated between multiple client files.

       

      I can choose where and what is my data source, which includes multiple tables that are joined in Tableau:

      • CSV, where the full path of the CSV file is unique to the client: /full_path/client_1/data.csv
      • CSV, where the file name of the CSV file is unique to the client: /full_path/client_1_data.csv
      • SQL (postgresql), where the schema is unique to the client: select * from client_1.data_t
      • SQL (postgresql), where the table name is unique to the client: select * from default.client_1_data_t
      • SQL (postgresql), where the database is unique to the client (haven't tried yet)
      • SQL (postgresql), where parameter defines the custom SQL connection (haven't tried yet)
      • BigQuery (perhaps with unique G+ login?)

       

      From the group of 5-6 files or tables, I do several joins and have approx. 4 connections (so manually switching each connection each time I want to switch between clients is cumbersome).

      Screen Shot 2018-09-04 at 10.02.36.png

       

      Is there an easy way to switch data sources (parameter?!) so I don't have to go into the settings of my report each time?

       

      Because of sensitivity of client data, I can't create CSV/SQL tables of the entire dataset and then just filter each report (since the underlying data of all clients is still there). There seems to be no way to create a multi-datasource filter that is driven by a parameter.

       

      I would prefer CSV, since everything is local and the writing time is faster than the local Postgresql and much faster than BigQuery.

       

      (using Mac Desktop 2018.2)

        • 1. Re: Desktop - switch between datasources with identical data motel
          Chris McClellan

          Philip Shemella wrote:

           

          There seems to be no way to create a multi-datasource filter that is driven by a parameter.

           

           

          Why not create a field to store this in the data ?

           

          There's a few ways to solve this, but the simplest might be a "client" filter in the data and then you simply change the filter, create the extract and save the workbook.

           

          You could also play with the XML of the TWB file if you wanted to use either of your first 2 methods.  Keep a backup just in case, but I would suggest creating a batch file (maybe in Python) to change the data source details in the XML.

          • 2. Re: Desktop - switch between datasources with identical data motel
            Philip Shemella

            If I export ALL data, then I do indeed have a field that I can filter on. But the filter needs to be at the highest level (the data connection), and not at the report level.

             

            At the moment, I'm trying out with a parameter in the 4 data source filters. It should work in theory, but I'm getting strange results after I refresh (no data). It doesn't refresh automatically when I change the parameter.

             

            Screen Shot 2018-09-04 at 10.21.32.png

            • 3. Re: Desktop - switch between datasources with identical data motel
              Philip Shemella

              I did a test of the data source filters. With one client selected, by parameter, and a live CSV datasource (no extract), I did an "Export Packaged Workbook". The data that the report was showing was correct, but I wanted to see what data was shipped with the file.

               

              With that test file, which should have a subset of the data, I renamed from TWBX to ZIP, unpackaged, when to the datasources folder, opened a CSV, and found ALL the data.

               

              So on privacy grounds, the data source filter doesn't seem to be very useful in my case with CSV files. Additionaly, sharing a workbook with a parameter for other clients isn't desirable (but it could be hidden).

               

              Here's my parameter-based data source filter:

               

              Screen Shot 2018-09-04 at 10.21.32.png

               

               

               

              Screen Shot 2018-09-04 at 10.39.53.png

              • 4. Re: Desktop - switch between datasources with identical data motel
                Philip Shemella

                Next attempt is to connect to Postgresql instead of CSV files (I can choose where and how data is written).

                 

                Since the schema determines the "path" of the tables, and the tables are identically named, I've set a custom SQL query with a parameter to determine the schema.

                 

                When I type the schema name in manually, it works. When I assign it from a parameter, it does not.

                 

                Screen Shot 2018-09-04 at 10.49.16.png

                 

                 

                Not possible due to this (preventing SQL injection)

                 

                Re: Parameters in Custom SQL "from" clause?

                • 5. Re: Desktop - switch between datasources with identical data motel
                  Philip Shemella

                  What does work is to make a SQL connection to ALL the data with a data source filter that includes the parameter matching.

                   

                  Then from the TWB file do an "Export as packaged workbook". This workbook, as TWBX, contains a .hyper extract that only has the relevant data.

                   

                  I'll leave the question open because I'm still looking to do it on a per-client basis, and not from the full dataset and parsed down per customer. If I use Tableau Online, then I would have to carefully check what data is uploaded and is available (even if hidden) to the user.

                   

                   

                  UPDATE: since I am using Postgresql instead of CSV, I will create unique databases, and when I open the TWB file, because I am prompted for the DB password, will manually set which client the report corresponds to. Then I can be sure there is no cross-contamination of the data with data sources filters.

                   

                   

                  UPDATE2: it does work to switch databases when opening a report, but it's not so consistently asking for log-ins, so I have to manually "edit connection" for each connection each time I want to work with a different client's data.

                  • 6. Re: Desktop - switch between datasources with identical data motel
                    Philip Shemella

                    UDPATE3: Using Tableau's document API (python), I was able to make a modified script that loops over all connections and sets them to a new parameter. This is much safer and easier than modifying multiple connections in Tableau Desktop itself.

                     

                    Note that at the time of writing, documentation has the object "sourceWB._datasources" without the underscore, which is an error.

                     

                    #!/usr/bin/env python
                    # -*- coding: utf-8 -*-
                    
                    # modified from: http://tableau.github.io/document-api-python/docs/
                    
                    from __future__ import print_function
                    from tableaudocumentapi import Workbook
                    
                    def main():
                        file_path = 'report.twb'
                        db_name = 'new_db'
                        sourceWB = Workbook(file_path)
                    
                        for i in xrange(len(sourceWB._datasources)):
                            print ('INFO: datasource',str(i))
                            for j in xrange(len(sourceWB._datasources[i].connections)):
                                print ('INFO: connection',str(j))
                    
                                if sourceWB._datasources[i].connections[j].dbname != db_name:
                                    print('INFO: old',sourceWB._datasources[i].connections[j].dbname)
                                    sourceWB._datasources[i].connections[j].dbname = db_name
                                    print('INFO: new', sourceWB._datasources[i].connections[j].dbname)
                                    print() 
                                else:
                                    print('INFO: nothing to change')
                    
                                # other db params to change
                                #sourceWB._datasources[i].connections[j].server = "MY-NEW-SERVER"
                                #sourceWB._datasources[i].connections[j].username = "postgres"
                    
                        sourceWB.save()
                    
                    if __name__ == "__main__":
                    
                        main()