1 Reply Latest reply on Aug 3, 2018 9:04 AM by patrick.byrne.0

    How to use any command line program as a Tableau data source

    David Taylor

      About this article

      This article will discuss a technique that I uncovered recently that will allow Tableau to connect to virtually anything as a data source. Moreover, unlike the Web Data Connector, this technique supports live connections from Tableau server that are parameterized. The technique requires an instance of Microsoft SQL Server to serve as a proxy between Tableau and the data source. The only requirement for the underlying data source is that it must be something callable from the command line that returns output back to the command line console. As such, I'm going to give this technique the horrible acronym CLADS: "Command Line As Data Source"

       

      I used CLADS to query a DB2 stored procedure, which isn't supported by Tableau. I will use that as an example herein. My stored procedure returned career statistics for any baseball player, so it required exactly one string parameter: the player name. I authored another article here to break down that specific DB2 use case in detail, but in this article I focus just on CLADS and leave the DB2 stuff out of it.

       

      If this article is helpful to you, please click "like" and maybe add a comment, so others will see this article a little more easily in search results.

      How does it work?

      Step-by-step at a high level, here's how CLADS works:

      1. Tableau is configured to query a MSSQL stored procedure. Thereby, Tableau discerns from MSSQL what parameters are expected by the stored procedure, and how to satisfy each of them with either a static value or a parameter. The author can specify whether to use a Live or Extract connection as usual, and set up whatever authentication to MSSQL is required. This is all very standard behavior. Then Tableau calls an EXECUTE command on MSSQL, passing in the relevant parameters. So, for my example, Tableau called out to MSSQL with a command like this: "EXECUTE baseball_stored_procedure 'Kirby Puckett'".
      2. MSSQL now does a couple of things: first, it creates a simple temp table to store results. Then, things get exciting: our stored procedure calls out to the OS to do something else, via another procedure called "xp_cmdshell". Xp_cmdshell can take parameters, too, so before calling it, the stored procedure code needs to convert Tableau's parameters to the proper syntax. In my example, there wasn't much that needed to happen here. MSSQL called a Python script to work its magic: "insert #output (output) exec python.exe C:\myscripts\db2sp.py  'Kirby Puckett'"
      3. Now it's simply the job of the OS to issue the command, and return the results to standard output. In my case, Python queried DB2 and returned back a bunch of career statistics on one of my childhood heroes:
      4. MSSQL then stores all those results in a temp table, and returns the entire temp table back to Tableau. The temp table stores each CMD line as a single field, so Tableau would probably need to parse it to make it useful, but that's very easy to do in Tableau Desktop. In my example, I parsed "('Kirby Puckett', 'Minnesota Twins', 'AL', 1987)" into four columns using the SPLIT, MID, and similar functions.

      Step-by-step example

      1. Befriend your MSSQL administrator and ask him or her to create the following MSSQL stored procedure. This procedure doesn't do much -- it issues a "DIR" command returning a list of all the files in the directory of your choosing. The @string_cmd parameter reflects the directory. Note this code allows for some malicious behavior -- we're allowing MSSQL to execute an OS command, and we're just trusting that the OS command isn't 'DIR c:\temp & DEL c:\*', which would delete the entire c-drive. So please set this up carefully. It's also possible that your MSSQL instance has xp_cmdshell turned off entirely, or that the service account under which MSSQL runs has very limited privileges, so you may have to muddle your way through that as well.

       

      CREATE PROCEDURE [dbo].[clads] 
      -- Add the parameters for the stored procedure here
      @string_cmd varchar(255)
      AS
      DECLARE @os_command varchar(255)
      BEGIN
      -- SET NOCOUNT ON added to prevent extra result sets from
      -- interfering with SELECT statements.
      SET NOCOUNT ON;
      
      -- assemble the command to be sent to the OS. 
      SET @os_command = 'dir ' + @string_cmd
      
          -- Create a temp table.  
      DECLARE @MyTempTable TABLE
      (id int identity(1,1), output nvarchar(2000) null)
      
      --Call the @os_command and put its output into the temp table.
      insert @MyTempTable (output) exec xp_cmdshell @os_command
      
      --Return the temp table to the calling procedure
      select * from @MyTempTable where output is not null order by id
      
      END
      

      2. Test your new stored procedure from a MSSQL query window, using the command: exec clads 'c:\windows' or similar.

       

      3. In a new workbook in Tableau Desktop, choose "Connect to Data" / "Microsoft SQL Server", then enter your credentials.

       

      4. Drag the "clads" stored procedure over to the canvas, and specify your directory when prompted for parameters. "C:\windows" should be a good one. Click "Update Now" to see your result set! Author whatever dashboard you feel most compellingly displays the files in your directory of choice.

       

       

      I hope that's helpful! I plan to author a companion article giving a step-by-step example of my specific Python/DB2 use case. And I welcome comments on other use cases where this might be useful. Thanks for reading!