1 Reply Latest reply on Aug 2, 2018 9:40 PM by patrick.byrne.0

    How to Use a DB2 Stored Procedure as a Tableau Data Source

    David Taylor

      About This Article

      Over the past few weeks, I was assigned the task of connecting Tableau to a DB2 Stored Procedure. This interaction needed to be a live connection, and it needed to accept parameters, just as Tableau supports live connections, with parameters, to a MSSQL or Oracle Stored Procedure. This is not currently supported, but I found a nifty workaround based on using MSSQL as a proxy between Tableau and DB2. I called this technique CLADS ("Command Line As Data Source"), and it is documented in another discussion here. This article will provide specifics for my relevant DB2 use case.

       

      If this article is helpful to you, please click "like" and maybe add a comment, so others can see this article a little more readily in search results.

      Step-by-Step

      1. My DB2 stored procedure reads as such:

      CREATE or replace PROCEDURE TaylorProc2
      (IN in_pn varchar(200))
        SPECIFIC TaylorProc2
        LANGUAGE SQL 
        DYNAMIC RESULT SETS 1
        Re:  BEGIN
          DECLARE c_emp CURSOR WITH RETURN FOR
      select * from "Batters" where "Player"=in_pn;
          OPEN c_emp;
        END Re
      

      It's pretty simple: given a single parameter, a player name, the procedure returns a row for each year of that player's career, for each team he played for. Additional columns relate the number of games, hits, runs, etc:

      2. I then authored some Python code to CALL that stored procedure. The Python code accepts six string parameters: the first will be assumed to be the name of the stored procedure (TaylorProc2 in my case). The six subsequent parameters will be added to the CALL if and only if they are not dashes. ('-').

      Example: if the Python script is called with the syntax:

      db2sp.py TaylorProc2 'Kirby Pucket' '-' '-' '-' '-'
      

      The statement sent to DB2 will be

      CALL TaylorProc2 'Kirby Puckett'
      

      whereas if the Python script is called with the syntax:

      db2sp.py myProc 'Order123' '2015-06-30' '-' '-' '-'
      

      The statement sent to DB2 will be

      CALL myProc 'Kirby Puckett' '2015-06-30'
      
      

       

      The Python code, which I named "db2sp.py", follows:

      # Import a couple of Python libraries
      import ibm_db
      import sys
      
      # For debugging: display the arguments
      print ('#Arguments: ', str(sys.argv))
      
      # Set up connection object
      conn = ibm_db.connect('DATABASE=TESTV1;HOSTNAME=db2server.test.aiy.lan;PORT=50000;PROTOCOL=TCPIP;UID=test;PWD=my_password;','','')
      
      # If the connection is successful...
      if conn:
          # Set up the CALL statement.
          sql = "CALL " 
          # The first argument must be the name of the stored procedure
          sql += str(sys.argv[1]) 
          sql += " ("
          # Arguments will subsequently be added if and only if they are not '-'. They need to be sequential (a first is required before you can add a second, etc.)
          if not str(sys.argv[2]) == '-':
              sql += "'" + str(sys.argv[2]) + "' "
          if not str(sys.argv[3]) == '-':
              sql += ",'" + str(sys.argv[3]) + "' "
          if not str(sys.argv[4]) == '-':
              sql += ",'" + str(sys.argv[4]) + "' "
          if not str(sys.argv[5]) == '-':
              sql += ",'" + str(sys.argv[5]) + "' "
          if not str(sys.argv[6]) == '-':
              sql += ",'" + str(sys.argv[6]) + "' "
          sql += ")"
      
      #For debugging: display the CALL statement
          print ('#SQL: '+sql)
      
          # Execute the CALL, and fetch the first rows
          stmt = ibm_db.exec_immediate(conn, sql)
          row = ibm_db.fetch_tuple(stmt)
          while row != False :
              print (row)
              row = ibm_db.fetch_tuple(stmt)
      

      Python assumes the procedure has only one result set, and it's not extremely robust. It would be nice to have a little better handling of the parameters and their data typing, for example, but I'm confident it would be pretty easy to enhance to meet most customer requirements.

       

      3. I then authored a MSSQL Stored Procedure to call Python. This is slightly more advanced than the "standard" CLADS stored procedure, so it is provided here. All procedures called by Tableau are assumed to have only one result set, and this procedure is no different.

      USE [miscdata]
      GO
      /****** Object:  StoredProcedure [dbo].[db2sp]    Script Date: 7/20/2018 4:51:53 PM ******/
      SET ANSI_NULLS ON
      GO
      SET QUOTED_IDENTIFIER ON
      GO
      ALTER PROCEDURE [dbo].[db2sp] 
      -- Add the parameters for the stored procedure here
      @string_procedure_name varchar(50),
      @string_parameter1 varchar(255),
      @string_parameter2 varchar(255),
      @string_parameter3 varchar(255),
      @string_parameter4 varchar(255),
      @string_parameter5 varchar(255)
      AS
      DECLARE @os_command varchar(255)
      BEGIN
      -- SET NOCOUNT ON added to prevent extra result sets from
      -- interfering with SELECT statements.
      SET NOCOUNT ON;
      
      
      -- assemble the command to be sent to the OS. Tableau should always pass six parameters. Put '-' into unused parameter slots. The path to python.exe and the PY script are hard-coded herein to deter malicious behavior.
      SET @os_command = 'C:\...\python.exe C:\myscripts\db2sp.py "' 
      + @string_procedure_name + '" "'
      + @string_parameter1 + '" "'
      + @string_parameter2 + '" "'
      + @string_parameter3 + '" "'
      + @string_parameter4 + '" "'
      + @string_parameter5 
      + '"'
      
      
          -- Create a temp table.  
      DECLARE @MyTempTable TABLE
      (id int identity(1,1), output nvarchar(2000) null)
      
      
      --Call the @os_command and put its output into the temp table.
      insert @MyTempTable (output) exec xp_cmdshell @os_command
      
      
      --Return the temp table to the calling procedure
      select * from @MyTempTable where output is not null order by id
      
      
      END
      

      In my environment, I had to change permissions a bit: the MSSQL service account lacked the permissions to call Python, so I elevated that user to have the requisite permissions.

       

      4. At this point, you have a fully supported Tableau data source, so simply set up Tableau to query this MSSQL (not DB2) stored procedure, db2sp, and satisfy its six parameters. The procedure returns the data in one comma-delimited string, so you'll need some parsing logic to break up the string. Here's mine working in Tableau Desktop 2018.1:

      Epilogue

      I hope you find this helpful! Please click Like, and/or add a comment if you have any questions, clarifications, enhancements, or examples of good use cases for this or for CLADS in general.