5 Replies Latest reply on Oct 9, 2018 6:18 AM by Neha Tharani

    Tableau and Python

    Neha Tharani

      Hello All,

      This is my first question here.

      I have a Python script that gives me the required output. However, the moment I run it in Tableau (using SCRIPT_REAL), it fails with "TypeError: Object of type DataFrame is not JSON serializable". The script is below:

       

      SCRIPT_REAL("
      import quants_core
      import numpy as np
      import pandas as pd
      from datetime import datetime
      from quants_core.stats import get_stats_all

      coll = pd.read_excel(r'C:\Users\ntharani\Desktop\Data.xlsx', index_col=None, na_values=['NA'], usecols = 'A,D')
      print(coll.head())
      coll.set_index('Dates', inplace=True)
      coll.index = pd.to_datetime(coll.index, format='%Y-%m-%dT%H:%M:%S.%fZ')
      #print(coll)   # Returns are read and stored here

      yy = get_stats_all(coll)
      zz = yy.transpose()
      zz = yy.values.T.flatten()
      print(zz)
      return zz.values.tolist()
      ",[Return])

       

      The get_stats_all function returns a DataFrame with 1 row and 22 columns. The output is displayed on the command line, but it is not displayed in Tableau views.

       

      Any suggestions or improvements are appreciated. Thanks in advance.

       

       

      PS: I cannot attach workbook due to confidentiality issues.

       


        • 1. Re: Tableau and Python
          Jeff Strauss

          What parameters are you passing to TabPy? Can you try adding .tolist() to your return, changing return(yy) to return(yy.tolist())? If this doesn't work, it may help with troubleshooting if you can attach your workbook and xlsx.

          • 2. Re: Tableau and Python
            Jonathan Drummey

            Hi Neha,

             

            I'm barely a Python user, but I do know Tableau fairly well, so hopefully I can help. Tableau's SCRIPT_ functions each return a single field, so if you are using SCRIPT_REAL then it's expecting to be given a single number. I'm guessing the get_stats_all() function is returning a data frame, and the error message that Tableau is generating is effectively saying that it can't turn the data frame into a single vector of numbers.

             

            Assuming that's what's happening then there are two options:

             

            1) Add another line or two of code to process the data frame to return a single field/vector's values. If you're looking to return multiple values from get_stats_all() then you'd need to have multiple calculated fields, each with their own call to a SCRIPT_ function.

             

            2) The alternative would be to use a SCRIPT_STRING() - potentially with some extra code to format the output - so you can return a single string and then in downstream calculations parse out those results into separate numbers.
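The difference between the failing return value and a usable one can be sketched in plain Python. This is a minimal illustration, not the TabPy internals: the error text suggests the script's return value is serialized to JSON, and the `stats` frame and its column names below are hypothetical stand-ins for the one-row result of get_stats_all().

```python
import json

import pandas as pd

# Hypothetical stand-in for the one-row DataFrame from get_stats_all().
stats = pd.DataFrame({"mean": [0.012], "vol": [0.15], "sharpe": [0.8]})

# Handing the DataFrame itself to the JSON encoder raises a TypeError.
try:
    json.dumps(stats)
    dataframe_error = None
except TypeError as exc:
    dataframe_error = str(exc)

# Series.tolist() yields plain Python floats, which serialize cleanly.
row_values = stats.iloc[0].tolist()
payload = json.dumps(row_values)

# Option 1: one statistic per calculated field, as a one-element list.
sharpe_only = [float(stats["sharpe"].iloc[0])]
```

With option 1, each calculated field would return something shaped like `sharpe_only`, picking a different column each time.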

             

            More generally, I don't know what statistics get_stats_all() is returning. If it's summary statistics, then it's possible to build those calculations in Tableau without using Python, so you might check that out.

             

            Hope this helps!

             

            Jonathan

            • 3. Re: Tableau and Python
              Neha Tharani

              Hello Jonathan,

               

              Thank you for your response. You got it correct: it does return a DataFrame with summarized statistics, and yes, it does return multiple values.

               

              About using tableau to calculate the stats, I am instructed to use python only.

               

              1) Using multiple SCRIPT_REAL calls doesn't seem ideal because the total number of stats returned is 22 (and might increase in future).

               

              2) I think SCRIPT_STRING is the way to go. Could you please elaborate on "potentially with some extra code to format the output - so you can return a single string and then in downstream calculations parse out those results into separate numbers"?

               

              Thank you.

              • 4. Re: Tableau and Python
                Jonathan Drummey

                Hi Neha,

                 

                In your Python code you'd need to turn the data frame into a string of numbers (and potentially text) with defined separators, maybe something like:

                 

                1.0~23~~3.14~~~red
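A string like the one above could be produced on the Python side roughly as follows. This is a sketch under assumptions: the helper names `join_growing` and `split_growing` are hypothetical, and the rule that each gap uses one more tilde than the previous one is read off the example string rather than stated in the thread. It also assumes no value contains the separator character.

```python
def join_growing(values, sep="~"):
    """Join values so that the k-th gap holds k copies of sep."""
    pieces = []
    for position, value in enumerate(values):
        # position 0 gets no separator, position 1 gets one, and so on.
        pieces.append(sep * position + str(value))
    return "".join(pieces)


def split_growing(text, count, sep="~"):
    """Recover `count` (>= 1) fields from a string built by join_growing."""
    fields = []
    rest = text
    for width in range(1, count):
        # Cut at the first run of `width` separators, keep the remainder.
        head, _, rest = rest.partition(sep * width)
        fields.append(head)
    fields.append(rest)
    return fields


packed = join_growing([1.0, 23, 3.14, "red"])
unpacked = split_growing(packed, 4)
```

The round trip through `split_growing` is only a way to check the string in Python before building the Tableau calculations; the fields come back as strings and would still need converting to numbers.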

                 

                Then you'd need a set of N calculations (four in this example) to get each value. Here's a screenshot:

                 

                [Screenshot of the calculations: Screen Shot 2018-10-08 at 11.43.15 AM.png]

                 

                As noted in the comment, the formula is more complicated because (as of at least Tableau v2018.3) SPLIT, FINDNTH, and REGEX_EXTRACT_NTH aren't supported by table calculations, and the SCRIPT functions that Python is executed in are table calculations. Vote up https://community.tableau.com/ideas/6239 if you'd like to see that supported.

                 

                Note that if the Python script returns a different number of variables on each run, there's no automatic way to parse them out into a variable number of calculated fields. You'll need to build as many calculations as you expect to have variables.

                 

                I've attached a v10.5 workbook I used to make the screenshot above.

                 

                Jonathan

                 

                 

                PS: I know this is really painful. I had a conversation with a Tableau developer last month and I know they are looking into this. I'll also forward this thread as an example.

                • 5. Re: Tableau and Python
                  Neha Tharani

                  Dear Jonathan,

                   

                  This is really helpful. Thanks a lot for the reply, appreciate it.