5 Replies Latest reply on Jul 23, 2018 12:27 PM by Ankit Patel

    Tableau Server Client python API not finishing extract refresh

    Ankit Patel

      Hi,

      I am trying to use the TSC api in python to execute a refresh of an extract that is associated with a workbook published on Tableau Server 10.5.

      The extract is tied to a data source on Amazon Athena.

      I am able to successfully "kick off" the refresh through my code, which I can tell from the change in the timestamp on the extract. But it never seems to finish: the timestamp doesn't change again, and I don't see any new data in the published workbook.

       

      I'm not sure if it is something with the Tableau Server itself having issues, or if I am programmatically making a mistake.

      Would anyone be able to shed light on it? (my code below)

       

      import tableauserverclient as TSC
      
      
      tableau_auth = TSC.TableauAuth('<user_id>', '<pw>', site_id='')
      server = TSC.Server('<tableau server address>')
      
      
      def refresh_tableau_dashboard(workbook_name, refresh=False):
          # default response, returned if no workbook with this name is found
          response = "Workbook not found: " + workbook_name
          with server.auth.sign_in(tableau_auth):
              server.version = '2.8'

              # request the workbooks on this Tableau server (get() returns one page of results)
              all_workbooks, pagination_item = server.workbooks.get()

              # loop through the workbooks to find the desired one and retrieve its workbook_id
              for workbook in all_workbooks:
                  if workbook.name == workbook_name:
                      # print a few key bits of information
                      print("Workbook Name: ", workbook.name)
                      print("Workbook ID: ", workbook.id)
                      print("created at: ", workbook.created_at)
                      print("last updated at: ", workbook.updated_at)
                      print("project name: ", workbook.project_name)

                      # this will trigger a refresh of the extract for the given workbook_id
                      if refresh:
                          try:
                              response = server.workbooks.refresh(workbook.id)
                          except Exception as e:
                              response = e
                      else:
                          response = "Refresh was not requested"
          return response
      
      
      # call the refresh
      refresh_tableau_dashboard("my_workbook_name", True)
      

      Thanks in advance!

        • 1. Re: Tableau Server Client python API not finishing extract refresh
          Dylan Bergey

          Ankit -

           

          Have you been able to refresh the extract manually? How long did it take? How large is the source data?

           

          My understanding of the Python data extract API is that it doesn't handle strings very well, and that when creating an extract it's essentially creating it cell by cell.

           

          Do you have to use python, and do you have to use Athena? Could you move the data somewhere else?

           

          Take a look at the thread I linked below.

           

          Performance Issues with Python and Tableau Data Extract API

          • 2. Re: Tableau Server Client python API not finishing extract refresh
            Ankit Patel

            Hi Dylan,

             

            Appreciate your prompt response!

             

            Manually refreshing it takes 6 minutes.

            The data set is 45 columns by approx 700k rows, and about half of my columns are string/text type.

             

            A little bit more background on our broader process:

            We are running some ETL in Pyspark and would prefer to stick to python.

            We deposit our finalized data sets on S3 and plug into them with our dashboards, from which we create extracts.

            The Spark jobs are called on-demand, so having scheduled extract refreshes won’t work for us, which is why I’d like to use Python to kick off the refresh once the Spark portion is done.

             

            We were contemplating trying to use Redshift Spectrum instead of Athena to bypass any potential timeout issues, but haven't tested that yet. But given that the manual refresh takes 6 minutes, I doubt we should have timeout issues on this.

             

            To your point about Python API having issues with strings, forgive my ignorance, but doesn't using the API in any language simply trigger the process of a full extract refresh as if I had done it manually?... and then Tableau server goes to work?
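            Since the refresh call only queues work on the server, one way to tell whether it actually finished is to poll the job it started. Below is a minimal sketch of such a polling loop, not a definitive implementation. The helper name `wait_for_job`, its parameters, and the assumption that the job object exposes a `completed_at` attribute (None while still running) are illustrative and not taken from the code above.

```python
import time


def wait_for_job(fetch_job, job_id, poll_seconds=15.0, timeout_seconds=1800.0,
                 sleep=time.sleep):
    """Poll until the job reports a completion time, or raise TimeoutError.

    fetch_job is any callable that takes a job id and returns an object with a
    `completed_at` attribute (assumed to be None while the job is running).
    """
    waited = 0.0
    while True:
        job = fetch_job(job_id)
        if getattr(job, "completed_at", None) is not None:
            return job  # the server has recorded a completion time
        if waited >= timeout_seconds:
            raise TimeoutError(
                "job %s still running after %.0f seconds" % (job_id, waited)
            )
        sleep(poll_seconds)
        waited += poll_seconds
```

            With TSC this might be used as `wait_for_job(server.jobs.get_by_id, job.id)` inside the signed-in block, where `job` is whatever `server.workbooks.refresh(workbook.id)` returned. That assumes your TSC version returns a job item from the refresh call and provides `server.jobs.get_by_id`, so check both against the version you have installed.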

             

            Thank you for the link, I’ll explore what they’ve tried and see what can be applied to our stack.

            • 3. Re: Tableau Server Client python API not finishing extract refresh
              Dylan Bergey

              Yes, forgive my mistake. The issue of the API creating an extract cell by cell only pertains to a local extract, not one created via Tableau Server.

               

              Have you tried setting up a job that utilizes tabcmd?

               

              tabcmd Commands
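              For anyone who wants to stay in Python but try the tabcmd route, here is a rough sketch that shells out to `tabcmd refreshextracts`. It assumes tabcmd is installed and on the PATH, that a `tabcmd login` session already exists, and that your tabcmd version supports the `--workbook`, `--project` and `--synchronous` options. The function names are hypothetical.

```python
import subprocess


def build_refresh_command(workbook_name, project=None, synchronous=True):
    """Build the argument list for a tabcmd extract refresh (no side effects)."""
    cmd = ["tabcmd", "refreshextracts", "--workbook", workbook_name]
    if project:
        # only needed when the workbook is not in the default project
        cmd += ["--project", project]
    if synchronous:
        # ask tabcmd to wait for the refresh instead of returning immediately
        cmd.append("--synchronous")
    return cmd


def refresh_with_tabcmd(workbook_name, project=None):
    """Run the refresh; raises CalledProcessError if tabcmd exits non-zero.

    Assumes `tabcmd login` has already been run in this environment.
    """
    return subprocess.run(build_refresh_command(workbook_name, project), check=True)
```

              Keeping the command construction separate from the `subprocess.run` call makes it easy to print and inspect the exact command line before running it against the server.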

               

              Also, what about a live connection to Athena? I haven't worked extensively with Athena in over a year, but depending on your data size and server size, you might be able to get away with a live connection.

              • 4. Re: Tableau Server Client python API not finishing extract refresh
                Ankit Patel

                We are currently using a live connection to Athena, which creates a slow user experience when interacting with the dashboard.

                This is why we are trying to create an extract. This dataset will continue to grow as time goes on, another reason live connection wouldn't suit us.

                 

                Tabcmd is an option we'll explore, but so far it seems very involved to set up. I was hoping that the Python API call would be a nice clean way to just trigger the refresh process.

                Will see how involved tabcmd is and report back.

                Thanks again!

                • 5. Re: Tableau Server Client python API not finishing extract refresh
                  Ankit Patel

                  Ultimately we stuck with the original code, as it finally seemed to do what we wanted. Strange, but our Tableau Server seemed to be having issues the week we were testing. Dylan's tabcmd solution also seems to have legs, although we didn't fully flesh out a solution there. Thank you.