    Large Tableau extract freezes before completing

    Gordon Douglas

      In Tableau Desktop I have a workbook connected to an Oracle database that pulls in a large number of records (~60 million). The extract draws on a dozen or so tables with a fair number of joins. I have read-only access to this database and no changes to it are allowed, so I can't create views or have the DB admin create them.

      During what appears to be the final step of the extract creation process, the desktop application hangs and Windows reports it as not responding. I say "final step" because the extract file is already in its location, and if I open it as a data source in another workbook, the data all appears to be there and complete. This step also seems to come after all of the steps where the extract process queries the database: I can disconnect from our VPN and Tableau doesn't report any problem connecting to the data source.

      Process Explorer shows that Tableau is not actually hung but is hitting the disk hard with writes. When I look at the temporary files location, it is writing a file called spool.randomnumberhere.tmp, which grows indefinitely until it fills the disk. At that point Tableau becomes responsive again and tells me that the connection to the data source has been lost. The extract itself is only about 300 MB, but the spool file seems to have no limit. I tried adding more disk space in case it really did need more, but it just fills all available space (it easily eats hundreds of GB).

      If I reduce the dataset, the process completes successfully, but I'm not sure where the limit is that allows it to complete, and I end up missing some of the data I need for the workbook. The spool file only seems to grow indefinitely above some threshold. I did not change my queries other than adding a WHERE clause to limit the records.
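For anyone trying to find the threshold where the spool file starts to run away, a small script can record how fast it grows during an extract. This is a minimal sketch, not anything Tableau provides: the `spool.*.tmp` pattern is taken from the file name described above, while the helper names (`spool_usage`, `growth`, `watch`) and the directory you pass in are assumptions to adapt to your own temporary files location.

```python
import glob
import os
import time


def spool_usage(temp_dir, pattern="spool.*.tmp"):
    """Return {path: size_in_bytes} for spool files currently in temp_dir."""
    sizes = {}
    for path in glob.glob(os.path.join(temp_dir, pattern)):
        try:
            sizes[path] = os.path.getsize(path)
        except OSError:
            # The file may disappear between listing and reading its size.
            continue
    return sizes


def growth(before, after):
    """Bytes added per file between two snapshots (new files count from zero)."""
    return {path: size - before.get(path, 0) for path, size in after.items()}


def watch(temp_dir, interval_seconds=10, samples=6):
    """Print spool file size and growth every interval_seconds, samples times."""
    previous = spool_usage(temp_dir)
    for _ in range(samples):
        time.sleep(interval_seconds)
        current = spool_usage(temp_dir)
        for path, delta in growth(previous, current).items():
            size_gb = current[path] / 1e9
            print(f"{path}: {size_gb:.2f} GB (+{delta / 1e6:.1f} MB)")
        previous = current
```

Calling `watch(...)` with the temporary files directory while the extract is in its last step would show whether the file levels off (a genuinely large but finite spool) or keeps growing at a steady rate.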


      Desktop version 2019.4.1 on Windows 10.

          Gordon Douglas

          After some further investigation, I found that this happens only at the end of the extract process, when Tableau runs a sum query

          SELECT SUM(1) AS "cnt:TEMP(one)(290714814)(0):qk"

          FROM "Extract"."fullextractquerydefinitionhere"

          HAVING (COUNT(1) > 0)

          This query simply fills the disk to capacity with that spool tmp file.  The query I am extracting from expands records: a table that has a start date and an end date is cross joined to a table that lists every date in between, so that several other date-specific tables can be matched against the individual dates.  I am running the extract as a multiple-table extract to avoid having Tableau explode all the dates out in a single-table extract.  I've tried about half a dozen iterations of the queries, but they all cause this problem when I take an extract.
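To make the row expansion concrete, here is a minimal sketch of what joining a start/end date table to a calendar table does to the row count. The sample rows and the function names are hypothetical, and the sketch assumes the range is inclusive of both the start and end date; it illustrates the shape of the join, not the actual queries in the workbook.

```python
from datetime import date, timedelta


def expand_range(row_id, start, end):
    """Yield one (row_id, day) pair per calendar day from start to end inclusive."""
    day = start
    while day <= end:
        yield (row_id, day)
        day += timedelta(days=1)


def expanded_row_count(ranges):
    """Count the rows the expansion would produce without materializing them."""
    return sum((end - start).days + 1 for _, start, end in ranges if end >= start)


# Hypothetical sample data: (id, start_date, end_date)
ranges = [
    (1, date(2019, 1, 1), date(2019, 1, 3)),    # 3 days
    (2, date(2019, 1, 1), date(2019, 12, 31)),  # 365 days
]

rows = [pair for r in ranges for pair in expand_range(*r)]
print(len(rows), expanded_row_count(ranges))
```

Two source rows become 368 rows here, which is why a few million date ranges can turn into tens of millions of records once every day is enumerated, and far more if a query over the joined tables has to materialize the expansion again.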


          I've seen other forum topics about using connection customizations to prevent this sum query from being run against a database, but I don't see any way to stop it from being run against the extract.