    Extract Doubles in Size when Refreshed with Extract API

    Jeff O

      Hi All,


      I have an extract (Hyper, not TDE) that is about 500 KB when first created using Python and the Tableau Extract API. The extract's single table holds about 9,500 rows. When I refresh it by appending only 200 rows to that table, the .hyper file almost doubles in size to 960 KB, and I cannot figure out why. This is an issue because we'll have much bigger extracts to deal with in the future.


      I have tried to debug it and have figured out that it must have something to do with the schema.  Even when I try to add a single blank row, as seen in the code sample below, the size still increases from 500 to 960 kb.
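
      For anyone trying to reproduce this, here is a minimal sketch of how the size change can be measured around a single operation. The helper name size_delta_kb is hypothetical, and it uses only the Python standard library, not the Extract API. It also assumes the change is already visible on disk when the operation returns.

```python
import os


def size_delta_kb(path, action):
    """Run action() and return (before_kb, after_kb) for the file at path.

    `path` and `action` are placeholders: pass the .hyper file path and a
    zero-argument callable wrapping the step being measured.
    """
    before = os.path.getsize(path) / 1024.0
    action()
    after = os.path.getsize(path) / 1024.0
    return before, after
```

      Wrapping just the insert step in the callable should show whether that one call accounts for the whole jump.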


      Here is the code I'm using to try and update the extract.  Thanks in advance for any help:


      from tableausdk import *
      from tableausdk.HyperExtract import *

      if extract.hasTable('ExisitingTable'):
        table = extract.openTable('ExisitingTable')
        schema = table.getTableDefinition()



        # (Create New Table code omitted)



      # Append new rows to the existing table. The size still almost doubles even with a blank row. I removed the code that adds the actual data to the rows for readability, since the size still blew up without it.


      for i, r in final.iloc[:1, :].iterrows():
         row = Row(schema)
         table.insert(row)  # The size increase takes place at this line.


      # close the extract so we can use it.