1 Reply Latest reply on Oct 20, 2017 9:45 AM by Tom W

    Appending Tableau Server extract refreshes

    Matt Schutz

      We are finally moving our team toward using Tableau Server data sources with automated/scheduled extract refreshes rather than embedded data sources.  A great first step!

       

      What we have realized is that we have so much data that full refreshes are getting slower every day.  New data is added to the source system daily, so each full refresh pulls one more day of history than the last and takes a little longer.

       

      I'm familiar with the concept in Tableau Desktop of appending data, but I haven't done such a thing for data sources on Server.

       

      The Hadoop data is something like this (obviously simplified for this example):

       

      record_date     field1     field2     field3

       

      I would like for the extract to be populated each day with any new data in Hadoop.  The pseudocode would be something like

       

      INSERT INTO TableauServerExtract SELECT * FROM HadoopTable WHERE record_date > (SELECT MAX(record_date) FROM TableauServerExtract)
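
To make the idea concrete, here is a minimal sketch of the append logic I have in mind. It uses Python's built-in sqlite3 purely as a stand-in for both Hadoop and the extract; the table names (hadoop_table, extract) and the function name are hypothetical, and this is not a claim about how Tableau Server does it internally.

```python
import sqlite3

def append_new_rows(conn):
    """Copy rows from hadoop_table that are newer than the extract's latest record_date."""
    cur = conn.cursor()
    # High-water mark: the newest date already in the extract (None if it is empty).
    (high_water,) = cur.execute("SELECT MAX(record_date) FROM extract").fetchone()
    if high_water is None:
        # Empty extract: this is effectively the initial full load.
        cur.execute("INSERT INTO extract SELECT * FROM hadoop_table")
    else:
        cur.execute(
            "INSERT INTO extract SELECT * FROM hadoop_table WHERE record_date > ?",
            (high_water,),
        )
    inserted = cur.rowcount  # number of rows appended by this run
    conn.commit()
    return inserted

# Hypothetical stand-in tables with the simplified columns from the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hadoop_table (record_date TEXT, field1 TEXT, field2 TEXT, field3 TEXT);
CREATE TABLE extract      (record_date TEXT, field1 TEXT, field2 TEXT, field3 TEXT);
INSERT INTO hadoop_table VALUES ('2017-10-18', 'a', 'b', 'c'), ('2017-10-19', 'd', 'e', 'f');
""")

first_run = append_new_rows(conn)   # extract is empty, so everything is loaded
conn.execute("INSERT INTO hadoop_table VALUES ('2017-10-20', 'g', 'h', 'i')")
second_run = append_new_rows(conn)  # only the new day is appended
```

One thing I notice with the strict "greater than" comparison: rows that arrive late for a date already in the extract would be skipped, which is part of why I think a periodic full refresh is still needed.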

       

      It would probably also be a good idea to do a full refresh monthly or something. 

       

      I found this, but it doesn't give much detail.  I'm loosely familiar with tabcmd, so I can look into it if that's of use, but I would prefer to avoid it.  Can someone point me in the right direction please?

       

      Thanks!