
    SQL Extract Management

    William Bohn

      Hi Everyone,


      I have been having trouble settling on best practices for data source management and am hoping someone has ideas to help me improve the process.


      Background: I have a Tableau Desktop license, I connect to our cloud data through a Microsoft SQL Server query, and I want the data source hosted on Tableau Online. I have successfully created and published a live connection, but the latency on that live connection is a problem. My data set is extremely large: roughly 120k rows x 50 or so columns per day over the last three years. My goal is to create an extract that refreshes once a day (at night) and only updates incrementally, to minimize the resource impact.
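
      To put the size in perspective, here is the back-of-envelope math, plus a rough Python sketch of the kind of custom SQL I'm imagining for the connection. The table and column names are made up for illustration, not our real schema:

          # Back-of-envelope size check, using the figures above
          # (~120k rows/day over ~3 years).
          rows_per_day = 120_000
          days = 3 * 365
          print(f"~{rows_per_day * days:,} rows x ~50 columns")
          # -> ~131,400,000 rows, which is why the live connection struggles

          # Placeholder custom SQL for the data source. Tableau's incremental
          # refresh keys off a column that only ever increases (a date or an
          # identity key), so the query should expose one explicitly.
          # dbo.fact_table and event_date are hypothetical names.
          CUSTOM_SQL = """
          SELECT *  -- ideally trimmed to just the columns actually needed
          FROM dbo.fact_table
          WHERE event_date >= '2014-01-01'
          """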


      Problem: Creating the extract within Tableau is prohibitively time-consuming with my computer's resources - I ran an extract from January 1st, 2016 through yesterday and it took a couple of hours. To add insult to injury, I was unable to get the extract uploaded to Tableau Online because the upload timed out after 2 hours due to the file size. I tested other extracts with less data and was able to get them published and refreshed to the current date - my big question here is how can I backfill the data? I assume there is no way to get around having to create the extract on my computer... My plan would be to create a smaller extract (say, the past week), upload that to Tableau Online, and then run a full refresh on the server to fill in dates going back to 1/1/14 - there's a sketch of the idea below. Once that is established, I'm pretty confident I can set up the refresh to append the new data every night. I'm just getting hung up on how to get the historical data into the extract...
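
      In case it helps frame the question, here is roughly what I'm picturing, sketched with the Tableau Server Client Python library (pip install tableauserverclient). This assumes the published connection's query spans the full range back to 1/1/14 (so only the seed extract is small, not the query itself), and I haven't verified that server-side refresh is available on my Tableau Online site; credentials, project, data source, and file names are all placeholders:

          import tableauserverclient as TSC

          # Placeholder credentials and site for Tableau Online.
          auth = TSC.TableauAuth('user@example.com', 'password', site_id='mysite')
          server = TSC.Server('https://online.tableau.com', use_server_version=True)

          with server.auth.sign_in(auth):
              # Find the target project (placeholder name).
              all_projects, _ = server.projects.get()
              project = next(p for p in all_projects if p.name == 'Default')

              # 1) Publish the small seed extract (e.g., the past week only),
              #    keeping the upload well under the two-hour timeout.
              item = TSC.DatasourceItem(project.id, name='DailySalesExtract')
              item = server.datasources.publish(item, 'seed_extract.tde',
                                                TSC.Server.PublishMode.Overwrite)

              # 2) Trigger the refresh server-side, so the 2014-onward backfill
              #    runs on Tableau's hardware instead of my desktop. (Needs a
              #    TSC / REST API version that supports datasources.refresh.)
              job = server.datasources.refresh(item)
              print('Refresh job queued:', job.id)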


      Thank you in advance!


      -Will