While working on to fine tune one of our Tableau extract performance we noticed that the Tableau backend query is actually first fetching the full data in to the memory and on top of it it is applying the Incremental date as a filter in the where clause. This is because you all know that When we write a Custom SQL Tableau will treat it as a sub-query and execute it in the database.
We have a huge dataset where it's taking almost 2 hours to complete the incremental data extract.explored the options to minimize this execution time. As part of that, We used Initial SQL and wrote the following query to set the date(hard coded) for which we need to extract the data.
Initial SQL Statement:
Now we used the date that is coming from the initial SQL in the custom SQL.
Custom SQL Query :
CAL_DIM.CAL_DIM_ID = AAA.SNP_DT_ID
--AND CAL_DIM.CAL_DT = to_date('20170125','YYYYMMDD')
AND AAA.SNP_DT_ID = to_number(to_char(to_date(USERENV('client_info'))-1, 'YYYYMMDD'))
This approach reduced the execution time from 2 hours to 2 mins.
Now the problem is : I need to tell Initial Sql/Custom SQL that what date to use for Incremental and what date to use for Full extract data refresh. Incremental atleast we can have something and can be implement. But, for full extract, we don't have a way to tell what date range it has to select to_number(to_char(to_date(USERENV('client_info'))-1, 'YYYYMMDD')) field.
For this we don't have an OOB option to pass the variables to Initial SQL.
I am sure Tableau will be saving the date or some info in to its memory which will be used to execute full or Incremental refresh when we schedule the extract refresh from tableau server.
The idea is to have these information available to Initial SQL or Custom SQL. SO, that we can use them and set these variables
which will improve the performance of our extracts drastically.
example: for incremental what Tableau use to pull the data. That information should be provided to this variable say AAAAA.
If i am running incremental extract on 2/15/2017 then the value of AAAAA should be 2/14/2017 (Business data will be available only till yesterday in the database)
for Full extract what Tableau use to pull the data. That information should be provided to this variable say AAAAA. If i am running full extract on 2/15 and my PROD data
is available from say Jan 1 2017. Then the value of AAAAA should be 01/1/2017 till 2/14/2017.
We can use this variable in our query and get it work.
if anyone is having workarounds for this idea which i might have overlooked, Please let me know.