Incremental is indeed incremental, and great if you're connecting to transaction data.
Are you connecting using Tableau's data designer interface, or a custom SQL query?
Edit to my above post - all of the research I've done has pointed me towards incremental is NOT indeed incremental from the database perspective. As I read;
- Tableau sends the entire query to the database
- Tableau then uses your date field to select rows from the results it queried from the data source
- Tableau then unions your existing data with your incremental results
- Tableau loads new unioned .tde into server
From the database perspective, you likely won't see an increase in performance. From Tableau's perspective, you don't have to regenerate and load an entire extract file for every refresh.
Hmm... that's frustrating. I can fully understand the technical limitations in not doing it the other way around. Is there a way to perhaps use a parameter to help "guide" Tableau? Parameters are allowed in queries within a workbook, but what about parameters for published Server data sources?
I guess my best advice would be to stage the data.
Create a table, set up a task scheduler to drop and re-run the table daily. Each day, Tableau would hit this table, see the date field with fresh dates, and import the entire table. That would make it incremental from both the database perspective and Tableau's perspective. To load the entire table first, change your date parameter from X past date to today, and then set up the task scheduler to load yesterday's data in every day.
I was confused on how incremental refreshes actually work. I have a 50M row Tableau published extract sitting on the Tableau server and want to incremental refresh from a SQL server based on an added datetime field. It seems the process is refreshing, buts its taking ~16 minutes regardless of how many "new" rows I have. If I have 100 or 50,000, about same runtime.
I think as you note, its running the entire query then determining what to "add". This seems counter intuitive based on the configuration. One would think Tableau would send the max value to the database since it asked us to configure the incremental column, it can do this with parameters. It's also not scalable, as my source SQL table grows, its going to take longer and longer to perform an incremental refresh.
Your idea might be the only way to solve the issue. Basically only output "new" rows of data to my SQL table that I want Tableau to refresh from.
I was considering using an append API option with TABCMD to my existing Tableau extract. Any thoughts on this option?
Does any one where to access the "extract history" window mentioned in the below link? Is it via Desktop or server, I couldn't see row counts anywhere.
Any Tableau developers want to add insight into extracts?