When you set up your incremental refresh it only brings in New data. Will not refresh any older data that may have been changed.
When setting up this incremental refresh you can define how new data is defined either by a field - order number or a date or other defined field.
You may want to add a incremental refresh to bring in new data on a more regular basis and a full refresh on a weekly monthly basis to refresh data so any changes in older data are synchronised
Let me know if you have any further questions
how to create multiple .tde files(extracts) for single data source.?
We have data source containing data of last decade and we want to save those data year wise. So is there any option or way by which we can create multiple .tde files(extracts) on same data source with each extract contain data of 1 year.. and even if we managed to do so then can we interlinked those extracts so that we can put queries on multiple extracts simultaneously. And how to do that?
1 of 1 people found this helpful
That's a tough question.
You can create multiple extracts on a single data source but it would not be an automated process.
You will need to (in Tableau Desktop) change data extract filters each time you refresh datasource and publish
As to linking these what you will need to do is a union. Currenlty if I can remember correctly Unions only work on Flat files csv excel etc.
I do remember reading something about introduction of cross datasource joins within tableau 10 beta that might allow doing unions across other data not only flat files.
but this is a long shot and weather Tableau 10 will solve your issue when It comes out of Beta can not be answered yet.
From what I understand your data source is quite large and what you would like to do is limit the amount of data extracted.
Is this because tableau desktop is crashing when doing a full data refresh or because users don't really need all the data and you would like to improve performance of the report?
If desktop is crashing you will need to create a empty extract publish to server and refresh the extract on the server
If this is to limit data in report you may want to look at using parameters within a custom SQL query pulling your data. This way the user can define how many years of data they would like to analyse.
Hope this helps.
Please mark any answers that do as helpful.
Hello Khalid norat,
Thanks a lot. actually I am newbie in tableau so I came across such dynamic and complex scenario. we are working with tableau desktop.
How to take union of two tables because I was searching about data blending then it showed me regarding common column and all. So is it regarding joining two tables..?
but we want union of tables/extract(what we want is that we want to create extract of data of every year like 2013,14,15 etc which we would keep seprate and only for current year's data we will take complete refresh. but in case if we would need to combine those extracts then we should be able to combine those extracts or we should be able to put query on those extract
for eg. if we need to consider data of 2014 and 2015 then we should be able to combine those extracts).
So I am getting bit confused about how to create design structure.
If you are not using the data for the previous years for reporting than what is the need of a data extract.
If you are only reporting on current year data and possibly require additional years intermittently then the parameters option in custom sql is your best option.
If however you still require all those extracts there is no current way of doing a union of all those data extracts and therefore your requirement to merge these data sources will not be resolved.
What are the actual requirements that you are trying to work towards and why are these the requirements.
I think you may need to have a discussion with the stake holders to understand what they exactly need and to make them understand it is not required to keep extracts of data on tableau server if they will rarely be used.
I am finding it difficult to understand what and why you are trying to achieve. Please can you elaborate and I will try and advise you the best way possible.
OK.. Let me tell you whole scenario.
we have data of last 10-15 years on which we are suppose to run our reports and for which we are making use of tableau.
we have to consider historical data(If in case customer wants to see data of last 10 years).
But each time if we load whole table with historical as well as current data( having 1 cr records at present and this size it will keep on increasing in future) then it will take so much time to load data and performance may get compromise. And refresh time will get increase as data volume increase.
so we were thinking to maintain historical data separately in some data source(say d1 containing data for year 1998 to 2013) because it won't need much in current processing.
and data of current year in separate data source(say d2 containing data for year 2014 to 2016 ). we will keep incremental refresh on d2 and also full refresh on d2 on weekly or monthly basis in order to get updated records also.
On dashboard one date filter is suppose to be provided so that client can have look over data for variable date range.
But if client wants to run graph on data of "last 10 years" as well as "current year data" then we will have to merge both data sources(d1 and d2) so we would be requiring union of two data sources d1 and d2 and then we will have to compute query depending on date range.
I searched on internet and I came across concept named data blending which works on join functionality where in our case we would require union functionality.
So this is the scenario I wanted to convey but maybe I could not able to.
2 of 2 people found this helpful
I better understand your situation now.
There are may possible ways to approach this.
1 thing I would try is having 2 data sources. both ranging the full time span.
1st one should be an aggregated version of the data. Something that can answer the high level KPI's Metrics.
This is normally what most stakeholders need to see.
The second should be a row level data source.
The report should be set up so the row level data source is only used when the user has exhausted all other drill down and the only drill down left is Row Level data.
If you are having trouble pulling the data into an extract you may want to look at the following link which explains how to create empty extracts and let the server do the work of refreshing the data source rather than tableau desktop
Let me know how you get on and if there is any thing else I could assist in.
Please mark any answers you find useful as helpful.
yes.... thanks Khalid... I will try this and will get back to you as soon as possible with hopefully positive results...
your suggestions were really very helpful. I again have one query. I have data source with 15 columns and each column contains some null values and I want to replace those null values with blank(" ") . I have searched on internet and I found solution which was saying that create calculated field and in that replace null values with blank. but this was regarding single field.
I want to replace null values of 15 columns with blank in one attempt . so is there any way to do so..?
See this idea on tableau. currently this is not possible but has been suggested.
Although you do not need to create calculated fields for all columns you could get away with just changing the aliases
I am in the exact same situation and really need help . I ll explain my context below :
We have a very big published data source ( billions of rows in the original table and the size of the tde is over 30G) living in tableau server.
The problematic is that the source table can get either updated rows or insrted rows, so and incremental refresh wont work.
A full refersh wont work either because of the size ( timout on tableau server even after resizing the window to over 7200 sec , and tableau desktop crashes).
What is worth mentioning here , is that the original table has over 85% of it as historical data and will never ever change.
The idea i got is to buils two tables each with historical and non historical data and then incrementally refersh the first and fully refresh the second.
My problem is how to join these two published data sources ? I need both in the report and need to get daily update
Hope this is clear !
MANNNNY thanks in advance for your precious time !
one option may be to follow guidance provided within this blog. It was written a while ago, but is still applicable. Let me know if you want to dive deep into this as I have further developed this, but haven't blogged about it yet.
The high-level concept is:
1. Keep a static immutable copy of the data up through a certain date (i.e. 90 days ago). Set it up to have the "incremental extract" option
2. As a starting point, have a script that each night does a "tabcmd get" then does a "tabcmd publish" to an incremental version of the extract
3. Perform an incremental refresh of the incremental version. This permits refreshing the data with a starting point of 90+ days ago
4. Point your dashboards at the incremental version
First, thanks soo much for your quick reply ! But I am sorry , i am not sure i got your instrcutions ??
I woudl be very happy to dive deep into this as i spent now a loong time on it and sounds liek i really need help
But first, do you think teh way i was thinking to get it done is not possible ? I mean , there is no way to combine two published data sources into one third published data sources that gets updates after each of teh two are is updated ? ( one incrementally ( the immutable) and the second fully ( the mutable rows, before they go to their final state and become immutable)
there is no way to dynamically combine published extracts together at run time.
Here's a screenshot example of the published datasources.
The -immutable version has a date lag of 90 days (i.e. October 1). So it has 120 million rows. And it's used as a starting point for the incremental refresh, so that the refresh starts as of October 1 forward. In order to have "-immutable" as the starting point (for the refresh), we have a script that takes a copy of the "-immutable" and publishes with overwrite to "-incremental".
The dashboard points at the "-incremental" version.