Same thing here. unoptimised = 900MB, optimised = 1100MB...
By optimize, I think Tableau primarily means optimizing for speed, not for size.
One way it does this is to store calculated fields as computed values in the extract. For example, if you have a CASE statement that maps one dimension to a category group, this calculated dimension can be stored in the extract as an additional column. ... So by optimizing, in most cases you're adding data to the extract. ...
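To illustrate the idea of materializing a calculated field, here is a minimal Python sketch (not Tableau's actual engine; the column names and mapping are hypothetical): a CASE-style calculation is computed once and stored as an extra column, so it doesn't have to be recomputed on every query.

```python
# Hypothetical rows from an extract; "region" stands in for any dimension.
rows = [
    {"region": "North"},
    {"region": "South"},
    {"region": "East"},
]

# The CASE-statement logic, e.g.:
# CASE [region] WHEN 'North' THEN 'Cold' WHEN 'South' THEN 'Warm' ELSE 'Other' END
def category(region):
    return {"North": "Cold", "South": "Warm"}.get(region, "Other")

# "Optimizing" amounts to computing the result once and storing it
# alongside the raw data as an additional column.
for row in rows:
    row["region_group"] = category(row["region"])

print(rows[0]["region_group"])  # "Cold" — read back, not recomputed
```

This is also why an optimized extract can be *larger* than the unoptimized one: each materialized calculation adds a column's worth of data.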
Jim Thanks I just wanted to say the same thing after I found this:
"When working with extracts you can optimize the extract. Optimizing moves the calculated fields you've created into the extract so that they don't have to be computed locally every time you use them."
Jim maybe you can help me with this one:
I have one extract that is 4x bigger than the source database. It also imported around 17,000,000 records while my source database has at most 1,500,000. I don't understand how this is possible. I also removed my extract and created it again, with the same result.
Do you have any idea?
Is it really importing 10x the number of rows, or is that just the number you see in the import dialog? I don't usually pay close attention, but I want to say that the message displayed when importing is not always 100% correct.
If you drop the # of Records field onto an empty worksheet, it should match the number of rows in the database. Otherwise you have a different problem; Tableau shouldn't be adding rows without some kind of SQL direction.
4x seems like quite a jump after optimizing, but you have 1.5 million rows, so if you're adding a bunch of calculated fields that can be materialized in the extract, I suppose it's possible.
I did not optimize the extract, but if I open the extract it includes my calculated fields. I don't know how this is possible.
And it's the number of records I saw in the import dialog. But if I put the # of Records field on an empty worksheet like you mentioned, the output is also around 17,000,000 records.
I use 13 tables that each have 100,000 records. Besides that, I have 16 calculated fields.
I make reports all the time with the same kind of data source. Other extracts are at most around 30 MB. I really do not know what is going on...
I have removed my extract and also my calculated fields. Now I am going to extract the data again and see what the outcome is. I will let you know.
Without my calculated fields the extract again has 17,000,000 records...
I use 13 tables with at most 100,000 records in each table.
The extract is over 1 GB while my source database is less than 300 MB (and I also use only a few tables from it).
Really confused. Do you have any idea?
Thanks in advance
Jim I found the problem.
I had a wrong join. The join was on a value instead of a dimension, and that caused the 17,000,000 records.
Thanks for thinking with me.
If you're doing a join in Tableau, that might be the first place to start.
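A join on a non-unique "value" column fans rows out multiplicatively, which is exactly how 1.5 million source rows can become 17 million in an extract. A minimal Python sketch (table and column names are hypothetical):

```python
# Two small hypothetical tables that share a low-cardinality column.
sales = [{"day": 1, "status": "OK"}, {"day": 2, "status": "OK"},
         {"day": 3, "status": "OK"}]
costs = [{"week": 1, "status": "OK"}, {"week": 2, "status": "OK"},
         {"week": 3, "status": "OK"}, {"week": 4, "status": "OK"}]

# Joining on "status" — a value shared by every row — matches every
# sales row to every costs row: 3 x 4 = 12 rows instead of a handful.
joined = [{**s, **c} for s in sales for c in costs
          if s["status"] == c["status"]]
print(len(joined))  # 12
```

With 13 tables of 100,000 rows each, even a partial fan-out like this reaches tens of millions of rows quickly, so checking join keys is a good first diagnostic.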
You can also see the underlying queries Tableau issues to the database by looking at the log files.
Glad you found it. ...