I can see arguments both ways here, Joe.
I certainly know what you mean - there are circumstances where it would be very useful to be able to access the data in a TDE file in ways that Tableau doesn't currently allow.
But I can also see that opening up access in that way could change how extracts are used, and that could potentially change the design constraints. One of the things that contributes significantly to extract speed is that the designers were very clear about how a TDE file would be used (things like no update capability, so no need for locking or other concurrency control). Likewise, there is no real need to worry about database recovery - if a TDE were to become corrupt, it could always just be regenerated.
So I can imagine that changing the usage profile by making it become the definitive data store, rather than a derived copy, could change some of those original design constraints.
My gut feel is that the advantages of keeping it simple by maintaining the current closed access probably outweigh the advantages of opening it up. And by the way, I was lobbying hard for opening it up throughout the technology preview phase for the data engine last year - but the wiser Tableau heads prevailed.
Many applications keep the data that is loaded into them in a specialized format that is optimized for their purposes; with Tableau, I see that as the .tde file. I agree, and I think it is great that the .tde file is limited in what it can do so that its core purpose can be more capable: not writing your own queries against it, not editing values, and so on allow the read-only style to be faster and more focused.
With the previous styles of .tde files, if the data changed, you had to reload the entire data source. I did not see the limitation of not being able to easily export from a .tde file as an issue, because you had to keep the original data if you wanted to append to the .tde.
Now, with incremental loading, you do not need to keep your original data to append to a .tde file. I no longer see the creation of a .tde file as just a copy of the data; I now see it as the generation of a data store where that combination of records can be unique to that file.
Because I have to deal with data that is generated in different applications, I need software like http://www.stattransfer.com/stattransfer/formats.html that converts between different file formats, and ETL software with import steps for each vendor's data store type. That level of sophistication is more than I am asking for, because a .tde file is focused on read-only use. But I think it would be a nice addition if Tableau allowed for the mass export of the data in a .tde file.
This way I can use Tableau to create an incrementally loaded data store, use Tableau to analyze the data, and then export the underlying data for use somewhere else.
The current features in Tableau work well for getting at the underlying data thousands of rows at a time, but there is currently no easy way to export all of the data when a larger amount is stored in a .tde file.
Where does Tableau stand on enabling users to access their data once it is loaded into a .tde file?
I am not into data warehousing that much, but here are my thoughts on the subject:
- One of the main reasons we picked Tableau out of many others is the fact that it doesn't come attached to an "industrial strength" data warehouse. This means that we (business users) can keep IT happy because we are not intruding on their turf - IT does the data, we connect to it with our front end tools. A few other worthy solutions were rejected because they involved managing a DW, which means involving IT (in most corporate environments) - a definite 'no go' for us. I personally like the lightweight nature of Tableau and the fact that the datastore is hidden away from the user and I don't have to worry about it.
- I think the main purpose of a Tableau extract is to serve the front end, so Tableau developers have the freedom to do crazy things in the back end (no, I didn't mean what you just thought), and not worry about exposing a UI to users, providing an API, etc. - in other words, maintaining all the "best practice DW management" functionality that Richard mentioned above. If Tableau does go that way and offer their own DW, I just hope it will be unbundled from the front end (like Hyperion and Essbase, for example).
On the other hand, if it is no extra effort for Tableau to allow advanced users access to its built-in data store, and it is made clear that it is not a DW (like tabcmd and tabadmin - they are documented, but not for the average user) - then why not?
I think... that the full implications of making "a data store where that combination of records can be unique to that file" are still percolating around here. My personal opinion is that we don't *want* to become the "data store of record," and that extracts are still more a specialized, optimized *cache* than any sort of data warehouse. It's dangerous, then, to start treating your cache like a substitute for or replacement of your store. But with a persistent cache, I could see cases where emergencies happen and yes, it'd be nice to recover your data from your cache. And incremental "refresh" really can function like incremental "addition", which is where I think trouble starts brewing.
All your thoughts here are appreciated and heard, and I hope that we start thinking inside of Tableau about tooling in *both* directions.
Well put, James, I think you captured the dilemma very well.