5 Replies Latest reply on Jun 30, 2011 1:48 PM by Richard Leeke

    Pull data out of Tableau Data Engine

    Joe Mako

      I really like the Tableau data engine, and it is only getting better with each release.

       

      Where do you stand on the ability to pull data out of the Tableau Data Engine?

       

      One of the reasons why I avoid some data warehouses is that they do not allow other applications to get at the data once it is loaded. You could see this as an advantage for the company: the data is locked in, and you have to use their tools to analyze the data or do any data transformations. Another viewpoint is that this is not middleware, and that there is no reason to connect other applications to data once it is in Tableau; those other applications should connect to the data that Tableau connected to.

       

      Prior to 6.1 this was a situation I saw no real issue with, but now with incremental loading http://www.tableausoftware.com/about/blog/2011/06/can-data-ever-be-fast-fresh-enough-we-think-not-11797 , the Tableau Data Engine is one step closer to becoming like a data warehouse, and a .tde is no longer just a snapshot of some other data source. It could be said that the .tde could provide some of the benefits of a data warehouse for those who don't have a data warehouse.

       

      I think incremental loading is a great feature, but what happens when you want to take the data out of the .tde files?

       

      Will it always be best to keep copies of the data both in its original form and as a .tde, because if you load the data into a .tde and then delete the original data, you cannot easily go back?

       

      From what I understand, you cannot write SQL to query the Tableau Data Engine, but I think it would be nice if there was a way to extract all data from a .tde file in one step.

        • 1. Re: Pull data out of Tableau Data Engine
          Richard Leeke

          I can see arguments both ways here, Joe.

           

          I certainly know what you mean - there are circumstances where it would be very useful to be able to access the data in a TDE file in ways that Tableau doesn't currently allow.

           

          But I can also see that opening up access in that way could change the way that extracts would be used and that could potentially change the design constraints.  One of the things that contributes significantly to extract speed is that the designers were very clear about the nature of use of a TDE file (things like no update capability, so no need for locking or other concurrency control).  Likewise, there is no real need to worry about database recovery - and if a TDE were to become corrupt it can always just be regenerated.

           

          So I can imagine that changing the usage profile by making it become the definitive data store, rather than a derived copy, could change some of those original design constraints.

           

          My gut-feel is that the advantages of keeping it simple by maintaining the current closed access probably outweigh the advantages of opening it up.  And by the way, I was lobbying hard for opening it up throughout the technology preview phase for the data engine last year - but the wiser Tableau heads prevailed.

          • 2. Re: Pull data out of Tableau Data Engine
            Joe Mako

            Many applications keep data that is loaded into their application in a specialized format that is optimized for their purposes; with Tableau I see that as a .tde file. I agree, and I think it is great that the .tde file is limited in what it can do so that its core purpose can be more capable: no writing your own queries against it, no editing values, and so on. These limits enable the read-only style to be faster and more focused.

             

            With the previous styles of .tde files, if the data changed, you had to reload the entire data source. I did not see the limitation of not being able to easily export from a .tde file as an issue, because you had to keep the original data if you wanted to append to the .tde.

             

            Now with incremental loading you do not need to keep your original data to append to a .tde file. I no longer see the creation of a .tde file as just a copy of the data; I now see it as the generation of a data store where that combination of records can be unique to that file.

             

            Because I have to deal with data that is generated in different applications, I need software like http://www.stattransfer.com/stattransfer/formats.html that converts between different file formats, and ETL software with import steps for each vendor's data store type. I see that level of sophistication as more than what I am asking for here. Because a .tde file is focused on read-only use, I think it would be a nice addition if Tableau allowed for the mass export of the data in a .tde file.

             

            This way I can use Tableau to create an incrementally loaded data store, use Tableau to analyze the data, and then export the underlying data for use somewhere else.

             

            The current features in Tableau work well for getting at the underlying data for thousands of rows at a time, but there is currently not an easy way to export all data when there is a larger amount stored in a .tde file.

             

            Where does Tableau stand on enabling users to access their data once it is loaded into a .tde file?

            • 3. Re: Pull data out of Tableau Data Engine
              Dimitri.B

              I am not into data warehousing that much, but here are my thoughts on the subject:

               

              - One of the main reasons we picked Tableau out of many others is the fact that it doesn't come attached to an "industrial strength" data warehouse. This means that we (business users) can keep IT happy because we are not intruding on their turf - IT does the data, we connect to it with our front end tools. A few other worthy solutions were rejected because they involved managing a DW, which means involving IT (in most corporate environments) - a definite 'no go' for us. I personally like the lightweight nature of Tableau and the fact that the datastore is hidden away from the user and I don't have to worry about it.

               

              - I think the main purpose of a Tableau extract is to serve the front end, so Tableau developers have the freedom to do crazy things in the back end (no, I didn't mean what you just thought), and not worry about exposing a UI to users, providing an API, etc. - in other words, maintaining all the "best practice DW management" functionality that Richard mentioned above. If Tableau does go that way and offer their own DW, I just hope it will be unbundled from the front end (like Hyperion and Essbase, for example).

               

              On the other hand, if it is no extra effort for Tableau to allow advanced users access to its built-in data store, and it is made clear that it is not a DW (like tabcmd and tabadmin - they are documented but not for an average user) - then why not?

              • 4. Re: Pull data out of Tableau Data Engine
                James Baker

                I think... that the full implications of making "a data store where that combination of records can be unique to that file" are still percolating around here.  My personal opinion is that we don't *want* to become the "data store of record," and that extracts are still more a specialized, optimized *cache* than any sort of data warehouse.  It's dangerous, then, to start treating your cache like a substitute for or replacement of your store.  But with a persistent cache, I could see needs where emergencies happen and yes, it'd be nice to recover your data from your cache.  And incremental "refresh" really can function like incremental "addition", which is where I think trouble starts brewing.

                 

                All your thoughts here are appreciated and heard, and I hope that we start thinking inside of Tableau about tooling in *both* directions.

                • 5. Re: Pull data out of Tableau Data Engine
                  Richard Leeke

                  Well put, James, I think you captured the dilemma very well.