1 2 3 Previous Next 31 Replies Latest reply on Mar 30, 2016 11:15 PM by Sasiaknth Reddy Go to original post
      • 15. Re: data extracts
        Iryna Schcur

        SQL SERVER

        • 16. Re: data extracts
          Shawn Wallwork


          • 17. Re: data extracts
            Jonathan Drummey

            I've been following this thread for a bit and wanted to clear some things up w/regards to what happens when. This is definitely a confusing subject, here's what I've worked out:


            Here are the major types of Tableau files that we work with in Desktop:


            • .twb - an XML file containing the metadata for a Tableau workbook that includes information about the connections to the data source(s), what worksheets and dashboards are in the workbook, calculated fields created in Tableau, default color assignments, Tableau parameters, etc.
            • .tde - A Tableau Data Extract that has the data from a data source stored in a columnar, highly compressed format.
            • .tds - A saved data source. This has the connection information, calculated fields, color assignments, etc. that we might find in a .twb, only stored as a separate file so it's re-usable.
            • .twbx - a zip file (with a different extension) that stores the .twb and all data sources (not all data sources can be exported into a .twbx, for example cubes).
            • .tdsx - a zip file containing a .tds and the contents of the data source. I know this is possible, I've never seen one actually used.


            When we open up a .twb file in Tableau Desktop, Tableau reads the connection information and connects to the data source(s). When we open up a .twbx file, Tableau first unzips the .twbx (which includes the .twb and the data) into the user's temp folder (the location depends on the OS and the user's profile) and then works with that in place.


            When we're extracting data, there are a few different scenarios depending on whether the Tableau workbook is a .twb or .twbx.

            1. If we have a .twb (or haven't saved a workbook yet), Tableau will prompt us to save the extract as a file.
            2. For a .twbx, if we use the "Import all data" / "Import some data" options when we make the connection, Tableau will prompt us to save the extract as a file. If we connect live then in the Data window right-click on the data source and choose Extract Data..., the extract is saved *inside* the .twbx.

            Whatever way, the workbook also includes the connection information back to the live data source so the extract can be refreshed.


            When we save a .twb, the .twb has the connection info but not the data, so all the data lives separately. When we save a .twbx, if there's a live connection then Tableau will attempt to save that data source *inside* the .twbx (not all data sources can be saved inside a .twbx, for example cube sources), which effectively makes the data source static.


            Therefore, when it comes to data connections, I have several classifications:

            • live connection to the data source
            • a static data source in a .twbx, created by saving a workbook with a live connection as a .twbx.
            • an extract that is made from a live connection to the data source. This gets the double-cylinder icon.
            • a "naked" extract - we can directly connect to a .tde file as a data source. This gets a single-cylinder icon and is a variation on the live connection, only it's really a static connection because the raw .tde doesn't have the information needed us to trigger a refresh. I don't really recommend this unless you are very sure of what you're doing, if the .tde changes that can case workbooks built on the naked extract to start throwing up errors, showing [Calculation_2987347315671] instead of your actual field names, etc. Use a .tds instead (a bit more about that below).
            • connection to a Tableau Server Published Data Source - this has four subtypes, the first are variations of three types mentioned above - a passthrough live connection to the data source, an extract made from a live connection, or a naked extract, and the fourth is a "static" connection to the data source (which is what we get from a  live connection to a data source when we leave the "Include External Files" option to it's default of being checked, or make a connection to a static source where we Include External Files, there's a failure mode in this that I'll talk about further down in publishing).


            An aside here: One way to think of .tds files is that they "wrap" any of the above connection types in metadata to make the connections re-usable, so instead of making the connection to the raw Excel file, SQL Server, Tableau Saved Data Source, etc. we "connect" to the .tds.


            There's a wrinkle in the above that some users run into:


            - Make a workbook

            - Connect live to some data.

            - Save it as a .twbx. Tableau stores a copy of the data in the .twbx, thereby making it static.

            - Open up the .twbx.

            - Try to refresh the data, nothing happens. Grrr. Argh.


            Here's another wrinkle:

            - Make a workbook

            - Connect live to some data.

            - Save it as a .twbx. Tableau stores a copy of the data in the .twbx, thereby making it static.

            - Create an extract of the data. Tableau puts the extract *of the static data* in the .twbx.

            - Open up the .twbx.

            - Try to refresh the data, nothing happens. Grrr. Argh.


            Here's a way that avoids either wrinkle:

            - Connect live to some data.

            - Extract the data.

            - *Then* save the extract.

            - Try to refresh the extract, and it will refresh.


            If a user publishes one of the first two kinds of workbooks to Tableau Server, they might drive themselves crazy trying to figure out why their Tableau Server view won't update. (I spent a few hours on this when I first used Server. GRRR. ARGH.).


            Ok, let's talk about Tableau Server now:


            Whether the workbook is a .twb or .twbx, Tableau always publishes the metadata (the XML file part as well as connection information) to Tableau Server. Whether the data is included depends on the kind of connection:


            • When we publish a workbook to Tableau and the workbook has a live connection (including a connection to a naked extract), we get the "Include External Files" option. This default publishes a *copy* of the live connection into Tableau Server (more on that in a minute), thereby making it a static data source. When we turn that option off, then Tableau Server won't copy the data and will attempt to use that live connection. (Which is why we need to be careful about file paths (use UNC to be safe) and permissions).
            • If the workbook has a static data source (as in publishing a static file inside a .twbx), we will also get the "Include External Files" option. If we leave the default on, that static data is copied to Tableau Server. If we uncheck that option, Tableau Server won't copy the data and will attempt to use a live connection back to the source, only the source only lives inside the temp file created on your hard drive when Tableau unzipped the .twbx so Tableau Server is almost certainly not going to be able to connect to it and you'll get "An unexpected error occurred on the server." message after publishing.
            • When we publish a workbook to Tableau that uses extracts made from a live connection, Tableau publishes the extract into Tableau Server, and preserves the information about how to get back to the live connection so the extract can be refreshed. Plus we get the Scheduling & Authentication option. (So even if you've used a .twb and stored the .tde separately, once you've published it Tableau Server ignores the .tde, instead using Tableau Server's own internal copy).
            • A workbook using a Tableau Server Published Data Source connection is a special variation of a live connection that is always live and doesn't get the "Include External Files" option. (Even though the Tableau Server Published Data Source might be totally static).


            If we're using multiple data sources in a workbook, we'll see an appropriate combination of the above options when publishing the workbook. Wee!


            When we're creating a Tableau Saved Data Source, the process is the same as when we publish a .twb or .twbx.


            When I wrote "publish into Tableau Server", what happens is Tableau puts *all* the information into it's PostgreSQL database. The XML, the extract, and static files that were put in by "Include External Files", etc. Some people have gone into the PostgreSQL tables to do hacky-things, and that's way into undocumented/break your warranty territory. While a Tableau Server user is interacting with a workbook we can find some temp files with data, but those get blown away as soon as the interaction is over and Tableau Server does its cleanup.


            There's more that could be written about permissions, refreshes on Tableau Server, making and editing .tds files, and what happens when we download a workbook to edit or try to edit a Tableau Saved Data Source. I hope this answers the outstanding questions on this thread!



            14 of 14 people found this helpful
            • 18. Re: data extracts
              Matt Lutton

              This helps me confirm what I've been doing is appropriate for my use case, so I'm grateful for the clarity you've added here.  This will surely become a reference I come back to. Thanks Jonathan!

              • 19. Re: data extracts
                Shawn Wallwork

                By the way Jonathan thanks for yet another detailed explanation of something Tableau. I'm currently running through your various scenarios just to confirm all you've said (and to hopefully get them to stick in my memory). I'll let you know.





                • 20. Re: data extracts
                  Matt Lutton

                  I'll throw this resource out there as well, as I just discovered it and found it helpful -- it briefly mentions the TBM (Tableau Bookmark) file type as well, which is evidently also an unpackaged XML shell, like a TWB.  Not as much detail as Jonathan's awesome contribution above, but I figured someone might find it useful.



                  • 21. Re: data extracts
                    Shawn Wallwork

                    Matthew this is excellent! Thanks for sharing.



                    • 22. Re: data extracts
                      Irfan Mohammed

                      Hi Jonathan


                          Your post is really helpful. I would appreciate if you could confirm the approach that i am following as below


                      I have 5 Tableau workbooks using the same data source. I want to centralize this data source and schedule it to refresh the extract so that it would update data in all my 5 workbooks. currently i have created my workbooks based on local tdsx.


                      1.publish this extract based TDS to tableau server (tdsx)

                      2.open the workbook containing the local tds

                      3.import the tableau server tdsx

                      4.replace the local tdsx with tableau server tdsx

                      5.publish the tableau workbook (twb) to the server

                      6.schedule TDS to refresh the extract


                      in 1, do I need to publish only the TDS file? (or tde or tdsx) . I have understood that I need to publish tdsx file here.

                      in 5, while publishing the workbook, do I need to publish only the twb or twbx?


                      I want to make sure if this approach would centralize my data source and when i schedule it, it would update data in all my 5 workbooks.


                      one more question i have is, when i import my published tdsx file, it doesnt show the option 'use extract' checked. Not sure if it has saved the extract from the tdsx on the server and if its meant to be that way.



                      • 23. Re: data extracts
                        Jonathan Drummey

                        Hi Irfan, I've responded to your same question on the blog post I built out of this particular thread, see O Extract, Where Art Thou? | Drawing with Numbers.

                        1 of 1 people found this helpful
                        • 24. Re: data extracts
                          James Campbell

                          I've recently been working on some fairly large extracts which have had v slow refreshes so have been sharing TDEs between different workbooks and have noticed some "unexpected" (to me) behaviour.  Basically I can only access the edit data source screen from the workbook that originating the TDE.  I can refresh it from any TWB but not change its definition.


                          Feedback from Tableau tech support is this is "designed behaviour" which I'm so of fine with (maybe an advantage in protecting the TDE design in a desktop only setup) but annoyingly if you do try and edit the data source in a different TWB it doesnt throw an error or msgbox, it just does nothing basically.  It also seems to be an undocumented feature (I stand to be corrected on this bit, just I cant find it)

                          • 25. Re: data extracts
                            kushwant singh

                            Please advise :


                            I created a Dashboard using a TDE extract and hosted it in the server. This TDE file comes from alteryx and will come every month with new set of data. My dashboard has few small calculations.

                            Now when I changed the TDE file, the dashboard shows errors in the calculated field and the data doesnt refresh.


                            Please advise what should I do to refresh the data and to keep the calculated fields intact.



                            K singh

                            • 26. Re: data extracts
                              Justin Emerick

                              Hello K,


                              Now, I assume what you are doing is this.


                              You open a workbook that has a data source (which is an extract).

                              You then connect to the NEW extract in the workbook.

                              You switch the old data source with the new.


                              When switching data sources this way, you'll get errors in calculations. A couple things you can do to avoid this.


                              1. Replace the existing datasource: If the extract is on Tableau Server, published the new, refreshed extract with the same name as the old. If the data source is saved locally on your computer, find the directory and replace the new with the old.

                              2. Edit the XML: Tableau workbooks (.twb) files are just XML. Open the workbook with a text editor. You'll see tags for <datasource> in the document and you can perform a find and replace to switch out the old data source with the new.

                              (I highly recommend you make a backup before doing this!) If the file is saved as a packaged workbook (.twbx) - make sure you unzip the file first. (generally, I rename the .twbx extension to .zip and then use the Windows to "extract", this will create a new folder in the directory that has the .twb file and the .tde file.)


                              Hope that helps!



                              • 27. Re: data extracts
                                kushwant singh

                                Thanks a lot Justin.

                                Appreciate your help!

                                • 28. Re: data extracts
                                  Rawan Alsiekh



                                  Thank you so much for the explanation. I have a question though. I created a workbook that has multiple SQL extracts, and two connections to excel sheets (that are stored in a file where Tableau server can access). so when I published the workbook, I unchecked the "Include External Files", because I don't want a static copy of the excel sheet, I want it to connect live to it, and update automatically. The question now, how would this affect the other SQL extracts? will tableau be connecting to them live as well? I only want the workbook to be refreshed monthly, and I did set the refresh schedule to monthly, I checked it out, and I saw one of the SQL extracts has up to date data! can you please help me understanding why would this happen, what is it that I am doing wrong?


                                  again, I only want it to refresh monthly, and this is applied on the excel sheet connections as well...


                                  your help is highly appreciated..



                                  • 29. Re: data extracts
                                    Jonathan Drummey

                                    I just tried to duplicate what you described in v8.3 as best I can and I can't replicate the behavior. Here's what I did:


                                    0) started Tableau Desktop

                                    1) connected to a live Excel file & built a worksheet based off it.

                                    2) connected to a SQL Server source that has regularly updated data, created an extract, and built a worksheet.

                                    3) published the workbook to Tableau Server with Include External Files turned off.

                                    4) reviewed the worksheets on Tableau Server

                                    5) made a change to the Excel file and saved it.

                                    6) refreshed the Excel view on Tableau Server and saw the change, this is expected.

                                    7) waited for a change on the SQL Server

                                    8) refreshed the SQL Server extract view and saw no change, this is also expected because I was using the extract.

                                    9) started a new Tableau Desktop instance, downloaded the workbook published in step 3 to verify that the SQL Server connection used an extract, it did as expected.


                                    This is all the behavior that I've seen in Tableau since I started using Tableau Server. The only things that I can think of that would replicate the behavior you describe are:


                                    a) The database extract was refreshed at a time that you weren't expecting.

                                    b) You'd unclicked "Use Extract" on the database extract so it was actually a live connection.


                                    I'm also not clear on your last paragraph, it seems like you want the live connection to the Excel data to only update monthly. In that case you can create an extract of that Excel data as well then publish the workbook, however there are some subtleties, see http://drawingwithnumbers.artisart.org/o-extract-where-art-thou/ for details.