1 2 Previous Next 19 Replies Latest reply on Mar 9, 2015 5:20 PM by Chris Gerrard

    Recommended practices for sharing/re-using Tableau Data Extracts (.tde's)

    Jonathan Drummey

      I'm working with someone who has several complex workbooks all based on the same underlying source. Putting everything in one workbook isn't an option due to performance limitations (when workbooks get too many worksheets/calcs, they get slow). The source is a Tableau Data Extract based on a SQL Server query, due to load reasons a live connection to the SQL Server is not feasible. Tableau Desktop is in use, eventually Tableau Server will be in use.


      There's one workbook that is the master where the TDE is refreshed. The other workbooks then connect to the TDE. There are a few calculated fields that come across as Calculation_#########…. that need to be regenerated in the child workbooks each time. I've been able to narrow down two cases, one where a calculated field is simple enough to be materialized in the extract, another where the definitions of the calculated fields are different between the master workbook and another workbook, but I think the latter might be a red herring. I can reliably re-create the first problem, I haven't been able to re-create the second.


      I'd like to get rid of the Calculation_########… fields showing up every time, probably by using a different process, and I'm not sure what to do, I'm hoping someone has a better idea.


      Here's what I've seen:


      - There's no way for calculated fields created in child workbooks to migrate back to the master, all changes seem to have to be in the master. (Trying to do this by replacing the data source causes some problems, like losing Aliases.)

      - Fields created in the master workbook only migrate to children when they are materialized fields, and then they come across with Calculation_######

      - When the master data source is used to create a Saved Data Source (.tds) all the fields will migrate from the master, but this is a one-time only thing.

      - Tableau Server Published Data Sources do not seem to lose any fields, but my work with them in v7 made me wary because of various cases where calculated fields had different behavior than live connections or Tableau Data Extracts. And in any case Tableau Server is not yet available here.


      Any thoughts?



        • 1. Re: Best practices for sharing/re-using Tableau Data Extracts (.tde's)
          Richard Leeke

          Hi Jonathan


          Yes, I've done battle with the infernal Calculation_######## fields a few times. I feel your pain.


          I think the root of the problem is that the state information defining the contents of the extract doesn't seem to be wholly contained in the extract - some of the state is held in the workbook where the extract was created.


          As an aside, this has been the cause of various sorts of corruptions that can occur because the workbook isn't automatically saved whenever an extract is created or optimised. (Or at least it wasn't last time I looked, possibly that hole might have been plugged with v8). That can cause problems if you update an extract and then exit Tableau without saving the workbook (which may be involuntary!).


          Anyway, back to your question. The most reliable process I have found is instead of just saving the extract from the master workbook and connecting to it from another workbook, export a sheet referencing the extract from the master and open the bookmark to create your second workbook. That brings the entire connection definition in to the secondary workbook. Then if the connection subsequently changes in the master, refresh the extract, save the bookmark again and reimport the bookmark into the secondary workbook. Accept the option to replace the definition of the connection. It should come in with all its calculated fields intact.


          This approach actually allows you to migrate connection changes in either direction - just as long as there aren't changes both sides that you need to merge.


          I don't think this is pretty - it needs careful attention and I'm sure you can get in quite a tangle if you get it wrong - but it does save having to recreate calculations from scratch. One thing you have to be wary of is that you can't refresh or optimise the extract while a secondary workbook has it open.



          2 of 2 people found this helpful
          • 2. Re: Best practices for sharing/re-using Tableau Data Extracts (.tde's)
            Russell Christopher

            It's been a while since I've played with this, but as I recall, you'll see alias/caption in the data source definition that essentially says Calculation_######### = 'Friendly Name Whatever". It sounds like this info isn't making it to "all" your workbooks.


            What happens if:


            1. You refresh your TDE
            2. You save the Data Source as a TDS
            3. You bring the new TDS into the "other" workbooks and use it to replace the original data source?


            That would ensure the extra metadata gets in there.

            2 of 2 people found this helpful
            • 3. Re: Best practices for sharing/re-using Tableau Data Extracts (.tde's)
              Jonathan Drummey

              Thanks for both of these suggestions, I'll try them out later today and get back to you on what I find!

              • 4. Re: Best practices for sharing/re-using Tableau Data Extracts (.tde's)
                Richard Leeke

                Russell's way is better - I'll be doing that in future. Can't believe I didn't think of it and took the round-about route with bookmarks...

                2 of 2 people found this helpful
                • 5. Re: Best practices for sharing/re-using Tableau Data Extracts (.tde's)
                  Jonathan Drummey

                  Russell and Richard, thanks for your help! Both your instructions work and have the same effect on the workbooks as far as I can tell. Here's what I was testing for:


                  - update extract w/new data

                  - create an alias

                  - change an alias (in parent and child workbooks)

                  - create a materializable calc field (in parent and child)

                  - create a non-materializable calc field (in parent and child)

                  - create a parameter (in parent and child)

                  - assign and change custom colors (in parent and child) (I'm presuming that Size and Shape assignments will work similarly)


                  What I found was that when using both the bookmark and the save-Tableau-data-source-and-replace-it techniques:


                  - Data is updated as expected.

                  - There are no more Calculation_######…. (they will temporarily exist with the save Tableau data source technique until the data source is fully replaced).

                  - Calculated fields, aliases, and color assignments that only exist in the child are not transferred back to the parent. The calculated fields can be copied, subject to Tableau's (non-functional) behaviors around copying nested calculated fields. Aliases would have to be rebuilt.

                  - Parameters that only exist in the parent are not transferred (but can easily be copied).


                  The difference between the two seems to be that the bookmark is a "TDS with a view." The one key to Russell's instructions is that Tableau doesn't indicate whether you are saving a TDS for the original connection or the TDE, how Tableau figures it out is whether the extract is active or not.


                  Here are some more instructions, for completing this thread and for my own documentation needs:


                  1. Get your parent workbook ready and generate the extract. Verify that the extract is active.

                  2. Save the data source (TDS).

                  3. Close the parent workbook (to make sure there aren't any changes that didn't make it into the TDS).

                  3. Open the other workbook.

                  4. Connect to the data source.


                  If there are changes to be made to calculated fields, color/size/shape assignments, or aliases, the changes could be made in the parent or the child. Either way, to propagate the changes the next steps are:


                  1. Save the data source (we'll call this TDS v2).

                  2. Close the current workbook.

                  3. Open the other workbook.

                  4. Connect to the data source. If you've saved the TDS v2 with the same name as the original, then Tableau will prompt you as to whether or not you want to immediately replace the data source. If you choose not to, then you can always replace the data source later. This can be helpful if you want to do some verification that there aren't any issues (for example, if someone had made changes to the data source that hadn't been propagated).


                  [edited 20130703 - simplified based on comments from Richard]

                  2 of 2 people found this helpful
                  • 6. Re: Best practices for sharing/re-using Tableau Data Extracts (.tde's)
                    Richard Leeke

                    When I tried Russell's method I imported the data source into the target using File->Open (not sure if that is what you did). At that point Tableau prompted to say that the data source name already existed and gave me the same three options as you get on opening a bookmark. I just chose to replace the datasource at that point, so I never ended up with 2 copies in the target workbook. Downside is you can't manually compare. Upside is you don't have to do another step to replace.

                    1 of 1 people found this helpful
                    • 7. Re: Best practices for sharing/re-using Tableau Data Extracts (.tde's)
                      Richard Leeke

                      And by the way, I just realised that I've posted several times on a thread called "Best Practices" without trotting out my anti Best Practice mantra. I must be going soft.


                      (For the uninitiated, I contend that the term "Best Practice" stifles innovation and creativity by implying that we have the 'best' answer already, rather than admitting that there can always be a 'better'. I'm quite happy with "Good Practice" and "Bad Practice" but "Best Practice" usually sets me off on a rant. Like this.)

                      1 of 1 people found this helpful
                      • 8. Re: Recommended practices for sharing/re-using Tableau Data Extracts (.tde's)
                        Jonathan Drummey

                        Ha! I know that rant, and have used it myself on occasion, and I think I was so focused on trying to get help that I let that slip by. I've changed the thread title to "Recommended practices…"


                        I figured out what was going on with the "extra" step, in my testing I'd been careful to give the saved data source a new name each time, so when it was brought in Tableau wasn't seeing that it was the "same" thing. If the TDS has the same name, then we get the prompt. I'll update my earlier post.


                        This also makes me think that a bookmark is a sort of "TDS with a view."



                        • 9. Re: Recommended practices for sharing/re-using Tableau Data Extracts (.tde's)
                          Richard Leeke

                          You are forgiven. Just don't let it happen again.

                          • 10. Re: Recommended practices for sharing/re-using Tableau Data Extracts (.tde's)
                            Shawn Wallwork

                            Usually it's a whole LOT more rant-y than that! You're going soft my friend.



                            • 11. Re: Recommended practices for sharing/re-using Tableau Data Extracts (.tde's)
                              Jay Chang

                              I'm sorry to bring up an old (and possibly painful) topic but I'm struggling to understand the interaction of a single extract with multiple workbooks.


                              Here's my situation.  I created a single tableau extract using some fairly involved SQL.  Then, following the steps in this excellent post, I created a number of customizations within T8 - groupings, calculated fields, aliases, etc.  My goal is to use this extract as a starting source for a number of different visualizations.  What I'm struggling to understand are the details of how the work I do in the various vizes impacts the extract.  Specifically:


                              • If I create a context filter in one workbook/viz, does that in any way impact any of the other workbooks/vizes that use the extract?  My testing seems to imply that I can do this but I am not 100% sure.
                              • If I create a data connection filter for one viz, will that impact the extract/other vizes?  Again, testing seems to imply that this is not an issue but it doesn't make sense to me why this is not impacting the extract itself.
                              • If I create an alias in one workbook, it does not seem to carry over into other workbooks but it also seems to cause the dreaded "Calculation_XXXX" error, seemingly at random.  I should not expect these workbook based aliases to ripple through to the source extract, correct?
                              • How do I save an extract under another name without having to re-create all the work that went into that original extract?  There doesn't seem to be any way to do this in Tableau.
                              • If I create additional calc fields/groupings in one viz using the extract, will the other vizes using that extract see these changes/additions?

                              At one point, I was doing the whole "create master workbook to manipulate the extract, save the extract as a TDS, close the master workbook, open the child workbook, replace the data connection" sequence.  But even then some of the changes I'd made to the master were not flowing through (default sort orders and changes to groupings being the most commonly missed).  And there were times when my calculations would disappear, replaced by the dreaded "Calculation_######" error.  So I kind of fell out of this rigor, and now I'm sort of paying the price.

                              If anyone can point me to a more detailed explanation of how Tableau handles data extracts, context filters, etc, I'd be most grateful.  This thread is the closest I've come to getting a coherent set of answers.

                              • 12. Re: Recommended practices for sharing/re-using Tableau Data Extracts (.tde's)
                                Jonathan Drummey

                                Hi Jay,


                                I understand your struggle, this thread came out of my own attempts to understand this part of Tableau better after having used extracts just like you and running into assorted issues.


                                - Context filters are particular to the worksheet, unless you make them scoped or global filters, in which case they will apply to multiple worksheets or the entire data source. There is an order of precedence for filters. Data source filters (from the data source context menu) are added to the WHERE clause of every query, so they effectively have the highest precedence. Context filters create a temporary table for the rest of the filters & view to be built against, so they have the next highest precedence. Other filters on dimensions and aggregates are applied after that, with table calc filters at the end. Note that because of the overhead in creating a context, context filters can slow down a view unless they are really reducing the amount of data that the data source & Tableau are using.


                                - I'm not sure what you mean by a "data connection filter". Do you mean a data source filter? If so, that affects every query for the data source, and by default is an option when generating an extract (though you can remove it from extract creation if you want).


                                - Aliases (and color assignments, etc.) go with the data source, not the extract. What this whole thread helped shape for me is that extracts are really more like a database table, Excel file, etc. The metadata goes with the data source, which lives in the Tableau workbook unless you export that via a saved data source or a Bookmark.


                                - The Calculation_###### shows up when you have a field that can be materialized in the extract. Once the extract is refreshed and that calc is materialized, it will appear with the Calculation_##### in the child workbook that is pointing directly at the extract. The way to avoid this is to use a Tableau saved data source. (And this reminds me to test this in v8.1, the new copy/paste calcs functionality might also get rid of this error since Tableau is smarter about the Calculation_#####).


                                My suggestion to get the results you want is to use a Tableau saved data source.



                                • 13. Re: Recommended practices for sharing/re-using Tableau Data Extracts (.tde's)
                                  Jay Chang

                                  Jonathan, thanks for your response.  Some additional clarifications, if you would not mind:


                                  • If I am planning to create a single Extract to use as a Data Source, should I use Data Source Filters?  Based on  your explanation, it seems that I should have the extract be as expansive as possible because various vizes might need different subsets of the data and the DSF will restrict all the data coming into the extract.  Or am I not understanding that correctly?
                                  • As a subset of this question, if I create a viz off of the saved Data Source and then create a DSF within this viz, when I refresh the extract from this viz, the DSF should apply in this case correct?
                                  • If I create a global Context filter, can I assume that Tableau will take the full set of data in the extract and then create a Temp table version with the selected context, against which all other viz queries will be based?  In such a case would there ever be a reason for the viz to return to the full (non-contexted) data extract?  Again, I'm guessing no.
                                  • If I follow your approach outlined above, would I do data extract refreshes from the workbook that I'm using to maintain the data source, or can I do it from any workbook that is using the saved data source?
                                  • 14. Re: Recommended practices for sharing/re-using Tableau Data Extracts (.tde's)
                                    Jonathan Drummey

                                    1. I don't know the size of your data, time to extract creation, etc. so I can't give you a definitive answer.


                                    2. Extract filters are separate from data source filters (though data source filters are by default added as extract filters when you first create an extract). I haven't worked with data source filters and saved data sources enough to say what would happen when under what circumstances. If you test this out, please post back with your findings!


                                    3. Yes, that is how context filters work. Depending on your data source and caching, Tableau may or may not need to go back to the data source in a given session.


                                    4. Given my lack of knowledge re: #2, I'm not sure how data source filters in different workbooks might interact with the extract refreshes. It's something you'd have to test out.



                                    1 2 Previous Next