If it's just you then you could publish a data source and set the permissions on it for you only. For example, the saved data source I have on our server is the connection to the QA Tableau Server views and the permissions on it are for the "Admin" group only; all other users who are not in the "Admin" group can't use the published data source.
You may want to play with Groups and setting permissions on published data sources. Just a thought. You're right that you don't want to extract the same data source multiple times with the same data.
Thanks Toby. I understand the permissions and how to use a published data source. My main question is about maintaining multiple workbooks that reference that data source. When I publish a data source, and connect to it, it shows up as a "live" connection, and I have the option to extract. I am thinking that I should NOT do that, since the data source itself was an extract when I published it to server. This is what is confusing me--do I just connect live to the published data source, and set my refresh schedule on the data source itself, and NOT set a refresh on an extract at the workbook level?
Hopefully my question makes sense. I'm also curious what kind of problems folks have run into when revising workbooks so that they connect to published data sources rather than embedded workbook extracts, and if anyone has suggestions on the best way to approach this?
Anyone with experience using this method?
Sorry Matt, I can't help but I'm interested in the answer
Matthew Lutton wrote:
... When I publish a data source, and connect to it, it shows up as a "live" connection, and I have the option to extract. I am thinking that I should NOT do that, since the data source itself was an extract when I published it to server. This is what is confusing me--do I just connect live to the published data source, and set my refresh schedule on the data source itself, and NOT set a refresh on an extract at the workbook level?
Wow, that IS an interesting question! I do not know but I think you are correct. I would assume the same thing.
Matthew Lutton wrote:
Hopefully my question makes sense. I'm also curious what kind of problems folks have run into when revising workbooks so that they cannot to published data sources, and if anyone has suggestions on the best way to approach this?
"...cannot to published..." sentence doesn't make sense.
Thanks Toby. I meant "connect" in place of "cannot". Fixed. I'm hoping Russell Christopher or Robert Morton, or someone else from Tableau can chime in here. The documentation on this is unclear (to me, at least), and like most things, seems to focus on very small, flat file data sources.
When I have experimented with publishing a data source, connecting an existing workbook to it, and replacing the original data source, the workbook runs much slower (because it's no longer an extract) AND my filters have to be rebuilt, which is a huge annoyance (they're still there, but they default to multi-value lists).
My scheduled refreshes currently are mostly under 10 seconds, so even with two workbooks with identical extracts, we aren't talking huge performance issues. I'm more concerned about the long term impact of creating multiple workbooks with the same data source as an extract published with the workbook. What is best practice here?
I had planned to bring this issue to the conference, but I've had a bit of spare time to re-work some things so I wanted to experiment with this a bit before then.
When you publish a data source to Tableau Server any associated files or extracts will be included. The resulting data source is known as a Data Server data source. In Tableau Desktop, a Data Server data source appears as a live connection for a good reason: you must remain connected to Tableau Server to continue sending queries through the Data Server data source. In fact, you can even create a local extract in Tableau Desktop in case you need offline access; the usual options for creating an extract still exist, including filtering, pre-aggregating, etc. This is entirely separate from any extract that powers the remote data source on Tableau Server.
The single extract on Tableau Server for a given Data Server data source will be shared between all connections that rely on that Data Server data source. You can configure a refresh schedule for that extract as you would for any extracts residing in a published workbook.
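To put the sharing described above in concrete terms, here is a minimal Python sketch that only counts scheduled refresh jobs under each approach. It is an illustration of the bookkeeping, not Tableau's API: the `Workbook` class, the `scheduled_refresh_jobs` function, and the example names ("wb0", "sales") are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workbook:
    name: str
    # Name of the published data source this workbook connects to, or None
    # if the workbook carries its own embedded extract (hypothetical model).
    published_source: Optional[str] = None


def scheduled_refresh_jobs(workbooks):
    """Count the extract refreshes needed for one schedule run."""
    # Every embedded extract needs its own refresh.
    embedded = sum(1 for wb in workbooks if wb.published_source is None)
    # A published data source is refreshed once, however many workbooks use it.
    shared = {wb.published_source for wb in workbooks
              if wb.published_source is not None}
    return embedded + len(shared)


embedded_only = [Workbook(f"wb{i}") for i in range(3)]
shared_source = [Workbook(f"wb{i}", published_source="sales") for i in range(3)]

print(scheduled_refresh_jobs(embedded_only))  # 3
print(scheduled_refresh_jobs(shared_source))  # 1
```

The count says nothing about query speed; it only shows that the refresh schedule moves from each workbook to the one published data source.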
One of the current limitations of the Data Server is that it does not support the concept of a temporary table, which the Tableau query compiler prefers to utilize for very large or complex Filters, Sets, Ad-hoc Groups, and many Data Blending operations. Without temporary table support, Tableau must generate complex queries in the Data Server query language that can be expensive to evaluate. Consider the features I listed above whenever you are planning to centralize a data model using Data Server. And of course keep an eye on our product announcements and release notes, because we are always working to improve performance and make users more successful with powerful features like Data Server.
Thank you for taking time to address my questions, Robert. The slow performance and the necessity to rebuild filters when replacing extracts with a Server data source make me want to continue working the way I have been--even if duplicating extracts/data sources is the result. Or should I create an extract while working with the workbook(s) connected to the published data source, and simply remove the local extract when publishing?
And again, if I publish a workbook connected to a Server data source, I do NOT need to publish it with an embedded extract on a refresh schedule? I would simply set the schedule on the published data source--is this correct?
Thanks again. I'd like to use this feature, if it is going to help performance in the long run, but I don't really know if doing so would be any better than publishing two or three variations of the same workbook with an embedded extract.
One of the current limitations of the Data Server is that it does not support the concept of a temporary table, which the Tableau query compiler prefers to utilize for very large or complex Filters, Sets, Ad-hoc Groups, and many Data Blending operations. Without temporary table support, Tableau must generate complex queries in the Data Server query language that can be expensive to evaluate.
Robert Morton, do the same limitations hold true for a data source which is "embedded" in a Tableau workbook vs. a workbook leveraging a Data Server data source? Based on what you said, I think the answer is "No, an embedded data source (extract-based or live) can create temporary tables", but I want to be 100% sure.
Does anyone out there have actual experience sharing published data sources between workbooks? Would love to hear about those experiences, and whether it's even worth doing.
Correct, you do not need to create a local extract of a Data Server data source when you publish the workbook to Server. The remote Data Server data source will hold the original extract, if one was created for that data source, and can be configured to refresh on a schedule on Tableau Server.
Regarding reasons to prefer Data Server, the biggest advantage is the ability to provide your colleagues with easy access to a rich data model. For the author of the Data Server data source, it is not quite so simple, as you have pointed out (due to fallout from replacing a data source, investigating performance concerns, etc.). You may wish to keep your prior approach of having many separate data sources with their own extracts, so long as your extracts are reasonably small (rough guess, <50MB) and reasonably quick to refresh on a scheduled basis.
Data Server does provide some unique benefits, one of which is its handling of Data Source Filters. Data source filters are authored in Tableau Desktop as filters that are embedded directly in the data source. After publishing the data source to Tableau Server, you can grant only 'query' permission (and not 'download') on the Data Server data source to ensure that users of that data source will be subject to your filters without knowing they exist or being able to modify (or remove) them. This may be useful for eliminating dirty data (out of sight, out of mind), or to enforce data security with a relative date filter or a User filter that Data Server will manage based on the identity of the logged-in Tableau Server user.
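As a rough mental model of the user filter case above, the following Python sketch shows a filter that is applied on the server side based on who is asking, so the caller never sees or edits it. This is not how Data Server is implemented and uses no Tableau API; the rows, the `USER_REGION` mapping, and the `query_as` function are hypothetical.

```python
ROWS = [
    {"region": "East", "sales": 100},
    {"region": "West", "sales": 250},
    {"region": "East", "sales": 75},
]

# Hypothetical mapping from a logged-in server user to the region they may see.
USER_REGION = {"alice": "East", "bob": "West"}


def query_as(user, rows=ROWS):
    """Return only the rows that the user's filter allows."""
    allowed = USER_REGION.get(user)
    # Users with no mapping match nothing, so they get an empty result.
    return [row for row in rows if row["region"] == allowed]


print(sum(row["sales"] for row in query_as("alice")))  # 175
print(query_as("carol"))  # []
```

The point of the sketch is only that the filter lives with the data source rather than with each workbook, which is why granting 'query' without 'download' keeps it out of users' hands.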
I hope this helps,
As a relatively new user of Tableau, I truly appreciate your contributions to this thread, and I want to thank you for taking the time to reply. I agree that publishing data sources is a very powerful feature for some organizations. I guess time will tell what approach we decide to utilize, and perhaps this feature will become more robust in future releases. It would be a huge benefit for me to have an easy way to upload extracts and point several workbooks to them while retaining fast performance in Desktop and Server as well as the benefits of working with an extract as an author in Desktop.
Glad to help! I hope you're planning to come to the Tableau Customer Conference, since it's a great opportunity to learn more about Tableau best practices and success stories from both Tableau users and Tableau employees.
I'll be there!
Just came across this interesting discussion and wanted to ask you what method you decided on - connecting your workbooks to a shared published data server vs duplicating the same connection in multiple workbooks?
I am facing the same challenge - though currently I have duplicated the same data source in 15 different workbooks. I am starting to notice that every morning when I log in to Server, at least one of the workbook refreshes has failed. The failure reason is a bit cryptic to me, but what is most strange is that when I run a manual refresh on Server it works fine. I suspect that there are too many refreshes fighting for priority and eventually one of them gets the boot.
I would love to make the switch over to 1 published data server, have that refresh once and be done with it. It would also save me a lot of pain and anguish when it comes to having to reset the password in the data source every 90 days...which I currently have to do for every published workbook.
As you mentioned before, my biggest hesitation is that I'm sure switching to a shared data server is going to break my existing workbooks. Although they are all based on the same data source, there are many different calculated fields and user filters in each workbook. If you would be so kind as to share the pros and cons you've experienced here, that would be so helpful.