I'm hoping to gather thoughts on best practices for managing data sources, including Tableau extracts. I have looked in the knowledge base and searched the Tableau site but haven't found much guidance on this. It will be crucial for our company if we are going to stay on top of our data.
Until about a month ago, I was the only person at my company using Tableau. I extract all of the data sources to improve performance and gain extra functionality (e.g., count distinct). I generally store Tableau extracts in a sub-folder alongside the workbook that uses them. I often try to feed multiple workbooks from a single extract, so that I only have to refresh one extract to update the data in all of the workbooks that use it.
However, my company now has 8 Tableau users, and we are connecting to many different data sources. I am realizing that my current system is not scalable and that we need a better way to store, organize, and track our data.
Issues that I have run into that make me worried about scalability:
- Storing extracts in sub-folders means that if the folder structure changes, the links break and we have to reconnect.
- It is easy to lose track of extracts in different sub-folders, especially if there are multiple versions.
- To use a global filter in only part of a workbook, I need to duplicate the extract, which leads to untenable proliferation.
- To increase performance, I want some extracts to be aggregated and others to contain the full data set, which again leads to proliferation.
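To make the tracking problem concrete, here is a minimal sketch of how the scattered extracts could at least be audited. It assumes the extracts are `.tde` (or `.hyper`) files somewhere under one shared root folder. The function names `find_extracts` and `find_duplicates` are hypothetical helpers I made up for illustration, not anything Tableau provides.

```python
import os
import sys
from collections import defaultdict

# Assumption: extracts are saved as .tde files (or .hyper in newer versions).
EXTRACT_EXTENSIONS = {".tde", ".hyper"}


def find_extracts(root):
    """Walk `root` and group extract file paths by lowercased file name."""
    by_name = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            ext = os.path.splitext(filename)[1].lower()
            if ext in EXTRACT_EXTENSIONS:
                by_name[filename.lower()].append(os.path.join(dirpath, filename))
    return dict(by_name)


def find_duplicates(by_name):
    """Keep only extract names that appear in more than one location."""
    return {name: sorted(paths) for name, paths in by_name.items() if len(paths) > 1}


if __name__ == "__main__":
    # Hypothetical usage: pass the shared root folder as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, paths in sorted(find_duplicates(find_extracts(root)).items()):
        print(name)
        for path in paths:
            print("    " + path)
```

Something like this only lists copies that share a file name. It does not tell me which copy is current or which workbooks point at it, so it is a stopgap rather than a real answer to the governance question.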
I would love any thoughts or guidance on best practices for storing, maintaining, and accessing extracts, as well as general thoughts about data management and Tableau. I know Tableau is used enterprise-wide in many companies, so I'm sure that there must be examples of good data governance. Thank you in advance for any advice.