2 Replies Latest reply on May 22, 2017 8:57 AM by David Hwaszcz

    Maintaining one source of truth, while uzing multiple extracts

    David Hwaszcz



      I have a source table that has 100 columns, and ~25 million records (not tableau friendly).  To enhance performance in my workbook, I've created two (2) aggregated extracts for the high-level views of the data, and then an extract of the full data set when a user needs to drill-down to the most granular detail.  I'm using Tableau Server for scheduled refreshes of the extracts, and saved Data Sources to manage metadata.


      My question is on managing the Data Sources - Is there really no way to manage one source of truth?  I don't believe I can place three (3) .tde files into one (1) .tdsx file, the flexibility doesn't seem to exist.  So if I rename a field in one of the .tdsx files, I have to make the same change in the others.  How are others managing this?



      .tde #1:  person_id, category, subcategory, product, dollars

      .tde #2:  category, subcategory, product, dollars

      .tde#3:  category, dollars




        • 1. Re: Maintaining one source of truth, while uzing multiple extracts
          Vincent Baumel

          I don't have a solution, but I'm curious about your 100 column table. Are you using all 100 columns in one way or another? We've got some pretty wide tables too, but I've created extracts that only bring in the fields I am using in the visualizations/calculations, and it reduces the column count by quite a lot. This helps the whole thing to run a lot faster. Is this similar to what you did with the aggregated extracts?

          • 2. Re: Maintaining one source of truth, while uzing multiple extracts
            David Hwaszcz


            Our ETL Team has created a 100 column flat table (materialized view) for this dashboard due to externally sourced data which requires a heavy amount of logic to relate it to our proprietary data.  I'm actually using two (2) aggregated extracts for the C-level & VP dashboards in workbook (a), with the ability to drill-across to workbook (b), which uses the 3rd extract (all 100 columns).  I'm in healthcare and once we get to the patient level, there are an excessive amount of dimensions needed to contextualize the situation.  But, this level of detail is often times needed in order for us to provide actionable data to case managers.


            I'm still fairly novice to Tableau (we've been a Cognos shop for many years, introduced Tableau in March 2017), so our approach may need some adjustments.  This seems to work performance wise, but I'm scratching my head at how to maintain one source of truth with multiple aggregated extracts.  The closest solution I've come up with is to maintain them all in one workbook, each of them being a separate worksheet, and when making a change to one, make the same change to all that utilize that data item at the same time.