1 Reply Latest reply on Mar 22, 2012 12:50 AM by Jonathan Drummey

    Pros/Cons of a Data Extract

      I'm attempting to learn Tableau and keep getting snagged.

       

      I have an Excel file with multiple tables/worksheets.  When I join them, the resulting data view contains duplicated rows, which I understand is expected.  One workaround the forum describes for duplicate rows is to use CNTD (count distinct).

       

      I'm finding that when I use a data extract of an Excel spreadsheet:

      1) I can use Count Distinct and

      2) I cannot group my dimensions and measures by table in the left-hand panel

       

      And when I use a live connection to the *.xlsx file I find that:

      1) I cannot use count distinct and

      2) I can group my dimensions and measures by table

       

      Is this expected functionality?  Is there a way around duplicate rows if you're working with a live connection?

        • 1. Re: Pros/Cons of a Data Extract
          Jonathan Drummey

          Yes, this is expected, though not necessarily ideal. When you're using the live connection to Excel, Tableau "knows" what you are connecting to and creates the hierarchical grouping by source table in the Dimensions and Measures windows. When you're using the extract, Tableau has "flattened" your data and does not show that source table information.

           

          As for the duplicate rows, avoiding them means being careful about how the joins between the tables are defined, and possibly using Custom SQL. Sometimes there is no way around duplicate rows and you have to do some filtering within Tableau. I can't say more without seeing how your data is laid out. If you post a packaged workbook with some sample data, someone here on the forums could help you out.
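
To make the duplicate-row behavior concrete, here is a minimal sketch of what a one-to-many join does and two ways to work around it. It uses Python's built-in sqlite3 module as a stand-in for the data source, and the `orders` and `order_items` tables are hypothetical, not taken from the workbook in question. The SQL dialect behind a live Excel connection may not accept everything shown here (for instance, count distinct may be unavailable, as described above), so treat the subquery as an illustration of the Custom SQL idea rather than something to paste in as-is.

```python
import sqlite3

# In-memory database standing in for two worksheets (hypothetical sample data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (order_id INTEGER, item TEXT, qty INTEGER);
    INSERT INTO orders VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO order_items VALUES (1, 'pen', 2), (1, 'ink', 1), (2, 'pad', 5);
""")

# Plain join: order 1 matches two item rows, so it appears twice in the result.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM orders o "
    "JOIN order_items i ON i.order_id = o.order_id"
).fetchone()[0]

# Workaround 1: count distinct collapses the repeated order IDs.
distinct_orders = conn.execute(
    "SELECT COUNT(DISTINCT o.order_id) FROM orders o "
    "JOIN order_items i ON i.order_id = o.order_id"
).fetchone()[0]

# Workaround 2 (Custom SQL style): aggregate the detail table down to one row
# per order before joining, so the join no longer multiplies rows.
pre_aggregated = conn.execute("""
    SELECT o.order_id, o.customer, t.total_qty
    FROM orders o
    JOIN (SELECT order_id, SUM(qty) AS total_qty
          FROM order_items
          GROUP BY order_id) t
      ON t.order_id = o.order_id
    ORDER BY o.order_id
""").fetchall()

print(joined_rows, distinct_orders, pre_aggregated)
```

The plain join returns one row per item, so any order-level measure summed over it would be overstated. Count distinct hides the duplication for counts only, while pre-aggregating removes it at the source, which is why it tends to be the safer choice when order-level sums are also needed.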