4 Replies Latest reply on Aug 24, 2016 9:11 AM by danielj..aurit

    2 Parts: What is Published to Server with Joined CSVs



      Actually, a couple of questions here. This will probably be a "no-brainer" for most of you.


      Part1: Joined data... DataDump1.csv & DataDump2.csv.

      * In the Data Source window I see the files: "DataDump1.csv" and "DataDump2.csv"

      * In Tableau worksheets, under the "Data" tab, I see: "DataDump1" & "DataDump1+" (dump1 & dump1+)

      * Both DataDumps are available in the Dimensions card

      1a) Why don't I see DataDump2.csv?

      1b) What is DataDump1.csv? Is it the resulting data from the join?


      Part 2: Check my understanding please:

      I have a set of dashboards (Project) I want to publish (easy enough)

      The data is updated on a weekly basis.

      My thinking is that I only need to publish the updated data dumps. I would only need to update the dashboard package if updates are needed (other than new data).

      2a) How do I publish (maybe I don't need to) DataDump2.csv?

      2b) Maybe I just need to publish DataDump1 and DataDump1+?


      Thanks in advance!

        • 1. Re: 2 Parts: What is Published to Server with Joined CSVs
          Jonathan Drummey



          Some screenshots would be useful to make sure we're talking about the same thing, but I'll do my best. Also, Tableau v10 changes things up a bit, so I'll assume we're talking about v9.3 or earlier.


          Part 1)


          What you see in the data prep window are two things: the raw data you have available to you (in files, worksheets, tables, etc.) *and* the Tableau data source that you create:


          [Screenshot: Screen Shot 2016-08-24 at 6.59.09 AM.png]


          One metaphor that I use is a food one: the files/worksheets/tables are the "raw ingredients". You combine them to create the Tableau data source, and part of what you do there is set various defaults (aggregation, colors, aliases, shapes, etc.), add commentary, and create calculated fields. This is like preparing the parts of a meal. Then you use your Tableau data source(s) to build out the views and dashboards, which are the plated dishes.


          Since you're seeing two elements in the Data window in the main Tableau view, you've created two separate Tableau data sources. Here I have VizAlerts_District1_data and VizAlerts_District1_data-2:


          [Screenshot: Screen Shot 2016-08-24 at 7.00.26 AM.png]


          I'm guessing that your DataDump1 source is connected only to DataDump1.csv and that the DataDump1+ source is a join of DataDump1 and DataDump2. Tableau's default behavior is to build the name of a Tableau data source by appending a + sign to the name of the first file/worksheet/table used in a join. If you right-click on a Tableau data source and choose Edit Data Source... you'll go back to the data prep window and see exactly what is included in that source.
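
          To make the guess above concrete, here is a minimal Python sketch of the idea, not of how Tableau implements it. It assumes a left join on a shared key, and the file contents, the column names (id, activity, owner), and the helper functions are all made up for illustration. The point is only that the "+" source already carries the columns from the second file, while the plain source does not.

```python
import csv
import io

# Hypothetical contents standing in for DataDump1.csv and DataDump2.csv;
# the column names (id, activity, owner) are invented for illustration.
DUMP1 = "id,activity\n1,Call\n2,Email\n3,Visit\n"
DUMP2 = "id,owner\n1,Ann\n3,Raj\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join(left, right, key):
    """Keep every left row; add matching right columns, or None when no match."""
    right_cols = [c for c in right[0] if c != key] if right else []
    lookup = {}
    for row in right:
        lookup.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        # An unmatched left row still appears once, with empty right columns.
        matches = lookup.get(row[key]) or [dict.fromkeys(right_cols)]
        for match in matches:
            joined.append({**row, **{c: match[c] for c in right_cols}})
    return joined


def default_source_name(first_table):
    """Mimic the naming described above: the first table's name plus '+'."""
    return first_table + "+"


dump1 = read_rows(DUMP1)
dump2 = read_rows(DUMP2)

# Two separate "data sources": one on the first file alone, one on the join.
sources = {
    "DataDump1": dump1,
    default_source_name("DataDump1"): left_join(dump1, dump2, "id"),
}

for name, rows in sources.items():
    print(name, list(rows[0].keys()), len(rows))
```

          In this sketch there is never a standalone "DataDump2" entry: the second file only shows up as extra columns inside the source named DataDump1+.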


          I don't know which one(s) you need for your analysis, so I can't tell you whether you need just DataDump1, just DataDump1+, or both.


          Part 2) I'm going to assume you are using Tableau Server here. If the Tableau Server run-as user (configured by your Tableau Server admin) has permissions to access the CSV files, then you can either set up a live connection to the files (so it would refresh pretty much as soon as the data file was updated, with no effort on your part) or set up Tableau Server to automatically refresh extracts (if you are using a Tableau data extract).


          "Publish" has a variety of meanings in Tableau. We can publish Tableau workbooks that include the data, Tableau workbooks that don't include the data (but leave it as a live connection or have the connection data embedded so Tableau Server can refresh an extract), a Tableau Server Published Data Source that includes the data, and a Tableau Server Published Data Source that doesn't include the data (and is either left as a live connection or set up as an extract). I'm not sure which one you mean here.


          No matter what, if DataDump2.csv has been included in the Tableau data source (such as my guessed-at join in the DataDump1+ source), then you don't need to publish it separately.


          For more details, check out this post (by me). It's about MS Access, but MS Access is a file-based source like Excel & text files, so if you mentally replace "Access" with "folders and text files", most of it is applicable: I Have Wee Data – Microsoft Access and Tableau.



          • 2. Re: 2 Parts: What is Published to Server with Joined CSVs

            Thanks Jonathan, you did a great job of interpreting my question. I suppose some visuals would help...

            Much of what you wrote makes sense.

            Here's the image of the Data Source. As you'll notice, we have a left join.


            Image of data in the worksheet: I know I can publish the data with the package. However, I'd like to just be able to publish the new set of data as it's available (until I have it automated)...


            Thanks again, much obliged!

            • 3. Re: 2 Parts: What is Published to Server with Joined CSVs
              Jonathan Drummey

              Hi Daniel,


              Just to be super clear, the first screenshot that you showed is of just the Activities Data Dump Tableau data source. The Activities Data Dump+ source is a separate Tableau data source.


              As for which data source(s) to publish, that depends on which data source(s) are used to build the worksheets in your workbook. You can tell whether a data source is in use by going through the worksheets and seeing whether it has a blue or orange checkmark icon next to its name. In your screenshot above, Activities Data Dump is the source used for that particular worksheet.


