1 Reply Latest reply on Aug 11, 2013 10:43 PM by Jonah Kim

    Data Blending 4 Data Sources

    Shawn McClory

      Here is the Monday Morning challenge.

       

      I am working on a complex workbook containing 4 data sources.  While my example is simple in terms of data, all the data sources are separate in their originating locations and cannot be joined via SQL at this point.

       

      The example data sets are structured similar to my real data so they should provide a similar scenario.

       

      My goal is to Blend the combined total sales of both sales data sets by Car Hierarchy  and Fiscal Calendar.CarExample.png

      The Hierarchy contains only Make Model and Model Year and Reference Id.  Where Reference Id will provide the link to the Sales files.

      The Sales files contain Calendar Sale Date, Sales and Reference Id.

      The Fiscal Calendar contains several Fiscal Calendar fields.  The date can be joined to the Sale Date in the Sales files.

       

      At this point my question is how do I blend all four of these data sources to recreate the above gird.

       

      Let me know if you need further details and thank you in advance for any help you can provide.

        • 1. Re: Data Blending 4 Data Sources
          Jonah Kim

          To make the blend possible, it will be necessary to combine the data in the Sales 1 data source with the Sales 2 data source. This could normally be done with a UNION of the two tables but because it is not possible to perform a UNION with tables from different data sources, it is necessary to first extract the data and then combine them.

           

           

          The extraction and combining of the data can be done at the database level or the Tableau Workbook level.

           

           

          I will explain how to perform it at the Tableau Workbook level.

           

           

          Connect to the Sales 1 and Sales 2 data sources and then create an extract in Tableau Desktop for both (right click and select Extract Data). Take the extract when the workbook is either not saved or saved as a .twb file (not a .twbx file) so that you can save the extract files as separate files on your local machine.

           

           

          Once the extracts are created the database symbol (a single cylinder) next to each data source should show the extract symbol (a cylinder with an arrow coming from another cylinder behind it). Right click on the Sales 1 data source and select Extract > Add Data from File. Select the Sales 2 data extract file from the location you saved it. By doing this you have successfully added the data from Sales 2 to the Sales 1 table. Note that for this to work both tables must have the exact same number of columns with the same column names as the Sales 1 and Sales 2 data sources do in your attached workbook.

           

          On a clean sheet place Sale from the combined Sales data source in the view or on Text.

           

          Switch to the Hierarchy data source. Select the linking field symbol to the right of the Reference ID field so that it changes to an Orange color. Place Make and Model on the rows shelf in that order. Exclude any Null values.

           

          Switch to the Fiscal Calendar data source and select the linking field symbol to the right of the Day field so that it changes to an Orange color. Place F_Week on the columns shelf. Exclude any Null values.

           

          Of course Sales figures for Make and Models will only appear if they exist for their matching reference IDs in the combined Sales data source.

           

          On a separate note:

           

          It's possible to automate the process of updating the data using Tableau Server. This involves publishing the extract from the Sales 1 data source to Data Server and refreshing that extract using a incremental refresh (with a column that identifies new rows of data using some sort of identifying field like a Date_Added field). You would also take an extract from the Sales 2 data source and publish that to Data Server and refresh it with the most current rows of data using the incremental refresh method.

           

          Using tabcmd (our command line utility) you can generate .tde or .csv files from the refreshed Sales 2 data sources after its been refreshed and then append the Sales 1 extracted and published data source with that generated file (this can be done with the append from file Tableau desktop command). The automation can be done using a batch file that combines all the various commands and a task scheduling program that is running from a machine that is always powered on, has access to the various data sources and can connect to Tableau Server.