1 Reply Latest reply on Jul 22, 2014 7:07 AM by Jonathan Drummey

    Data Blending Assistance

    Andrew Herman

      Hi everyone,

       

      I was hoping to get some assistance with a basic data blending query I am struggling with.

       

      I am trying to compare the performance of two different contractors who are employed to undertake refurbishment works on properties. Unfortunately the data for both these contractors is stored on separate excel files so I imagine I need to carry out some data blending to get the views I require. I have attached sample data sheets which show properties being worked on by contractor A and B. I would like to compare the two contractors performance against a variety of different variables such as;

       

      • Total number of properties, (grand total and split by contractor)
      • Status of properties, (grand total and split by contractor)
      • Number of properties completed by month, split by contractor
      • Average number of days to complete properties
      • Number of kitchens/bathrooms installed

       

      I have attempted to blend the data in Tableau but am currently unable to merge the two data sources in to one view so that I can undertake the comparative analysis.


      Many thanks,

      Andrew

        • 1. Re: Data Blending Assistance
          Jonathan Drummey

          Hi,

           

          You'll need to do some sort of union to merge your two data sources together (one way I think of a data blend is that it lining up the data sources "side-by-side", and what you need is more of a "stacked" data source). There are several ways to go about this:

           

          - Merge your original data by hand.

          - If both data sets are in the same database (or Excel workbook, or text files in the same directory) then you can use Tableau's Custom SQL to do a UNION.

          - If the data sets are from different databases/Excel worksheets, you can create a federated data source using linkes servers/tables/worksheets. ODBC is a pretty universal technology for doing this, you can find more by doing a web search on "linked servers", "linked tables", or "linked worksheets" for your specific data sources.

          - Alternatively, you can build a scaffold data source that has all the dimension values, then you can use that as the primary data source in a Tableau data blend and your original data sources as secondary data sources and calculated fields to pull them together. The scaffold does not have to be manually built; it can be dynamically constructed using queries - usually some combination of union, left join, and/or cross product/Cartesian joins. There are also possibilities to do build some of the scaffold in Tableau via Tableau's data densification features (domain padding and domain completion). The queries to build the scaffold may live in one of your data sources or can be built as Custom SQL in Tableau. Joe Mako did a Think Data Thursday on this topic at: http://community.tableau.com/message/213370