4 Replies Latest reply on Dec 24, 2013 1:23 PM by Tom Hier

    data blending and joining -- basic help needed!

    Tom Hier

      I am a complete novice in the area of data blending and joining.  At this past year's Tableau Conference, the Tableau Docs came up with an ingenious solution for a viz that involved data blending and / or joining.  For a while it was sufficient just to use the workbook that they developed, but now I actually have to understand what they did.  Leaving aside the substantive aspects of the blending formulas they derived, I need some basic help in understanding how to navigate my worksheets to substitute a new dataset for an old one that is connected (or blended or joined?) to another.


      I am clueless as to where to begin, but I believe that anyone who understands the basic mechanics of data blending / joining can probably help me understand what to look for.  I'd be greatly appreciative of any help that someone can provide.  I don't want to broadly post a packaged workbook because it includes client data, but I can share a packaged workbook with any one who might be able to help me.


      I would be very appreciative of anyone with data blending / joining skills who could spend a few minutes helping me with the basics.



      Tom Hier

        • 1. Re: data blending and joining -- basic help needed!
          Phillip Burger

          Hi, Tom. For many that are just learning Tableau, blending seems to cause a lot of problems. It shouldn't but I think the problem reflects the heterogeneous data environments that are common to complex organizations. I just watched this video. I have experience blending and this video is perfectly understandable and nails it:



          Data Blending in Tableau | Tableau Software


          Read on below, but first, if you merely want to want to replace data sources, I recommend this knowledge base article:


          Replacing a Data Source | Tableau Software


          I've found that there is a hierarchy to what is the easiest type of "data integration" to use. Push all the joining and blending and manipulation down to the database if you can. Create and use a database view if you can and use the data source corresponding to your DB.  Next in hierarchy is to use joins of your DB tables using SQL. Finally, use blending.


          Some other ideas. Break down the problem. If you are using a DB, develop using text files. Export data to text files and use the text file data source.


          Next, and pick up here if your source data is in Excel/text to begin with, take some samples of your data and create one, new Excel/text file that has the data structure you need. Use this as your data source. Get it to work.


          If that works, create two sample files, being sure that each contains a field that you can match on in the blend.  You have to have a primary data source.  Rule of thumb is to use the "helper" file as the secondary.


          If you have common field names in both files, make sure they are named the same. Tableau will probably get it right in suggesting the blend.  Be sure to click the link, too.


          And, this is big, use "View Data" for each of your sources after you connect. If there is wrong or partial data in columns that it shouldn't be there, such as a date field showing incomplete dates, or a name field contains numbers, you will probably have to use a schema.ini file.  If your data looks ok in "View Data" on the sample Excel/text files, but is discombobulated with larger files, it's an indication that a schema.ini might be needed.


          Break it down and get something easy to work then build up.


          Post a sample workbook to the thread if you continue to encounter difficulty.

          1 of 1 people found this helpful
          • 2. Re: Re: data blending and joining -- basic help needed!
            Tom Hier

            Thank you, Phillip.  I appreciate the reference material you provided.  Unfortunately, I think my problem is a bit more complicated.  At the conferenece, the Tableau Docs created a new database and a formula that works off of fields in that database (DBMinute) and another database (DBCourses011).  First problem is that I cannot even find the DBMinute database in my Data window, so I'm not sure how it's even being blended or joined to the DBCourses2011 dataset.  (All my data are Excel files, by the way.)


            So, my specific questions are as follows:


            1.  Where and how do I find the DBMinute dataset?

            2.  How do I determine what formula(s) has(have) been created to link DBMinute to DBCourses2011 datasets?

            3.  Can I point Tableau to a dataset in a new Excel file so that I can analyze more recent data (i.e., DBCourses2013)?


            I've attached a packaged workbook to help you understand the problem.


            Thanks much.

            • 3. Re: data blending and joining -- basic help needed!
              Jonathan Drummey

              Hi Tom,


              If you right-click on your data source and choose Edit Connection..., you'll see this dialog:


              2013-12-20 16_19_06-Tableau - Test File from Tableau 2013 Conference Using Blended Data.png


              This tells us two things. One is the name of the underlying data source or file (DBCourse 2011.xlsx), the other is the table selection that shows that there's a DBMinute table (worksheet) in that spreadsheet along with the DBCourses2011 spreadsheet, and that the tables had two joins on the DBMinute field. I'll cover the data source part first:


              A Tableau Packaged Workbook is actually a zip file, you can unzip it to end up with a .twb file and a "Test File from Tableau 2013 Conference Using Blended Data.twb Files" folder. Inside that folder are your data sources, in the Data\01Working Data folder is the DBCourse 2011.xlsx file. You can then open that file and see the DBMinute worksheet, which is just a list of minutes from 12:00:00 AM to 11:59:00 PM:


              2013-12-20 16_20_48-Microsoft Excel - DBCourse2011.xlsx.png


              Now onto the data. There wasn't actually a blend here, instead the Tableau doctors set up a left join. Back in that Edit Connection window, you can click on the DBCourses2011 entry, then click Edit... to bring up the Edit Table dialog, then click on the Join tab to see the exact join criteria:


              2013-12-20 16_26_27-Tableau - Test File from Tableau 2013 Conference Using Blended Data.png


              That takes care of the first two questions, now for your third question. You have at least three options, I've ordered these from least effort to most effort***:


              1) In Excel, copy the 2013 data on top of the 2011 data in the DBCourses 2011 file.

              2) In Excel in your 2013 data file, create a DBMinute worksheet that duplicates the one from the 2011 file, then in Tableau go to Edit Connection on that data source with the join and change the Excel file to your 2013 data.

              3) In Excel in your 2013 data file, create a DBMinute worksheet that duplicates the one from the 2011 file, then in Tableau create a new data connection that duplicates the joins from your 2011 connection between the two worksheets, then use Tableau's Data->Replace Data Sources function.


              *** That's the effort to get your analysis working for 2013 data, I don't know about your environment for maintaining your files and file versions, that might lead you to pick a different route.


              Also, given the join and number of records, I'd suggest using a Tableau data extract, that should be quite a bit faster.



              1 of 1 people found this helpful
              • 4. Re: data blending and joining -- basic help needed!
                Tom Hier

                Jonathan,  Thank you for your helpful reply.  I'm slowly working through all of your suggestions.  It seems as though they should work for me.  I'm still pretty new when it comes to data formulas, etc., so I may run into some snags.   But at least I have a better conceptual understanding of what's happening now, and that's an important first step.   Tom