3 Replies Latest reply on Nov 30, 2012 2:03 PM by Catherine Rivier

    Summary Results when using two data sources

    Ron Chipman

      I am blending two data sources together to measure the success of a campaign for our marketing team, and I can't remember the work-around shown at the TCC for Version 7 to get a summary result versus a huge table showing every customer record.

       

      Anyone remember or know?!

        • 1. Re: Summary Results when using two data sources
          Catherine Rivier

          I (unfortunately) wasn't at TCC, so if there's an easier solution than the way I do this, I'd love to know!

           

          But I do have to do a lot of data blending.  The way I do it is 2 part:

           

          1. Move the individual customer record (linking field) to the Level of Detail.  This will give you summary info, but with all data overlapping.
          2. To get rid of overlaps, use this calculation instead of your original calculation:

                         IF FIRST()==0 THEN (your calculation) END

           

          It's basically only looking at the first record in each partition, so if set up right it will only show the first record.  If summing, you'll need WINDOW_SUM or things like that.

           

          Here's a one really great post on that second part:

          http://community.tableau.com/thread/118885

          • 2. Re: Summary Results when using two data sources
            Ron Chipman

            Thanks Catherine,

             

            I'm still very new to this, and I think I'm overthinking it here.  Marketing has different codes for their segments on the list of "candidates" and then on the "results" sheets.  I am joining on email address (I know this is ugly, but its what I've got).


            I simply want to know which "candidates" actually purchased.  The two files are semi-congruent, so I have data on both pages that don't exist on the other.  I'm using the results page as my primary source.  What is the calculation that I should use to see the sum of the binary result of 'exists' and 'does not exist'?

             

            In excel, this is easy, but I want to keep it in Tableau, so I'm struggling like mad here!

            • 3. Re: Summary Results when using two data sources
              Catherine Rivier

              Linked tables are always tricky in Tableau, that's why it always gives you the friendly recommendation to join in the data source if you can

               

              Attached is an example.  Not sure exactly what your data looks like, but I just made a sample file with 50 e-mail addresses on sheet Candidates, 31 on sheet Results - no duplicates on either sheet.  There are 26 that match, 5 that are on Results and not Candidates, and 24 on Candidates not Results.

               

              Just as a quick tip here on joins:  if I have to link a long text field like e-mail address, I always like to make a copy of the field to use as the link, where I use TRIM and change to all uppercase.  I also give the linking fields exactly the same name - saves manual work as Tableau will automatically recognize these as links.  So both have a new calculated field EMAIL-LINK:

                   trim(upper([EMAIL]))

               

              You can see the results of the link in tab Linked.  Then in tab FinalCount, there are two results.  First, I pulled both EMAIL-LINK fields from both data sets into the Level of Detail page.  Then I created a new calculated field, UniqueEmails, that's just a COUNTD of EMAIL.

               

              You can see just pulling UniqueEmails onto the sheet you get that overlapping text issue - I left it here as example.  The way to avoid this, is to use the calculated field UniqueEmails-NoOverlap.  This uses the FIRST()==0 trick in the discussion above, which will be your solution most of the time in these situations.  This field is:

               

                   IF FIRST()==0 THEN

                   window_sum([UniqueEmails])

                   END

               

              This should be partitioned (e.g. the window should be) by EMAIL-LINK (select Compute Using... then EMAIL-LINK).

               

              Final piece is that if you only want the count of records where they match, filter the linked EMAIL-LINK (the one in Candidates) to exclude Null and * (* being the non-matching values).  Without this you get all values in Results.

               

              Is this what you were looking to do?  If not, feel free to submit a sample data set and I'll be happy to take a look!

              Hope this helps,

              Catherine

              1 of 1 people found this helpful