8 Replies Latest reply on Sep 19, 2016 1:58 PM by Parthasarati Dileepan

    Primary and secondary data source in blending

    Parthasarati Dileepan

      I want to understand how the records from the two data sources are blended ... like in join we have left, right, inner, and outer.

       

      For example:

      Data Source 1: Customer, Area code, Sales

      Data Source 2: Area Code, State

       

      I am blending these two data sources to create a filled map.  If I start with Data Source 2 by putting state into the chart, which makes it the primary source, then go to Data Source 1, I am able to generate the filled map.

       

      But, If I start with Data Source 1 as the primary and make sure it is linked to Data Source 2 with Area Code, I am not able to generate either Filled Map or Symbol map. 

       

      I want to understand why this so.

        • 1. Re: Primary and secondary data source in blending
          Simon Runc

          hi Parthasarati,

           

          I don't have much time currently, but saw your post and is one of my fave subjects!!

           

          If you look at my response here

           

          Help! Calculating a % with an aggregated and non-aggregated number

           

          and also a blog I wrote on the subject

           

          The Importance of Granularity (…to Blend or not to Blend?) | The Data Animators

           

          they will hopefully get you started.

           

          Let me know if after reading these you still have some questions, and I can answer those specifically

          • 2. Re: Primary and secondary data source in blending
            Matt Lutton

            I suggest you also study the "Data Blending: How it is (and is not) like a left join" at the link below -- this is the video that finally helped me wrap my brain around Blending in Tableau:

             

            TDT Video Library

            1 of 1 people found this helpful
            • 3. Re: Primary and secondary data source in blending
              Parthasarati Dileepan

              Thanks for the responses, they are great in trying to understand the difference between join and blend ....

               

              but my original problem still is not resolved ... if I start with Data Source 2 and bring in State, the geo location data is there and the map renders fine.  But if I start with Data Source 1 and make sure the link is fine, the geo data for state is not present, blending occurs fine, but Longitude and Latitude is not there and no maps ...

              • 4. Re: Primary and secondary data source in blending
                Simon Runc

                So the reason this occurs is due to how the blending works...from the above links/videos the key take-out is that a blend performs as an aggregate and join...meaning that any fields that are brought in from the secondary source always come in as aggregates, and we are unable to set an aggregate measure a geographic role.

                 

                So if I try to bring in State, from DS2 into DS1 to use as a field...you'll notice it is wrapped in a ATTR, and can only appear in the measures pane (not the Dimension pane). It is an aggregate calculation

                 

                 

                Even if I just stick to a single data source and try and create an (aggregate) calculated field, which returns the state in certain circumstances....

                 

                I do not have the option to give this field a 'geographic' role...

                 

                If we ignore the 'mapping' requirement, and just use the data-sources to create a summary by state, using DS1 as our primary source, we can do it...but it gets pretty complicated. We first need to have PostCode in the vizLoD, so the aggregate State (from DS2) is displayed as "each-state"...remove the [Postal Code] field from the 'Need PC in LoD' tab and see what happens. This means we get multiple values returned per state (one per Postal Code)...and so we need to get a little funky to only return a single value (if you are interested, let me know and I can explain how the attached solution works, using the IF FIRST()=0... and the 'ignore in Table calculations' option).

                 

                What you are proposing is using data-blending for Master-data dimensions, for which blending was never intended for. Pre Tableau 10 there was a use-case....if data resided in different data sources (eg. you have sales by zip-code in SQL Server, and have a zip-code -> State lookup table in Excel), but as we can now join across data-sources, if a 'proper' join is appropriate (as it would be here) that's what we should do. The main use-case for blending now (IMHO) is when the data has 2 different levels of granularity (as I discuss in my blog post), and this is where blending really comes into its own (our DBA's are pretty jealous of this feature!!).

                 

                I hope that helps explain things a little, but please post back if any of this doesn't make sense.

                1 of 1 people found this helpful
                • 5. Re: Primary and secondary data source in blending
                  Parthasarati Dileepan

                  Thank you Simon for taking the time to answer my question is such detail ... I learned a lot from your responses, I truly appreciate it ..... D

                  • 6. Re: Primary and secondary data source in blending
                    Parthasarati Dileepan

                    The more I think I understand the more I feel I am in over my head

                     

                    [1] I notice in "Need PC in LoD" sheet SUM(Sales) is aggregating to the level of PC and lists each PC level sum along the row for the corresponding State, and not the grand sum for the state like it would in a Join.  So I am thinking in a Blend the individual aggregated value at the level of granularity of the blend will be listed, and to sum to state level we need the more complex process, am I correct?

                     

                    [2] If all I need is sum to the State level and don't care for individual PC level sum, then without the complex process, just removing PC from the VizLoD gives me the state level totals, please see attached.

                     

                    [3] I am thinking, if you need both, i.e. PC level sum as well as State level sum, then we need the more complex process involving some magic.  This may be way over my head, I don't want to take up more time than I already have, please indulge me if it is not too time consuming.

                     

                    Thanks again, this is a great forum ....

                    • 7. Re: Primary and secondary data source in blending
                      Simon Runc

                      The more I think I understand the more I feel I am in over my head

                      ...so yes as you get deeper into (the rabbit hole of) Tableau, there are certain nuances

                       

                      Question 1 - Yes that's correct (before FIXED LoDs this was how we used to do that kind of thing). I think this is best demonstrated by the example in my blog

                       

                      As the Blend is on State & Year, we are getting the equivalent of {FIXED [State], [Year]: SUM([Population])}

                      Blend Solution

                       

                      If I remove the blend on State this is now the equivalent of {FIXED [Year]: SUM([Population])}

                      Remove State From Blend

                       

                      So in the second case, the blend is making a table (btw this is just a mental model!) which is

                       

                      SELECT

                      [Year],

                      SUM([Population])

                      GROUP_BY [Year]

                       

                      and then LEFT JOINING this aggregated table to the primary data source on [Year]...which is why the value is the same for every region.

                       

                      In the 1st example its

                      SELECT

                      [Year],

                      [State],

                      SUM([Population])

                      GROUP_BY [Year], [State]

                       

                      and then LEFT JOINING on [Year] and [State]

                       

                      So by duplicating the data-source, blending them, and changing the 'blend' fields...you can do some quite cool stuff (although LoDs replace a lot of these use-cases today)

                       

                      Question 2 - Yes exactly. Probably didn't create the best example! but really wanted to demonstrate how when a field is brought into the primary source, from the secondary, it is brought in as an Aggregated field, and so needs the [Postal Code] in the VizLoD to be able to detail out each state...which is when we then need to get funky with the Table Calcs.

                       

                      Not on this subject, per se, but I wrote a 'high' level mental model on the way I like to think about the different calcs in Tableau (particularly the on-canvas/off-canvas part) that might help (btw this isn't a comprehensive document on calculations!)

                      Answer - Quora

                       

                      Question 3 - So you can, in the way you have done it in Sheet 3, you can use State (from DS2) and Postal Code (from DS1), and thus avoid the 'head-hurting' table calc method. This is why I may not have given the best example! (I was really trying to demonstrate how a dimension from secondary, brought in as a field to the primary [State from DS2] comes in as an aggregate (ATTR, MIN, MAX for Strings).

                       

                      Hope that helps.

                      1 of 1 people found this helpful
                      • 8. Re: Primary and secondary data source in blending
                        Parthasarati Dileepan

                        I briefly browsed the post, I need to spend quality time reading and rereading it, which I will do later this week ... for now I want to express my appreciation for the generosity you have shown with your time and knowledge, thank you .... Dileep