1 2 Previous Next 17 Replies Latest reply on Nov 16, 2017 11:46 AM by Kristin Sanders

    Data Blend Missing Dimension

    Kristin Sanders

      Hello,

       

      I've set up a data blend of two excel sheets - one that has activity participation and another that has our customer database.  I'm using the customer ID for the relationship between the two data sources.

       

      I've viewed each data source separately to see the total participation by status (completed, in progress, not started) and total customers by customer type (member, nonmember, student). 

       

      I blended the data sources with the participation info as the primary source and the customer database as the secondary and I'm not seeing the student customer type that was reported in the separate total customers report.

       

      When I flip data sources, and have the customer database as the primary source and the participation info as the secondary, I get two new status of null and * which have a large number of records, students is reported as a null participation status.

       

      Neither of the blended sheets match the original numbers from the total participation by status sheet.  I thought that my sales info excel should be the primary sheet because it would have all activity participation - including if a customer participated in multiples activities.

       

      I'm not sure what I'm doing wrong here.

       

      Thanks,

      Kristin

        • 1. Re: Data Blend Missing Dimension
          Deepak Rai

          Kristin, Can you join your data set instead of blending?

          • 2. Re: Data Blend Missing Dimension
            Kristin Sanders

            When I join the data the individual reports do not match the original excels - the blended data does.

             

            I believe that the joined reports may be double counting when the customer database information is pulled in to the participation info.

            • 3. Re: Data Blend Missing Dimension
              arvindgarg

              Is it possible to share packaged workbook?

              • 4. Re: Data Blend Missing Dimension
                Deepak Rai

                Kristin,

                Please create fake datasources similar to original and attach here.

                Thanks

                Deepak

                • 5. Re: Data Blend Missing Dimension
                  Manish Mehra

                  Hi Kristin,

                  Without looking at the actual data, I can give you few suggestions.

                   

                  1. Blending is like left outer join so you'll get only values from your primary source.

                  2. For * and Null values, the issue may lie in the data granularities and fields used in the report. You can refer Troubleshoot Data Blending and troubleshoot issue (usually * happens if the view has different granularities)

                  3. You can put the data in different sheets in an excel file and create simple joins.

                  4. If you are using the latest version, one can join data from different sources.

                   

                  Hope it helps.

                  • 6. Re: Data Blend Missing Dimension
                    Kristin Sanders

                    Attached are samples of the two excel files and TWB file.

                     

                    The Participation Info Primary Blend sheet is not showing as it had with the full data but what does make sense since the totals are adding up to the overall complete, in progress and not started numbers.  What does not make sense to me here is why the 'Student Member' Member Type from the Participants by MEM Type sheet is not showing here and why the Member Types that are showing do not add up to the total number in the Participants by MEM Type sheet.

                     

                    When I flipped the data blend to the Customer Database Primary Blend I am seeing those 21 Student Members show up.  The Status totals are not aligning to the Participants by Status, sheet however the Member Types are totaling and matching.

                     

                    I know that there are multiple activities for some customers and that only customers who have good contact information (can be reached via email) are in the customer database.  Am I missing something or reading these wrong as I'm trying to blend these two sheets?  The goal is to know who is taking the activities to be able to create customer profiles.

                    • 7. Re: Data Blend Missing Dimension
                      Kristin Sanders

                      I think you needed a .twbx file - see attached.

                      • 8. Re: Data Blend Missing Dimension
                        arvindgarg

                        Please see attached sheet. I have used left outer join with customer database sheet.

                        You can also see student member.

                        you can achieve similar using data blending.

                        1. You should count distinct ID Column ( Participant GUID  or LMS PersonID ), not "Number of records" column

                        2. Also the big problem, why don't you see 21 student members. It is because - in customer database student id are in lower case, while in participant they are in upper case. So the relationship is not happening. See below screenshot. If you want to do data blend create a calculated column - upper ( LMS PersonID). then do blending using that column.

                         

                         

                        Let me know if you need further explanation.

                         

                        • 9. Re: Data Blend Missing Dimension
                          Kristin Sanders

                          Thank you for looking into this and providing those examples!

                           

                          The capitalization nuance is something that was never on my radar before so it's very helpful to know to look for that when blending.  Also, very helpful to know that I should be using Count Distinct in some of these sheets.

                           

                          So do you recommend to join the data vs. blend it?  There are more personalized fields from the customer database that I need to look at so I wasn't sure if the best approach would be to blend the data or join it.  I was originally thinking to blend it since I'd want to look a few fields from each excel for analysis.

                          • 10. Re: Data Blend Missing Dimension
                            arvindgarg

                            blending Vs join - It depends on database size, your reporting requirements etc.You can find many articles online about this.

                             

                            After the release of Tableau 10, I prefer to use join then blending.

                             

                            If I have same primary data source all my worksheets. Then I use database join. If I need to change primary and secondary data sources for different worksheets. Then I will opt for data blending.

                             

                            I hope, I have answered your question. If so, Can you please close this thread.

                            • 11. Re: Data Blend Missing Dimension
                              Kristin Sanders

                              OK that makes sense.  Thanks again for your help!

                              • 12. Re: Data Blend Missing Dimension
                                Kristin Sanders

                                Hi Arvindgarg,

                                 

                                I have one more question.  When I use the full data set I'm getting * as a Status category again. 

                                 

                                I'm sure it must be something with my data like the upper/lowercase differences.  I've cleaned up both excels to remove the extra data columns that I do not need and I'm still seeing the asterisk - to my eye I'm not seeing anything that jumps out like the lower/upper case differences.  

                                 

                                Would you have a guess for something that I should be looking for or is that too hard to guess at without the full data?

                                 

                                Thanks,

                                Kristin

                                • 13. Re: Data Blend Missing Dimension
                                  Manish Mehra

                                  It means that status has multiple values for that join criteria/instance.

                                  Scenario: If you have name field in primary table and sales, name and status fields in secondary and you have to blend the data.

                                   

                                  Asterix will show up if view has Name|Status - Sales and there are two different values for status for a single name. e.g. - If there is one Andrew in primary table and 2 instances or more of Andrew in secondary table with different values for status(married /single) then the view will show as Andrew|* - Sales

                                   

                                  Get Outlook for iOS<https://aka.ms/o0ukef>

                                  • 14. Re: Data Blend Missing Dimension
                                    Kristin Sanders

                                    Hi Manish,

                                     

                                    That definition and scenario make sense, however I may be even more confused now when looking at my data.

                                     

                                    The sample excels that I had previously uploaded stripping out the personalization were worked on and returned the categories without the additional asterisk or null fields that I was originally seeing.  When I added back in the additional columns that were removed the asterisk came back which is why I was thinking there was some formatting with my data.

                                     

                                    To your point, it would make sense if Kristin Sanders both completed an activity and was in progress for another and therefore an asterisk would be my status.  But if that's the case then why would my sample that was worked on not also have the asterisk values?  I would think my customer ID would have been in there twice which would have triggered the asterisk.   

                                     

                                    So then your point makes me also wonder, if I'd be able to have my three clean status categories since how would Tableau know where to put a user who fell into multiple categories.  I'm confused how the sample that was returned had the correct unique number of each member type in each of the three categories.  Maybe I need to look at my data differently.

                                    1 2 Previous Next