Data Blending

 

 

 

We use the Data Blending process when data is located into multiple databases. It is Horizontal merging it means Data bases are having different columns apart from common column for define the relationship.

 

Example:

 

The following example illustrates, how to use the two data sources to build the Report.

 

 

 

 

 

 

 

 

 

 

 

 




1) Connect the Tableau with "ABC" database, Then connect to Database "XYZA" by clicking on "Connect to database"

http://1.bp.blogspot.com/-d9ejRrUtSHs/VNw64MjULmI/AAAAAAAAAJ0/7R35F6rMO0k/s1600/7.png



 



2) Once we connected to database 2, we have to define the relationship between 2 databases. Here Column "A" is common to define the Relationship. 3) Click on Data on menu bar, Click on "Edit Relationship", Then Select Primary & secondary databases then define the relationship between those 2 databases.

 

Now we can use 2 databases to build  the reports.Once we define the Relationship between 2 databases, Primary Database will be highlighted inBlue, Secondary in Orange color.

 

 

 

General Issues:

If relationship will be removed, then we get the nulls when use columns.

from 2 databases.

 

====================

General Issues - Data Blending


Solutions - for General Issue

Relationship

 

 

Define proper relation between two databases.

 

1) If we don't define the relationship for blending purpose, it will show "*" symbols.

 

104.png

 

2) If we define the Relation, that "*" will be removed.

105.png

 

 

========================================

 

 

If any calculation using Both Data sources in the Calculation, we have to use that both columns in the view hence we can hide if don't want to show.

 

Database 1 Data:

94.png

 

Database 2 Data:

 

95.png

 

After connected to Databases, I renamed as Database 1, Database 2.

 

Then i have created a calculation using two data sources fields.

 

91.png

 

If we use only 1 Column in the view even 2 are used to create calculation, it wont show the Data correctly since we are calling only 1 column from 1 DB onto View.

92.png

 

If we use columns from both data sources, it won't pass error since we are calling two fields from two DB.

 

93.png

 

Now we can find difference easily

 

89.png

 

===== ======

Some Additional Information.

Taking “X” from 1 data source, “C” from another data source. & I defined the relation  (Blending) using common column “A”.96.png

 

If I drag "X", "C" then it will pass error. IF i use "X", "A", "C" then it won't pass error.


Here were getting null values for  "Calculated" field since we are not using  referenced column of two both data sources.

 

97.png

 

 

Now there are null values  since we are using referential column in the report, that is "A".

 

98.png

 

99.png

 

 

If we want to hide "A", then right click on "A" then click on "Show Header".100.png

 

 

101.png

 

 

 

======================================================== ===========================

Data blending works like Left Outer Join (All records from Primary data source, only matched records from secondary data source.)

 

Lets take an example.

 

  Data source 1 ( I am using 2 different XL  workbooks)

CountrySales
C1100
C2200
C3300
C4400
C5500
C6600
C7700
C8800

 

 

Data source 2

 

 

CountryProfit
C8800
C9900
C101000
C111100
C121200
C131300
C141400
C151500

 

Country is the Common column to define the join between 2 database.

(Country is the Common record in terms of Country column)

1) I connected to the DB1.

68.png

 

2) Connected to DB2.

 

69.png

 

3) Now I defined DB1 as Primary, DB2 as Secondary.

70.png

4) If we drag Country from DB1, Country from DB2, DB1 Country shows all records & Only matched records from secondary  data source71.png.

5) If we drag Country from DB1, Country from DB2, DB2 (Primary) Country shows all records & Only matched records from DB1 (Secondary)  data source

 

72.png

 

=====================

Here we can't show all records from DB1, DB2 (Means all 15 Country's data at a time).

Means Full outer Join is not possible by doing the Data Blending.

 

===========================

 

If we want to show all 15 countries data on the view, we go for custom SQL.

 

 

=================================================================

Some times data in 2 databases may not having the same names ( Like USA in 1 Database , United States of America in another Database) , we have to edit the names & define the Aliases then it will match.

 

Procedure:

 

1.open a fresh workbook.

2.drag country to row shelf to get a list of countries.

3.right click on any country name and edit alias (at the bottom). change it

 

to match the target country name.

 

 

 

Example:

 

The problem is the countries are in different notations: One is in short form, the other is in long form.

 

Say, one is like this:

 

Country_Short
US
UK
CN
FR

 

The other is like:

 

Country
United States of America
United Kingdom
France
People's Republic of China

 

How to blend them? Are they blendable at all? The answer is affirmative, not without a little twist though. I am listing two ways of doing it. Assume both data sources are connected.

 

1. Create a calculated field [Country] based on [Country_Short]:

Using this newly created dimension [Country], you can blend with the other data source.

 

2. Add an alias to each country in the [Country] list of the second source.

 

- open a fresh worksheet
- drag [Country] to the row shelf
- right click on each country and click the last item of the context menu which is Edit Alias
- edit the alias of 'United States of America' to be US and so on
- make sure all the aliases are the corresponding names in the other table.


Either of the above lays the ground for blending the two data sources. We are ready to blend.

 

To blend naturally, rename the two country dimensions to be exactly the same. Tableau will try to link up the two dimensions automatically through intelligent guess.

 

If you want to keep the dimension names different and still want to blend the two tables,
- open menu Data>Edit Relationships
- link up the two dimensions there, using custom method to join the two data sources.

 

Note that for some data sources such as published data extracts, you are not allowed to edit aliases. Quite a weird restriction. The workaround is to duplicate the dimension in question, and work on the duplicate.

 

You may also right-click any dimension to edit the aliases for the whole filed.
In Tableau Desktop 8: Right-click Dimension>Default Properties>Aliases...
In Tableau Desktop 9: Right-click Dimension>Aliases...


=================================

Join VS Blending

 

 

Joining is the Better Approach than Blending when two tables located in the same Database & Same Schema, Since

1) If we implement Data Blending, Can't use the "Show Relevant Value option (Cascading Quick Filter),  If we use Join then it easier.

2) If we implement Data Blending, We can't apply the Global Filter when different reports from Different Databases, If we use Join then it easier.

3) Blending woks like Left Outer Join. Joining can support all kind joins (Inner, Left, Right,----)


4) Measure Names, Measure Values does not appear in the Second Database, So Joining is better in the approach.

5) We can't apply the Action from 1 report to another in the Dashboard when different report are built using different Databases.