Skip navigation

Akriti Lal's Blog

September 2016 Previous month Next month

In this exercise we will try to visit the Analysis that Tableau help us Quickly Analyze for any particular data set. Again we use “Sample Superstore” as everyone is acquainted it with, but these analyses can be very well done with any other data set and tableau gives a quick picture regarding the data in no time. So let’s get started.

If you like the article you can put a comment below and if you have any suggestion / feedback send me an email at or drop a message at my linkedin profile .

As soon as we connect Tableau to our Data Source below are the operations I prefer doing in order to get a Quick Picture.

1.Measure Name & Measure Value: 

Double click on “Measure Values” in Measures.  Make sure to convert any Non Measure into Dimension before doing this i.e. Say you have a field Item Id, since it is a number, Tableau will keep it in Measures implying it will do the different level of aggregations to it. However, “Item Id” is the unique identifier and you do not need to do any calculations on it, so this needs to be converted as Dimension.

The moment you click on Measure Values, you get a bar graph of Measure Values with Measure Names I.e. Discount, Profit, Quantity, Sales and Number of Records. This allows you to have a Quick Summary regarding the data set that you have.

So here you can see that you have given a discount of 1561, there are total 9994 records in your data set, a total profit of 286,397 has been made from a total sale 2,297,201 of 37,873 items.

So you get a Summary of the Data that you have.

2.Number of Records: Next in a fresh sheet you can double click on the “Number of Records”- that Tableau Automatically creates as a Measures. This gives you the Number of Record that is present in Excel.

Depending on the uniqueness of the data, the number of records can also give other information, for e.g. if our data set is such that we have a Distinct Customer Name in each record then the Number of records would give the Number of Customer, similarly if our Data set is such that each records give the Distinct Number of Order, then number of records give the total number of Orders Placed.  So in that case we can even rename “Number of Records” as “Customer Count” or “Order Count” etc.

That being said in order to understand what the “Number of Records” is actually representing we need to understand the Granularity of our Data. Sample Superstore gives a good opportunity to explore the different level of granularity that real world data could be and i really believe that we need to spend sufficient time during these initial phase of Data Understanding , Data Mapping and Data Designing in order to get the Visualizations correctly from tableau and the way we desire.

3. Understanding the Granularity of the Data : Validate if “Order Id + Product Id” makes unique combination  : 

Can you find out what does each row in Sample Super Store signifies? Does this mean that each row is Unique for each Customer or Does each Row tells you Distinct Order Id or if the Order Id is getting Duplicated on what grounds??? Spend some time on this

To validate this, make a calculated field “OrderId + Product Id” as below and add “Number of Records” , and you will see that still this is not unique and for some lines we have more than 1 same product id per order


If you click on one of the “Distinct Product & Order” and right click and see the details in Full Data, you will see that the Order Id and Product Id combination is unique, but for the same Order id, the same product was asked in different Quantity or in other words for the same Order id, the same Product was initially asked to give 9 quantities and then said, give me 6 more (which is a valid scenario) .


So after analyzing the sample super store data we can say that each row signifies that it has Distinct Order Id for Distinct Product. So essentially the super store dataset shows the total count of Order and Product.

So the Question for this Analysis could be, how many times a Customer Ordered a certain quantity of Product Id initially and then later changed his mind and asked to Add additional quantities of that Product again for e.g. You called Pizza Hut or Domino’s Pizza and first asked to give 6 Corn Pizza, 1 Garlic Bread, 1 Pizza Burger and then you suddenly realized that you would need additional 9 Corn Pizza more. so this is the scenario in which you will have this data set.



Basically the Question that we can also Ask is

“Which Product has been Ordered the maximum Number of times in any Single Order”?


“We want to award the Customer that has placed an order for maximum quantity of any single product in One Single Order. Name the customer?”


The Answer is Customer “Sanjit Jacobs” on 16th March 2014, placed an order for 16 Xerox 1964. This is the single order with maximum quantity of any one product. Also while Ordering he first said give 14 Xerox and then later asked to add 2 more of them.And the solution for this can use Combined Field to create Unique combination of “OrderId and Product Id” and then check for the Quantity.

The reason we did spend so much of time understanding the Granularity of data is, I believe if you get to understand the granularity of your data set you are half done. This helps you in Data Understanding, Data Mapping and Data Designing. And if you get these 3 things done in the right way, getting the Desired Visualization becomes relatively very easy. 

Once we are have a clear picture about the different Measures in our data set , the number of records , the granularity of the data and what the number of records actually represents we are all set to go further and explore the different possible insights that the data has to present. We will deal with these Data Insights in our subsequent blogs using Table Calculations and Calculated Field.

So we can see and appreciate the Beauty of Tableau 10.0 . There are many more cool features in Tableau 10.0 which we will keep visiting one by one . For that keep watching this space and if there
is any topic that you would like to be covered next , feel free to drop a message in the comment below .

If you liked the article you can put a comment below and if you have any suggestion / feedback send me an email at or visit my linkedin profile at .

Tableau 10.0 Beta version talked about wildcard union in Tableau 10.0. However, the official release of Tableau 10.0 does not have the wild card Union. Tableau 10.0 “Union” has the same union functionality that was introduced in Tableau 9.3. So probably we can expect wildcard union in the future version of Tableau 10.x release.

Use Case for Tableau Union introduced in Tableau 9.3


So Today we would like to Quickly go through a use case explaining the “Union” introduced in Tableau 9.3. Once again we take the “Sample Super Store” data set and make changes accordingly so as to understand the concept. The Reason we always take “Sample Super Store” to explain the concept is, everyone is familiar with the data set so we do not waste our time exploring and explaining the Data.

In order to understand “Union” in the easiest way, let us consider the scenario that rather than having all the Customer related Order data in “Orders” tab of sample Super store, say we have multiple sheets containing the details about the Orders of each customer.

For e.g.  the below excel has different sheet with the name of customers and the different sheets contains the details about the Customers.  So the Data is not all present in one file rather the data is split into Little files and you need to do the analysis. Tableau Union helps you to piece together data that has been split into little files.

  First Sheet contains “Orders Data for Customer Aaron Bergman”


Second sheet contains “Orders” data for customer “Aaron Hawkins”.


Now in such scenario, if we want to do the analysis on complete data set , Union comes as a Handy tool introduced in Tableau 9.3

Once you connect to Super Store Customer Data from Tableau 10.0 , you will have the below screen

You cannot create a Join here because it is the same Data for different customers. So Before Tableau 9.3, you would have to copy and paste all the related Customer Data in a single excel sheet to do your analysis. But now you can just use the “union” and drag and drop all the different sheets to do your analysis.

So Double Click on “New Union” and using shift button select all the customer orders data and drag it to the Union window



Click on “Ok” and now you will have Data from all the different customer excel sheet at one place to do your analysis.

One more thing that needs to be noted here is at the end of all the columns, you can see a column that Tableau created “Table Name”, this essentially contains the name of the file from which each row originated.

So Now you can go to your worksheet and do your analysis as if this was one single file without any copy paste workaround.

This the beauty of Tableau. Keep Enjoying and mail me at or you can reach out to me at

K-means Clustering in Tableau 10.0


K-means algorithm is used for Clustering in Tableau 10.0. Let us understand the Definition of “Clustering” and some details about the algorithm “K-means” in its simplest form so that we clearly understand what we are trying to achieve.

Clustering is the partitioning of a data set into subsets (clusters), so that the data in each subset share some common trait.

K-means: For a given number of clusters say “K” the algorithm partitions the data into “K” clusters. Each Cluster has a center (centroid) that is the mean value of all the points in that cluster.

If you are in interested in understanding the inner mechanism of the details of the K-means algorithm that Tableau uses, you can go through the use guide. In its simplest form you can understand that Tableau uses the Calinski-Harabasz criterion to assess cluster quality. The greater the value of this ratio, the better the cluster. So If a user does not specify the number of clusters, Tableau picks the number of clusters corresponding to the first local maximum of the Calinski-Harabasz index. By Default, k-means will be run for up to 25 clusters if the first local maximum of the index is not reached for a smaller value of k. The users can set value to a maximum of 50 for the number of cluster.

Hands On Example:

Enough of theory now, we will do what we enjoy most i.e.  Hands on with Clustering in Tableau 10.0. We will try to create a use case with sample superstore as we all know this data very well and we do not want to waste time understanding a new data set. That being said the example that we go though is easily implementable on any other data set as the concept remains same.


With sample super store data set, we want to do some quick resource planning. We basically want to figure out how many Sales Person do we need to place for the Sales Territory that we define. So in order to define Sales Territory we need to found out the cluster of states that share some common trait and based on which we can plan our resources.

In this example the feature that define the common trait would be the “Total Number of Customer” and the “Total Quantity Being Sold"

We already have the measure “Quantity”. In order to know the total number of distinct Customer, we create a calculated field “CustomerCount”



Now we drag the “States” on the Map and drag “Customer Count” and “Quantity” to the “details” marks

Next go to the Analytics (tab) next to “Data” and drag and Drop “Clusters” to the visualization.

[ ]

This will automatically pop up the below dialogue box with the variable “Sum(Quantity)” and “Agg(CustomerCount)” in the Variables [ remember we had put these two fields in details] and these variables are being used by Tableau to compute clusters. We can add additional variables to it, if we want more of them to be used for computing cluster.

Also in the background Tableau creates 5 clusters on the “color” shelf and mark the different states with different color accordingly. Please note since user has not specified the number of clusters, Tableau picks the number of clusters as -5 corresponding to the first local maximum of the Calinski-Harabasz index.

If we want to identify more number of clusters in our data, we can provide a value based on our requirement and less than 50. We will leave it to 5 for now for our further analysis.

Also, in this example I have used “Quantity” and “Customer Count” as variable to compute the cluster. There is no guarantee that these are the ideal fields to be selected. Clustering is an iterative process of Analytics I.e. Experimentation leads to Discovery leading to more experimentation.

So we get the below visualization on point Map for the 5 Clusters that Tableau Identified based on the Quantity Ordered and the Number of Customers variable.

We can change it to “Filled Map” as my personal favorite and put Clusters on Label to identify the states belonging to different cluster.


To do further analysis on the Cluster that tableau provided, you can generate a cross tab of the data and finalize if it suits your need or you would like to make some changes.


So as we can see California is kept singly in “Cluster2”  ,while “New York” and “Texas” are placed in “Cluster4” . Based on our analysis if we decided , it would be better to have 7 Clusters then we can go ahead and edit the “Cluster”


Click on the Clusters on the “Color Marks” shelf and select “edit clusters”


Enter Number of Clusters as “7” and you can see 7 Clusters getting created in the background. Click the “X” button remove the Clusters screen.

And you can see 7 clusters now

We can do further analysis to finalize the cluster structure. Once this gets finalize we can create these 7 Clusters as “Custom Territory on Map” (Another new feature Introduced in tableau 10.0).


Convert “Cluster” into “Custom Sales Territory”

Drag and drop the Cluster from Marks Sheet to Dimensions

You can rename this as “Custom Cluster State” and Drag and Drop it on Marks shelf replacing the earlier cluster.


After that remove the “state” field from the Marks and you get the Custom Sales Territory define as per the K-Means Cluster that you identified.



Now you can use these custom sales territory for your resource planning and also have a quick look on the sales / profit that happened across these clusters to further strengthen your analysis.


This is the beauty of Tableau . So keep enjoying it and put your comments below if you liked the article. Also Feel free to send me an email at or reach out to me at

Working With Custom Territories on a Map In Tableau 10.0


Tableau 10.0 gives you the feature of working on your own Custom Territory on Geographic Map.

There can be several use case for this and one of the simplest example can be that for Our Resource Planning, we have our custom Geographic Area created other than the one that actually exist and want to visualize that on the Map.

So for example, Our Business has different level of penetration in Different States of US and so we need to plan the resource accordingly. For e.g. “California, Oregon and Washington” is treated as a single entity and we call this as “4D” and “Florida, Michigan & Ohio” is treated as another entity and we call this as “6F” and want to visualize it on Map.Similarly, we can group together all other different states and create Custom Territories for our Analysis.

Custom Territories Can be confused with Geographical Groups, so we will do visualization both with Tableau 9.3 and Tableau 10.0 so that we can understand the advantage that Tableau 10.0 brings and how it was previously handled .

Based on the Use Case Defined above we group all the different states of US and name it as Custom Territory.  So as shown below we have a Group “Custom Territory” created as shown below:

Now if you drag the CustomTerritory on the map in Tableau 9.3 , the Geographical Map will not be able to identify it and you will see “10 Unknown” at the right bottom of the map.


If you go on to Edit locations , you will get the below screen . Tableau 9.3 is not able to identify "Custom Territory” and the Data within i.e. 1A , 2B, 3C …  Also since you can not attach Multiple matching location to Data , so you will not be able to attach 3 different states as Matching Location to 1A




If you do the same stuff in Tableau 10.0 i.e. drag and drop the Custom Territory on the map , Tableau 10.0 is able to identify the custom territory and able to show map as per the custom definition




To get a better view, you can drag and drop the Custom Territory to “Color” and “Label” and you can see Your Custom Geographic Location exactly the way you have defined.



Also if you go to Edit Locations for these in Tableau 10.0 , you will see matching correct US Locations , and the map showing the correct Custom  Territories.



There is another way in Tableau 9.3 that we have been creating Geographical Group.i.e. Drag And Drop State to the Map. There after Select “Washington, Oregaon and California” and click on Group Icon, and then select “Arizona , Idaho , Montano and 2 others” to group it . If we make it a filled map and also put on the Label we get the below visualization. The two grouped states are shown in color and the Others in “Other”. 


If you repeat the above step in “Tableau 10.0” you get similar results.( below screen shot , difference can be seen in the marks shelf for Colors and Text)


Now if you remove the original geographic location “State” from Marks shelf in Tableau 9.3 , you get the below i.e. The entire Geographic Location that we are considering in group is not shown in color now. The Group is sitting on the top of the base unit and Tableau is not able to Identify the remaining “Other” and is shown as “1 Unknown”.



Where is in Tableau 10.0 , if you remove the original geographic location “State” from Marks shelf , you will get the below visualization i.e. the custom Territory are distinctly shown



and if we want to rename these custom location as  “A” “B” “C” , go to edit groups , and rename it and you get the below



And if we want to Do Analysis for these 3 custom Territories say “Sales” or “Profit”, we can visualize it on map as usual.



So This is the Beauty of Tableau, and Keep Enjoying as Tableau continues improving the Life of Analyst by bringing in cool new features for your complex analysis. If you have any questions / concerns / feedback put down your comment below or feel free to mail me at or reach out to me at


Akriti Lal

Blending In Tableau 10.0

Posted by Akriti Lal Sep 4, 2016

                                                                       Blending In Tableau 10.0

Tableau 10.0 comes with the feature of “Cross Database Joins” that allows you to create Joins from Data fetched from different data source. We have already covered “Cross Database Joins in tableau 10.0” in our earlier post in this blog. If you have not gone through it I would suggest you to visit the post once before you go through this so as to make complete sense.

Once you go through cross database joins in Tableau 10.0, the obvious question that pops up in our mind is, are we done with Blend. Would we not be requiring Blend any further? Or If we still need blend what would be the scenarios in which we would still need to create Blend.

So we will go step by step and cover both Blend and Joins with a Hands On Example so that it all makes sense.

First of all, what is Blending? The definition from tableau guide “Data blending is a method for combining data that supplements a table of data from one data source with columns of data from a second data source.”

In Tableau we generally prefer Joins for this kind of Data combining, however for the Below two Scenarios we will still have to use “Blending”

  1. Data is at Different Level of Detail say for e.g. Actual Sales Data that we receive are on City Level while the Planned Sales Data is at State level. In such scenarios creating join would cause Duplicate data and since Blending works after aggregation, Blending would be required.
  2. If we want to combine Data from Different Data Source that is still not supported by cross Database Joins.


To have a hands on of the first scenario i.e. Data is at Different Level of Detail , we will go through the example of  Sample Super Store  Actual Sales are at City Level / Postal Code Level  , while the Planned Sales that we get is say at state level .

For this example, we already have the Sample Super Store Data where we have the Actual Sales on Postal Code Level. We will quickly create a Planned Sales Data at State Level so that we can create a Blend between the two data set and analyze the result.


Creating Planned Sales Data

Connect Tableau to Sample Super Store , and create a visualization of State and Actual Sales.



Export this data into excel sheet, and just to create some planned sales data, we will generate a random number using the rand () in excel, add 1 to it, and then multiply this to actual sales to generate the Planned Sales.

There after we can copy and paste only the value of planned sales in one column and remove the other columns to have the state and planned sales.


Planned Sales















District of Columbia








































New Hampshire


New Jersey


New Mexico


New York


North Carolina


North Dakota










Rhode Island


South Carolina


South Dakota














West Virginia







What we have done here has nothing to do with Cross Data base joins or Blend, we are just preparing the Planned Data set to simulate the real world scenario where in we can have the Planning done at State Level, while the Actual Sales that we would get would be at Postal Code level.

Now Our Boss wants us to do the analytics and show the Planned Sales for each State and the Actual Sales next to it, so as to compare how we did compare to our planning.Now if we join our tableau to the new “Planned Sales Data” created using Add button , joins get created

There after when we visualize the Planned Sales for each state

We can see that we are not getting the correct Data , like for e.g. if you see for Alabama the Planned Sales is coming as 1,311,902  , while the actual planning (can be seen in the planned sales excel) is only 21,507.  This means some duplication is happening. To find the root cause , if you drill down to the Data detail for Alabama , you can see the below


Actuallay what has happened since we have created join, the planned sales 21,506.59 has been duplicated for 61 times ( as many city as state Alabama has ) and so we are getting as 1,311,902  .


So In order to do our analysis correctly what we want to do is Aggregate the “Actual Sales” data at State Level, when we Blend it with Planned Sales which is at “State Level”. So this is an example of Data at different level of details and why we would need to use Blending rather than joining. In order to proceed further go to Data Tab and remove the “Planned Sales Per State” Join Condition.

Blending to Visualize the Actual Sales and Planned Sales at State level


In order to create Blend between Actual Sales and Planned Sales , follow the step below . Add the new Data source as below

You will be able to connect to Planned Sales excel as below


After creating this new connection, when you go to the sheet, you can see the new Data source added for creating Blend


Now we need to define the “Edit Relationship” in order to create the Blend

This opens the “Edit relationship” window that and here we can define the blend condition which would be “State” in our case

Now before visualizing the planend sales we need to validate if the Orange Link is enabled in the secondary sheet ( Planned Sales)

There after you can drag Planned Sales as below

So we can see, we get the right Planned Sales and Actual Sales now as Blend works after aggregation. So basically when Actual Sales was blended with Planned Sales, the Planned Sales for the entire state did not get duplicated rather was calculated at the state level of aggregation.

If you have any questions / concerns you can reach out to me at or reach out to me at

Concept of Tableau Blend and Tableau Join before Tableau 10.0 and The Easy way that Tableau handles it through cross Data Base join in Tableau 10.0

Cross Data Base Joins In Tableau 10.0

We Explain this concept through the classic Sample Super Store but in order to explain the Blend, join in Tableau 9.0 to 9.3, we have changed the three Sheets of Sample Super Store in Three different Excel. So rather than having 1 Excel Sheet having three sheets, we have 3 different Excel Here.



Now before Tableau 10.0, if you want to do an Analysis on these 3 Excel, you will have to create a Blend (In order to use Join, you will have to bring the data from the three Excel in one Excel sheet, only then you would be able to use join).

Using Blend to Connect to Different Excel ( Before Tableau 10.0 )

Open your Tableau Desktop and connect to SuperStore-Orders.xlsx . This creates the primary connection.


Thereafter click the Database+ button at the top , and add the secondary connection with the “SuperStore-Returns.xlsx”



Again press the Database plus button at top and connect with your “SuperStore-People.xlsx”


There after go to your Tableau “sheet” and you would see all your connection as below



Click on Orders as this is your Primary Connection and there after click on Data – Edit Relationship as below



You will see both the “Relationship” already created between primary and secondary based on the name . Validate it ,if it is correct. If it is not , click on custom and change the relationship.


Now suppose you want to see all the Orders, with the “Returns” and the Region “Person” responsible for each of the Orders.

So Drag Order Id from Order-SuperStore Orders. To get the Returns go on Return Sheet. Check if the Link on OrderId is always created (showing as Orange marked below), if not click to create it. And Drag “Returned” on Colors.


Now to see the People Associated, go to People, click on “Region” to highlight the “Orange” connection and there after drag the People on to the visualization.


Tableau treats 3 different Excel sheet as 3 different Data source so you have to create Blend in order to use the Data from them and we know that Blend is not a Pure Join.

If we wanted to use “Join” in case of “Blend” before Tableau 10.0 , one of the option was to copy the “Return” and “Person” from different excel to Different Sheet of the “Orders” excel . Experienced Excel users can create Macros so that the Data from different Excel shows on the same Excel in different Sheet.

Tableau 10.0 comes with cross Data base join feature, so for different Excel Tableau 10.0 allows you to create Joins. The same example that we dealt above is shown below step by step to be used in Tableau 10.0. This is shown for Different Excel sheet that is being considered as different Datasource by Tableau , however the same approach can be used for analysis if we have one of our Data in Excel and the Other in Microsoft SQL Server or MS Access Or Hadoop . This is essentially the Cross Database Join feature that Tableau 10.0 comes with and is really a great feature .

You can have a use case , say the Field Engineer captures the Data in Excel sheet regarding the Lead , and you want to compare the Excel Lead Data generated with the Customer Data already stored in MS Sql server database. So in all these scenario Tableau comes with a very handy feature of “Cross Database Joins” that allows you to directly create a Join between different Data Sources and do your Analysis Quickly.

Using Cross Database Join in Tableau 10.0

Open Your Tableau 10.0 Desktop and connect to “SuperStore-Orders” ( see screen below )


Thereafter in order to create a new connection click on “Add” ( marked in the screen above )  and connect to the Excel “SuperStore-Returns” . It automatically creates a join as shown below , you can change it to “Left Join” to see all Orders Data


Then you can click on “Add” again to create the new Join with “SuperStore-People” and default Inner Join is created that you can change to Left Join


Now you can go to sheet and work on these Different Data source with Join created . The same applies if the data instead of coming from different Excel , would have been 1 from Excel , 2nd from MS Sql Server and third from MS Access. Tableau 10.0 allows you to combine data with create cross database joins , the only pre-requisite is that the Data should be related.

In order to have a Quick Demo of  how Tableau will treat if we had the “Order” Details loaded in our Microsoft SQL Server and the Field people would have collected regarding the “Return” information in an Excel sheet and we would like Tableau to connect to both “Return.xls” and “Order” Table in Database.

So We have Microsoft SQL Server Management Studio that contains the Order Details in the table Orders$ in our TableauSuperStore database



And we have the Return Details collected by field people in the excel “SuperStore-Returns.xlsx”


We want to use Tableau 10.0 to create a Join between the Return Excel and the Orders$ table in the SQL Server Data base.

So First we connect to the SQL SERVER Express , give the data base name and user credentials and connect to Order table



There after we click on “Add” button and connect to “Return.xlsx” and Tableau directly creates a Cross Database Default Inner Join. we can change it to left Join to see all Orders Data.



There after go to worksheet and Do analytics on Related Data’s from two Different Data sources ( SQL Server and Excel ) and do the analytics very easily as if they are at the same source.

This is the Beauty of Tableau 10.0 . There are many more cool features in Tableau 10.0 which we will keep visiting one by one . For that keep watching this space and if there is any topick that you would like to be covered next , feel free to drop a message in the comment below .

If you liked the article you can put a comment below and if you have any suggestion / feedback send me an email at Or contact me at