Skip navigation
2016

Purchase Analysis

We have done various analysis from time to time to understand the different concept in Tableau. This time we will do the Purchase analysis to understand the concept of LOD.

Question:   How many Customers waited for How many Quarters to make their second purchase?

Let’s move ahead to analyze this question through Tableau.

 

Step-1:

It is very easy to analyze the various dates on which a Customer has ordered. All we have to do is Drag Customer ID, Customer Name to the Rows shelf from the Data Pane. Next, place Order Date to the Rows shelf and right click it to change the date type in the following way:

The result will be as follows:

The above viz tells us all the days when a customer has ordered. But things get tough when we want to analyze the First order date of Customers & not all the Dates or days.

 

Step-2:

Moving ahead, we want to analyze the First Order Day of each Customer. This can be done through LOD  (level of detail) expression.

Create a LOD expression by opening a calculated field with the following formula:

{FIXED [Customer ID] : MIN([Order Date])}

This formula will give us the first date of Order of every Customer as we are Fixing the Customer Id on the basis of Minimum Order Date. We get a new field First Purchase in the Data pane Dimension section.

Now, let’s make our viz  to show the FIRST Order Date of every Customer. So, drag Customer Id, Customer Name & First Purchase to the Rows Shelf. The result will be as follows:

 

Note: we need to change the DAY(First Purchase) in the same way as we changed the Order Date in Step- 1.

Step- 3:

Make another Calculated Field “ Repeated Purchase” with the following formula:

iif([Order Date]>[First Purchase ],[Order Date],null)

This will give the Repeated Purchase date of all the Customer leaving out the First Purchase Date. We get a new Data field in the Dimension section of the Data Pane “Repeated Purchase”.

Let’s add this new field to the Viz we made earlier. Drag Repeated Purchase to the Rows Shelf in the Viz next to Day(First Purchase). The result will be as below:

Note: we need to change the DAY(Repeated Purchase) in the same way as we changed the Order Date in Step- 1.

Step- 4:

Let’s make another calculated field “Second Purchase” to analyze the Second purchase date of every customer with the following formula:

{FIXED [Customer ID] : MIN([Repeated Purchase])}

We get a new field in the data pane Dimension section as “ Second Purchase”.

Let’s drag it to the previous prepared Viz to get the Second Purchase date of every Customer. Remove the Repeated Purchase field from the Rows Shelf. The result will be as below:

 

This Viz gives us the detail of the First & Second Purchase date of every Customer along with their Name & ID.

Note: we need to change the DAY(Second Purchase) in the same way as we changed the Order Date in Step- 1.

 

After we have achieved the first purchase date and Second purchase date, we have to find out the difference in Quarters between the 2 Purchase dates if any.

For this we make a calculated field- Quarters Btwn Repeated Purchase.

(I know its becoming little confusing to understand that why are we making so many Calculated fields? But please remember, this is advance level and attaining the answers to our Analytical Questions are not always easy. Please follow each and every step and send some time to understand each step before going on to the next step n I believe it will not be tough.)

 

Step- 5:

We make a calculated field “Quarters Btwn Repeated Purchase” to find out the difference in Quarters between the First & the Second Purchase. The formula that we use is:

DATEDIFF('quarter',[First Purchase ],[Second Purchase ])

We get a new field in the Dimension section of our Data pane as “ Quarters Btwn Repeated Purchase”. Let’s use this in our Viz by Dragging this field from the data Pane to the Rows Shelf and dropping it over Day(Repeated Purchase). The Result will be as below:

So now, our Viz is Showing us the Customer Id, Customer Name, the First & the Second Purchase date of the Customers and the difference in the Quarters between the First and the Second Purchase date.

Finally, we have attained the answer to a part of our Question i.e what is the difference in Quarters between the Repeated Purchase or rather how much difference is there between the two Orders by a Customer. But our Complete question was: How many Customers waited for How many Quarters to make their second purchase? So, now, we have to find out How many Customers waited for How many Quarters.

 

Step- 6:

  1. Drag Quarters Between Repeated Purchase from the data Pane to the Column Shelf. The result Viz will give us the Values from NULL to 15. Right Click on Null and Choose Exclude.

  1. Drag First Purchase from the Data Pane to the Rows Shelf. The result will be: Years(First Purchase). From the + drill down on Years(First Purchase) get Quarters(First Purchase) and remove Years(First Purchase) from the viz. The result will be as below:

Note: we need to change the QUARTER(First Purchase) in the same way as we changed the Order Date in Step- 1.

  1. Drag Customer Id from Data Pane to Color shelf in Marks Card. You will get a **** up. Choose Add all members:

The Result will be as below:But we want Distinct Count of Customers and not each Customer Individually. So Double Click on the Customer Id in the color shelf and take the cursor to the start and type CountD([Customer Id)] and hit EnterNote: Please take care of the parenthesis. You can also double click on Customer Id that is in the Color Shelf & erase it by Backspace from the keyboard and Type CountD([Customer Id)] and hit Enter from the keyboard.The Result will be as below:

  1. Now the formatting part: Click on Tooltip and make the following Changes

 

Add  Point Annotation to the Viz to finally give the finished look. And we have the answer to our Question:

If we want to do a comparative Analysis of specific versus total in the same visualization, we can make use of LOD as below.

For the better understanding of the situation analyze this from the sample superstore data set i.e. We have to analyze the Sales of our various Sub Categories in all the 4 region and also their Sales in the Selected regions.

The End Result that we are expecting is as below:

For corporate training contact at akritipurbey@gmail.com  or you can reach out to me at my linked in profile https://in.linkedin.com/in/akritipurbey

 

So The Question that we are seeking an Answer is as below

Question: How much Sales is Phones making in all the 4 Regions i.e. East, West, North, South and what is the Comparative Sales of Phones in Central & East as compared to all the 4 Regions.

Step-1:

Lets’ first make a viz with Region and Sales in Column & Sub Categories in Rows. Usually, this is what we will make when we are asked about the Sales of all the Sub Category in various Regions, right? So. The result will be like image below:

 

Now, let’s analyze the second part of our question: The Sales of various Sub Category in the Selected region- Center & East.

So, we’ll drop the Region to Filters and choose Center & East. The result will be as below:

 

But, by doing this we have missed the first part of the question: The Sales of all the Sub Category in all the 4 Region. Remember we have to analyze: How much Sales is Phones making in all the 4 Regions i.e. East, West, North, South and what is the Comparative Sales of Phones in Central & East as compared to all the 4 Regions.

That means we want a Dual Axis chart showing Sales in various Region and also in Selected Region. So, let’s move to Step 2.

 

Step-2:

Open a Calculated Field and name it ‘Sub Category Sales LOD’ with the following formula:

So, what exactly are we trying to do here is: we are Fixing the Sales of Sub Categories by this LOD Calculation. That means that whenever we use this ‘Sub Category Sales LOD’ field any filters will not effect the value of Sales as it is Fixed. So, it will give the total Sales of all the Sub Category irrespective of any filters.

 

Step-3:

Let’s make a viz using this LOD Calculation and see what happens.

In a new sheet, Drag Sub category Sales LOD & sales to Column and Sub category to Rows. Put Regions to Filter shelf and choose Center & East.

If you notice the Marks shelf you notice that there are 3 marks cards: All, Sum(Sub Category Sales LOD), Sum(Sales). Put Sub Category Sales LOD from the data pane to the labels in Sum(Sub Category Sales LOD) marks shelf and Sales from the data pane to the labels of Sum(Sales). The result will be as below:

 

Step-4:

Now Right click on Sum(Sales) on the Column shelf to open a drop down and choose dual axis. Then from All in the Marks shelf choose Bars from Automatic drop down.

 

From pane click on T to remove Labels.The result will be as below:

Also Right click on the Axis at top and select “Synchronize Axis” to get both the Axis synchronized and see the Comparison clearly as below:

Also right click on the Top Axis and de-select “show header” to remove it ,if you would like to show only one axis after the Axis are synchronized.

 

 

Step- 5:

The final formatting: drop Sales and Sub Category Sales LOD from the data pane to the Detail Shelf of All card, Click on Tooltip and make the following changes in the tooltip dialogue box.

So This completes our comparative Analysis and in one single visualization, we are able to see the sales of the selected region and also the total sales. And we finally get the answer to our Question:

 

or you can reach out to me at  akritipurbey@gmail.com   or visit my linked in profile at https://in.linkedin.com/in/akritipurbey