7 Replies Latest reply on Oct 31, 2015 4:54 AM by jonathan.o'grady.0

# Follow customer behaviour with Sankey

Dear fellow Tableau fans!

I am throwing myself on the benevolence of the community looking for some help to put together my first Sankey diagram.

There is quite a bit on these pages regarding assembling such a diagram, but I seem to be missing a step and have got myself into a bit of a muddle… In short, I'm stuck!

Please find attached a packaged workbook of retail data.

I have customer orders by retailer, by category and subcategory, by date.

I would like to analyse how customers switch between retailers within a category/subcategory over two periods.

At this stage, I'm only interested in the number of customers, not how much they spent. So for example:

In period one, customer 123 records a transaction with retailer A in category 1

In the contrast in period, the same customer records transactions in the same category, but this time with both retailer A and retailer B

In my Sankey, there will be a stacked bar of all customers by retailer on the left-hand side, representing period 1. There will then be a stacked bar of all customers by retailer on the right-hand side, representing period 2.

In the above example, customer 123's order is recorded as a single unit with retailer a on the left-hand side, which is then split into two half units on the right-hand side - one half unit flowing to retailer A and the second half unit flowing to retailer B.

Likewise if a customer purchases a subcategory from two retailers in the first period, but only from one retailer in the second period, this will be two half units on the left-hand side flowing into one full unit on the right inside.

In this way, if you start with 1000 customer orders  on the left-hand side, you should end up with 1000 customer orders on the right-hand side. The only difference will be how they are apportioned to each retailer.

In aggregate, this should display customers who first shopped with retailer A and continue to shop with retailer A, or if they switch to other retailers.

I would like to try and do as much of this in tableau as possible, as the ultimate point of delivery will be served to stakeholders via tableau online, updating daily.

In the attached workbook you will see that I have created dynamic sets to capture customers by period. This appeared to be working well, but unfortunately there are some periods where, at the subcategory level, a customer does not purchase in that subcategory in both periods. Try as I may, I seem unable to filter out these instances, resulting in uneven stacked bars.

And all of this before I even get to the complicated curved lines in the middle!

I know that this will be a very effective visualisation (know? Hope!), but I'm going to need a small bit of help from some of the more experienced users here…

As always, very grateful for all suggestions. Pint of Guinness in Dublin on me for the most helpful answer!

Best wishes,

Jonathan

• ###### 1. Re: Follow customer behaviour with Sankey

Jonathan,

the calculations required were a bit tricky, so I please find below the workbook with the edits.

In a nutshell, the technique is to use level of details (see below for full reference on this very interesting topic) to compute the aggregates at different levels (per customer, then per customer per retail).

Understanding Level of Detail (LOD) Expressions | Tableau Software

I also took the opportunity to correct the period calculations: the "- 7" were applied to the week numbers, which mean the 7 last week (and not the days). Remember to count the closing parenthesis!

Regards

1 of 1 people found this helpful
• ###### 2. Re: Follow customer behaviour with Sankey

Romain,

First of all, thank you very much for looking into this. I'm immensely grateful.

So you have managed to balance up the left-hand side and the right-hand side, but do you have any idea how I can calculate the polygon lines connecting both sides? This is the bit of the Sankey I just can't seem to work out from the descriptions on this form…

Best wishes,

Jonathan

• ###### 3. Re: Follow customer behaviour with Sankey

Dear all,

Reaching out to the tableau community to see if anybody could give me some pointers as to how to create the "Sankey flows" in the above packaged workbook?

We have worked out the "pillars" that go on the left-hand side and right-hand side, but I am unable to understand how to set up the data to make the flows work.

Would really appreciate some guidance please.

Much appreciated,

Jonathan

• ###### 4. Re: Follow customer behaviour with Sankey

Hi Jonathan,

I think you have started the wrong way. You should first build the flows before the bars. At this point, you miss the data prep :

• you need to build the densification : this will give you the shape for your flows
• You should rather reshape you data this way :
• Column 1 : Customer ID
• Column 2 : Retailer in T0
• Column 3 : Retailer in T1…

Such a reshaping will be easier for the Sankey drawing even if I am quite sure that you may succeed with the current data format.

All the steps are described in these posts :

Sankey diagram made of dynamically generated polygons

Decision trees, flow diagrams, sankeys in Tableau... here is a solution !!!

Let me know if this helps !

Olivier

1 of 1 people found this helpful
• ###### 5. Re: Follow customer behaviour with Sankey

Dear Olivier and others,

Your help is being most beneficial, and I have made some progress making my first Sankey… But I'm not there yet!

As you can see from the attached workbook, I'm getting close, but I am making a silly mistake somewhere.

Would really appreciate any kind assistance - I'm going to beat this one way or another!

Best wishes,

Jonathan

• ###### 6. Re: Follow customer behaviour with Sankey

I haven't had a chance to dig into the 42M rows, the multi-layered structure and the various LOD calculations used here. This is quite complex.

However, the calculation used in all of the Curve A calculated fields doesn't look correct.

You have:

[T  -1 position]+([T0 position]-[T0 position])*ATTR([Sigmoid])

Check the instructions here:

[2012 Ranking] + (([2013 Ranking]-[2012 Ranking])*[Sigmoid Function])

This translates to:

[Position 1] + (([Position 2] - [Position 1]) * [Sigmoid Function])

I'll try to dig into this more and I'll let you know what I see.

Jeff

1 of 1 people found this helpful
• ###### 7. Re: Follow customer behaviour with Sankey

Jeff and team,

Thank you for your very helpful comments. You are correct, I have invested a lot of time and this… Too much time! But I'm determined to get it over the line. I know once I figure it out it will be something I can use again and again and again.

I have a go at designing the very simple model first, and see if that gets me closer to the line.

In the meantime, if you have any other helpful comments or suggestions, do please let me know! I would be eternally grateful.

Best wishes,

Jonathan