23 Replies. Latest reply on Apr 17, 2018 8:10 PM by Aung Myat

    'Simple' Sankey Diagram

    mohsin.khan.1

      Hello Tableauers,

       

      I've recently discovered the brilliant blogs/tips on creating Sankey diagrams, and have successfully replicated the process to create my own. However, I want to make some seemingly simple tweaks to my diagrams, and I keep stumbling along the way.

       

      The main technique I've seen treats the left-hand and right-hand bars essentially as stacked bar charts, with each segment sized by its percentage of the total. What I want instead is a very simple version with equal-sized segments, i.e. my brand 1s on the left and brand 2s on the right, with the lines running from left to right between (almost) fixed points rather than points calculated from the % of the totals. This would be easier to read. Taking it one step further, I actually want to be able to isolate brand 1 on the left to show just one brand at a time, in order to see its relationships with brand 2 on the right. That would look something like the drawing below (but better!).

       

      I've attached a packaged workbook to demonstrate what I have done, and hopefully the above helps to explain what I actually want to do. Any help on this would be great!

       

      Thanks!

       

        • 1. Re: 'Simple' Sankey Diagram
          Rahul Singh

          Hi Mohsin,
          I am attaching the workbook.
          Is this what you need ?

           

          Thanks
          Rahul Singh

          • 2. Re: 'Simple' Sankey Diagram
            mohsin.khan.1

            Hi Rahul,

             

            Thanks for the quick reply. That's not quite what I'm after, but looks like a step in the right direction.

             

            There are two scenarios I'm interested in:

            (1) Equal segments on the left and right sides for all available brands, with the lines following their paths as expected. The only difference is that the segments would be equal in size instead of being sized by the Rank 1 and Rank 2 calculations, RUNNING_SUM(SUM([Count]))/TOTAL(SUM([Count])), which give % of total segments.

             

            (2) A single segment for brand 1 on the left (switchable via a filter), and all brand 2s on the right in equal-sized segments. This would look like your packaged workbook, with the lines stemming from a single(ish) point on the left and going to multiple points on the right, except that the right-hand segments should not be sized by a % of total calculation.
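
As a rough illustration of the difference between the two sizing schemes, here is a minimal Python sketch (not Tableau code). The counts and the function names `share_edges` and `equal_edges` are made up for this example; the first function mirrors the RUNNING_SUM(SUM([Count]))/TOTAL(SUM([Count])) formula quoted above, and the second shows what equal-sized segments would give instead.

```python
from itertools import accumulate

def share_edges(counts):
    """Upper edge of each stacked segment when sized by share of total,
    mirroring RUNNING_SUM(SUM([Count])) / TOTAL(SUM([Count]))."""
    total = sum(counts)
    return [running / total for running in accumulate(counts)]

def equal_edges(n):
    """Upper edge of each segment when all n segments are the same size."""
    return [(i + 1) / n for i in range(n)]

counts = [10, 20, 30, 40]  # hypothetical counts for four brands
print(share_edges(counts))       # [0.1, 0.3, 0.6, 1.0]
print(equal_edges(len(counts)))  # [0.25, 0.5, 0.75, 1.0]
```

With share-based sizing the edges depend on the data; with equal sizing they depend only on how many brands are shown.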

            • 3. Re: 'Simple' Sankey Diagram
              Rahul Singh

              Hi Mohsin,
              Have a look at this one.

               

              Thanks
              Rahul Singh

              • 4. Re: 'Simple' Sankey Diagram
                mohsin.khan.1

                Hi Rahul,

                 

                Still not quite looking right. Those lines seem to be A1-A1, B2-A1, C3-A1, etc...

                They should be A1-A1, A1-B2, A1-C3 etc...

                 

                Also, the segments on the right hand side are still sized according to the % of the total count rather than being identical in size.

                The sizing on the lines also wasn't there, but I think I figured that one out (size needed to be computed using Padded?)

                Finally, I'd want to be able to change the left hand bar to be A1, B2, C3, etc.

                 

                Hope that makes sense

                 

                Appreciate your time!

                 

                Thanks,

                Mohsin

                • 5. Re: 'Simple' Sankey Diagram
                  mohsin.khan.1

                  Hi All,

                   

                  I've been playing around with it a bit and got it looking a little more like I want, but I think the 'Rank 1' and 'Rank 2' formula used is causing the 'curve' formula to plot the right-hand side points in the wrong position.

                   

                  Rank 1: RUNNING_SUM(SUM([Old Rank]))/TOTAL(SUM([Old Rank]))

                  Rank 2: RUNNING_SUM(SUM([New Rank]))/TOTAL(SUM([New Rank]))

                   

                  So for my chart, I want the lines to be going to the mid-points of each segment:

                   

                  However, as you can see from the screenshot above, they aren't going exactly where they should. A1-A1, A1-B2, and A1-D4 seem about right, but the other lines miss their points slightly, although they are roughly correct. I think the way Rank 2 is calculated is causing those points to be slightly off.
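
For reference, the mid-points of equal-sized segments can be worked out directly. The sketch below is plain Python rather than a Tableau calculation, and `segment_midpoints` is a hypothetical helper name; it assumes the right-hand bar spans 0 to 1 and is split into n equal segments.

```python
def segment_midpoints(n):
    """Mid-points of n equal segments stacked over the range 0..1."""
    edges = [i / n for i in range(n + 1)]
    # Each mid-point is halfway between a segment's lower and upper edge.
    return [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

print(segment_midpoints(4))  # [0.125, 0.375, 0.625, 0.875]
```

A running-sum formula returns the upper edge of each segment rather than its middle, so it may be worth comparing the values Rank 2 produces against targets like these.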

                   

                  Any help appreciated!

                   

                  Thanks,

                  Mohsin

                  • 6. Re: 'Simple' Sankey Diagram
                    mohsin.khan.1

                    Got it working as I wanted now. For anyone interested, I've attached the packaged workbook!

                     

                    Thanks,

                    Mohsin

                    • 7. Re: 'Simple' Sankey Diagram
                      Jeff Strauss

                      Hello Sankey "experts". I am trying to decipher the data requirements and the specifics of the viz concepts in order to implement a Sankey for our own use. Right now, not much of it makes sense.

                       

                      Appreciate any guidance here...

                       

                       

                      For example, why are there hard-coded numbers in the "t" calculation?

                       

                      Another example, inside the data, Padded is either 1 or 49.  What is this about?

                      • 8. Re: 'Simple' Sankey Diagram
                        mohsin.khan.1

                        Hi Jeff,

                         

                        I'm no expert myself, though have been pretty immersed in Sankey diagrams of late, so maybe I can help shed some light on these questions of yours:

                         

                        With regard to the padding: 1 and 49 basically provide Tableau with a 'start' and an 'end' point. When you then use Tableau's feature to create bins, you are telling it to inspect the two values and create increments between them. Tableau will suggest a default bin size, but you change it to 1, so that it creates artificial points between your two real points, i.e. you get 1, 2, 3, 4, 5, 6 ... 48, 49, which can then be used to plot against. Why 49? That has to do with performance. Apparently others have tested a range of numbers: if the number is too large, the view slows down, whilst if it is too small, the curve doesn't have enough points to draw (straight) lines between to make it look smooth.

                        One thing to note, by the way (and I don't actually understand the mechanics of this 'problem'): I've sometimes noticed that my created bins (Padded) disappear again, leaving me with just 1 and 49. When that happens, I delete my Padded dimension and create it again. When you drag that dimension onto the columns/rows shelf, you should get all numbers from 1 to 49, not just 1 and 49.
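
The padding idea described above can be sketched in a few lines of Python (this only mimics the effect of the bins, it is not how Tableau implements them). `densify` is a made-up name; the defaults of 1, 49 and a bin size of 1 come from the post.

```python
def densify(start=1, end=49, bin_size=1):
    """Return every bin value from start to end, inclusive."""
    return list(range(start, end + 1, bin_size))

points = densify()
print(len(points), points[:3], points[-1])  # 49 [1, 2, 3] 49
```

Two real rows (1 and 49) become 49 plot positions, which is what gives each curve enough marks to look smooth.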

                         

                        Hopefully someone else can jump in to fully explain the hard-coded figures in 't', but what I do know is that they are there to help space the marks evenly across the view (effectively acting as an offset). A good way to track what's going on is to first see what INDEX() does to the value of t, then apply the -25 part, and then see the effect of dividing it all by 4. Tableau explains what INDEX() does within the calculation box too. You could also remove INDEX() altogether to see what happens when t = 1 (which clearly doesn't do much for the Sankey view, but it is a way of stepping through each part of the formula to see what it does to your data in real time).
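
The stepping-through suggested above can also be done outside Tableau. This Python sketch assumes t is (INDEX() - 25) / 4 with INDEX() running from 1 to 49 over the padded points. The `sigmoid` and `curve` functions are an assumption based on the form commonly seen in Sankey tutorials; the thread itself does not show them, so check them against your own workbook.

```python
import math

def t_value(index, offset=25, scale=4):
    """Centre the 1..49 index on zero, then shrink it: (INDEX() - 25) / 4."""
    return (index - offset) / scale

def sigmoid(t):
    """Assumed S-curve: close to 0 for very negative t, close to 1 for very positive t."""
    return 1 / (1 + math.exp(-t))

def curve(rank1, rank2, t):
    """Assumed blend from the left position (rank1) to the right position (rank2)."""
    return rank1 + (rank2 - rank1) * sigmoid(t)

ts = [t_value(i) for i in range(1, 50)]
print(ts[0], ts[24], ts[-1])  # -6.0 0.0 6.0
```

Under these assumptions the hard-coded numbers simply map the 49 padded points onto the range -6 to 6, where the S-curve is nearly flat at both ends and steepest in the middle.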

                         

                        Hope that helps.

                        Mohsin

                        • 9. Re: 'Simple' Sankey Diagram
                          Yuriy Fal

                          Hi Jeff,

                           

                          Please refer to this thread, which I generally recommend regarding Sankey / Tree / Org and the like:

                           

                          Decision trees, flow diagrams, sankeys in Tableau... here is a solution !!!

                           

                          Hope it could help a bit.

                           

                          Yours,

                          Yuri

                          • 10. Re: 'Simple' Sankey Diagram
                            Jeff Strauss

                            wow.  these links are recursive, lots to read!!!! thank you!

                            • 11. Re: 'Simple' Sankey Diagram
                              Vincent Baumel

                              This one blew my mind the first time I saw it. Not sure if you have any experience in D3, but Robert Rouse did a great job incorporating it into a Tableau viz. It's one of my favorite ways to navigate through a hierarchy tree, and it might inspire you to try something similar for your own data!

                               

                              https://codepen.io/robertrouse/pen/ojvrOW

                              • 12. Re: 'Simple' Sankey Diagram
                                Neerav Makwana

                                Hi Mohsin, how did you get the lines to emerge from a single point on the left?

                                • 13. Re: 'Simple' Sankey Diagram
                                  mohsin.khan.1

                                  Hi Neerav,

                                   

                                  If you click on the Curve drop-down and go to edit table calculation, you should have your three nested calculations available in the drop-down (typically Rank 1, Rank 2 and t, if you've followed the naming conventions used in most tutorials). You should be computing these using 'specific dimensions', and then select your right side, left side and padded. It sounds pretty trivial, but it really comes down to the order of these dimensions. To get the lines to emerge from a single point, Rank 1 should be arranged as left side, padded, right side (rather than left side, right side, padded, which is the order typically used).

                                   

                                  Hope that helps.
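
Why the order of the dimensions matters can be seen in a toy model. The Python below is not Tableau's engine; it only shows that a running sum gives a row a different value when the rows are visited in a different order. The rows, the field names and `running_sum_by` are all made up for illustration.

```python
from itertools import accumulate

# Toy rows: (left, right, padded, count). All values are hypothetical.
rows = [("A1", r, p, 1) for r in ("A1", "B2") for p in (1, 2)]

def running_sum_by(rows, order):
    """Running sum of count, visiting rows sorted by the given field order."""
    fields = {"left": 0, "right": 1, "padded": 2}
    ordered = sorted(rows, key=lambda row: tuple(row[fields[f]] for f in order))
    totals = accumulate(row[3] for row in ordered)
    return {row[:3]: total for row, total in zip(ordered, totals)}

typical = running_sum_by(rows, ["left", "right", "padded"])
reordered = running_sum_by(rows, ["left", "padded", "right"])

# The same row (left A1, right A1, padded 2) gets a different running sum.
print(typical[("A1", "A1", 2)], reordered[("A1", "A1", 2)])  # 2 3
```

The real effect on the Sankey depends on how Tableau partitions and addresses the view, so treat this only as intuition for why reordering the dimensions changes where Rank 1 places the lines.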

                                  • 14. Re: 'Simple' Sankey Diagram
                                    Jeff Strauss

                                    I spent some quality time figuring out the Sankey, and attached is my rendition (with sample data, of course). Thank you, Mohsin Khan, for a very nice sample to start with. The new one lets us pick the measure and a few other filters.
