1 2 3 Previous Next 41 Replies Latest reply on Sep 27, 2017 10:18 PM by Rajesh Binani

    Decision trees, flow diagrams, sankeys in Tableau... here is a solution !!!

    Olivier CATHERIN

      In this post, we will show how to build a decision tree with Tableau. The goal, as usual, is to do it with a minimum of data preparation. For this example, we will use the superstore dataset provided with the Tableau installation. We would like to build the following tree:

      • Level 0: Starting point
      • Level 1: Order Priority
      • Level 2: Ship Mode
      • Level 3: Product Container

       

      Important: before going through this post, you may want to read the following post: Sankey diagram made of dynamically generated polygons

       

      It will look like this :

      https://public.tableausoftware.com/static/images/Po/PolygonicDecisionTreeViz/PolyDecisionTree/1.png

       

      The attached workbook is highly interactive. We can:

      • Choose the dimension to use in Level 1, 2 and 3
      • Filter the data by Year
      • Define the flow size by choosing an indicator
      • Assign the color of the flow to another indicator
      • Show the detailed information in the tooltip.

       

      In this post, we are not going to detail the whole step-by-step procedure but rather explain the logic of the calculations needed to build the viz.

       

      Data Preparation :

      Let’s first build the dataset. For this purpose, we will use the Superstore dataset, which we will blend with the polygonic dataset (we will make it polygonic!). The resulting file is attached. We removed all the data that is not required from it and added the « Link » column to the Superstore data table.

       

      Understanding the logic :

      Let's first build it in PowerPoint to better understand the logic:

      understanding 1.png

       

      Basically, what we did here is build a visualization of a hierarchy. Let's decompose this flow chart into different parts and add axes:

      understanding 2.png

       

      If we want to build it in Tableau, we have to find out how to position the dimensions at the different levels in the viz, i.e. their (x,y) coordinates. Let's decompose it into 3 steps A, B and C.

      Levels will define the x positions of the hierarchy details, while Y is defined by Position 0 to Position 3, where the Position is defined by a ranking of the Level details in order to show the appropriate flows.

       

      In our method, X will be defined by our polygonic model and will build the curves for each part A, B and C.

       

      Basically, the level 3 should be a ranking of each individual product container for a certain ship mode and a certain order priority. We could represent it in this way :

      understanding 3.png

      The issue is that Tableau can easily represent it in a table or a Tree Map but we have to cheat a bit in order to show it in a Graph that looks like a decision tree.

      Treemap.jpg

      Problem : We have to define how to sort the dimensions at the different levels in order to draw our curves.

       

      Solution : INDEX and Advanced Table Calculations !

      INDEX will give us the ranking, and advanced table calculations will give us the level at which curves should be grouped.

       

      To better understand, let's build 4 indicators :

      Index 0 = INDEX() computed along Level 0, Level 1, Level 2, Level 3 at the level « Level 0 ».

      Level 0 will be the starting point of the whole data set. We will just create a calculated field "Level 0" that equals 'Start'. This enables us to build a viz starting from a single point, whatever the other levels are.

       

      To do so, create a new calculated field :

      1. Name it Index 0
      2. Function : INDEX()
      3. Click on « Default Table Calculation » on the upper right hand side of the calculated field window. In « Compute using », choose « advanced… » and build the "addressing" this way: Level 0 > Level 1 > Level 2 > Level 3 (be sure to keep this order!)

      adv calc field 1.png

      Click « OK » and choose Level 0 in the « At the level » drop down menu. Click OK again.

       

      Repeat this operation for Index 1, 2 and 3. The only difference is that we will define the calculation respectively at levels 1, 2 and 3.

       

      When building a table with this information, you should find the following :

      understanding 5.png

      We have defined the right ranking of our elements at each level of the hierarchy.
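
      To make the ranking logic concrete outside Tableau, here is a minimal Python sketch. It assumes that INDEX() computed along Level 0 > Level 1 > Level 2 > Level 3 "at the level" N behaves like a dense rank of the distinct (Level 0, ..., Level N) combinations, taken in sorted order. The function name index_at_level and the four sample rows are hypothetical and only serve the illustration; they are not taken from the attached workbook.

```python
def index_at_level(rows, level):
    """Emulate Index N: dense rank of each row's prefix up to `level`.

    Assumption: rows are addressed in sorted order, and the index
    increments each time the (Level 0 .. Level N) combination changes.
    """
    ordered = sorted(rows)
    ranks = {}
    for row in ordered:
        prefix = row[: level + 1]
        if prefix not in ranks:
            # First time we meet this combination: give it the next rank.
            ranks[prefix] = len(ranks) + 1
    return {row: ranks[row[: level + 1]] for row in ordered}


# Hypothetical miniature hierarchy: (Level 0, Level 1, Level 2, Level 3)
rows = [
    ("Start", "Critical", "Delivery Truck", "Jumbo Box"),
    ("Start", "Critical", "Delivery Truck", "Jumbo Drum"),
    ("Start", "Critical", "Regular Air", "Small Box"),
    ("Start", "High", "Regular Air", "Small Box"),
]

index_0 = index_at_level(rows, 0)  # every row shares the single 'Start' point
index_1 = index_at_level(rows, 1)  # one rank per Order Priority
index_2 = index_at_level(rows, 2)  # one rank per Order Priority / Ship Mode
index_3 = index_at_level(rows, 3)  # one rank per full path

for row in rows:
    print(row[1:], index_0[row], index_1[row], index_2[row], index_3[row])
```

      In this sketch, rows that share the same parent get the same index at the parent's level, which is what lets the curves of one branch start from a common point.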

       

      Now we would like to build a distribution of these points in order to define the Y values in our viz. All levels should be calibrated to the same range, so that curves split instead of creating a waterfall. The easiest way is to place the positions in a range from 0 to 1.

       

      Let's then define our positions :

       

      Position N :

      INDEX()/(SIZE()+1)

      Computed along Level 0, Level 1, Level 2, Level 3 at the Level N

      • INDEX will give us the ranking;
      • SIZE will give the total number of distinct index values in the partition;
      • +1 in order to have a good distribution of our points in a 0 to 1 range.

       

      In our above example :

      • Hierarchy : Critical / Delivery Truck / Jumbo Drum
      • Index at Level 3 is 2.
      • Position 3 will be 2/(12+1) = 2/13
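
      The arithmetic above can be checked with a short Python sketch. The helper name position is hypothetical; it simply evaluates INDEX()/(SIZE()+1) with exact fractions, using the index 2 and size 12 from the example.

```python
from fractions import Fraction


def position(index, size):
    """Position N = INDEX() / (SIZE() + 1), kept as an exact fraction."""
    return Fraction(index, size + 1)


# The example from the post: index 2 in a partition of 12 elements.
print(position(2, 12))  # 2/13

# With only 3 elements, the points are evenly spread strictly inside 0..1.
positions = [position(i, 3) for i in range(1, 4)]
print(positions)
```

      Because of the +1 in the denominator, the first point never sits at 0 and the last one never reaches 1, which leaves a margin on both sides of the axis.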

       

      As soon as we have done this, we can build our viz using the same method that we used for the polygonic Sankey (see link to our post or VizWiz).

       

      We can then build our Curves A, B and C and the final Viz (either in a dashboard or in a single workbook).

      Decision tree simple.jpg

       

      Note: In Tableau, using table calculations sometimes requires that you update pills in the viz. If you don’t get the right viz, simply drag and replace indicators in the viz to update the view.

       

      Have it polygonic !

      To build a decision tree using polygons, we require a bit more work and a different method than the polygonic Sankey to define the levels :

       

      • Size : SUM([Choose Indicator]) / TOTAL(SUM([Choose Indicator]))

                Computed using Level 0, Level 1, Level 2 and Level 3

       

      • Position N Max : [Position N] + RUNNING_SUM([Size])

       

      • Position N Min : [Position N Max] - ([Size])

       

      This will distribute the data on a 0 to 2 axis, since [Position N] lies between 0 and 1 and RUNNING_SUM([Size]) reaches at most 1.
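
      As a rough illustration of the three formulas above, here is a Python sketch for a single level. It assumes that the index, the size and the running sum are all computed along the same ordering of the elements, which may differ from the exact addressing used in the attached workbook. The function name polygon_bounds and the sample values are hypothetical.

```python
from fractions import Fraction


def polygon_bounds(values):
    """Return (Position Min, Position Max) for each element of one level.

    Size         = value / total of all values
    Position     = index / (count + 1)
    Position Max = Position + running sum of Size
    Position Min = Position Max - Size
    """
    total = sum(values)
    count = len(values)
    running = Fraction(0)
    bounds = []
    for index, value in enumerate(values, start=1):
        size = Fraction(value, total)
        running += size
        pos = Fraction(index, count + 1)
        pos_max = pos + running
        pos_min = pos_max - size
        bounds.append((pos_min, pos_max))
    return bounds


# Hypothetical indicator values for three elements of one level.
bounds = polygon_bounds([50, 30, 20])
for lo, hi in bounds:
    print(float(lo), float(hi))
```

      Under these assumptions, each band has a thickness equal to its share of the total, consecutive bands are separated by a gap of 1/(count + 1), and everything stays between 0 and 2.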

       

      The remainder is exactly the same as for the Sankey Diagram.

      Decision tree polygonic.jpg

       

      Have fun !

      Now that you know how to use addressing to customize your viz, just play with it in order to build new flow diagrams!

       

      For any questions, please contact us : Olivier CATHERIN

       

      Cheers

       

      Olivier

       

      This message was edited by: Olivier CATHERIN. I just made a small correction to the formulas used: replacing WINDOW_MAX with SIZE simplifies the calculation and avoids issues when filtering the view.

       

      This message was edited by: Olivier CATHERIN. Hello, I just added a new version using LOD in the flow size calculation to show results as a percentage of the total flow, whatever the selection is. Cheers
