5 Replies Latest reply on Feb 3, 2013 11:03 PM by Michael Mixon

    Has anyone (easily) created bandlines in Tableau?

    Michael Mixon

      Hi,

       

      I read Stephen Few's article on Bandlines (http://www.perceptualedge.com/articles/visual_business_intelligence/introducing_bandlines.pdf), and then attempted to create some in Tableau.  The quartile piece is straightforward (i.e. just use the available reference line option), but I got hung up on how to calculate the outliers, as there are no available calculations that would get me the 25th and 75th percentiles, which I would need in order to determine if one of the values is an outlier.  I've seen some posts on how to calculate percentiles, but they either seem fairly complicated or address other items related to percentiles.  In short, is there a way to use a table calculation to arrive at the same numbers the reference bands generate?

       

      I've attached a very basic workbook with the numbers I'm looking for.  Any insight/help would be appreciated.

       

      -Mike

        • 1. Re: Has anyone (easily) created bandlines in Tableau?
          Jim Wahl

          Well I can't say it was easy, but I managed to show the outlier points in the attached.

           

          Having played with bandlines a bit, I'm not sure the outlier points are critical (with some datasets, anyway).

           

          Few doesn't use them in his example, albeit the shaded bands in the grade book dashboard are not based on box-plot-style quartiles. And the example by XLCubed he recently cited uses points for min/max, not outliers.

           

          And, in your Tableau workbook, you calculated the reference bands based on the panes, in which case the midrange is always present and provides context for the line's distance into Q1 or Q4.

           

          If you instead calculated the reference bands for the table, it's possible that---with a different dataset---some of the panes would have values only in one or two quartiles. In this case the outlier points help distinguish otherwise identical bandlines.

           

          For example if you had data with quartiles at 45, 50, and 55, then these two, five-point datasets would look identical:

          • 60, 61, 62, 63, 64 and
          • 60, 65, 70, 75, 80.

           

          The outlier points on 70, 75, and 80 would highlight the difference in y-axis scale.

           

          Of course, you could just add high / low text labels.

           

          I showed a few different examples bandlines, bandlines with sparkstrips (Few's horizontal box plots, next to the bandlines), and a more traditional sparkline. Everything was easy, except for the outlier points.

           

          Jim

          • 2. Re: Has anyone (easily) created bandlines in Tableau?
            Michael Mixon

            Thanks so much Jim. Your explanations and calculations were very clear and helped me past the logic blocks I was having.  In particular, I would never have thought to use the index and size functions to determine where the 75% and 25% positions were and then just take the average of the two numbers that border that position. Really elegant.

             

            In reviewing your logic, though, a few more questions emerged, primarily around the syntax you used. Some of it seems extraneous to me (i.e. I was able to get the same results with simpler syntax), so I’d like to better understand its purpose.  An example is below.

             

            Your calculation for Sales (75th percentile) is:

             

            IF FIRST() == 0 THEN

                WINDOW_MIN(

                    IF 0.75 * SIZE() = INDEX() THEN

                        (SUM([Sales]) + LOOKUP(SUM([Sales]), 1)) / 2

                    ELSEIF 0.75 * SIZE() < INDEX() THEN

                        SUM([Sales])

                    ELSE

                        NULL

                    END,

                    0,

                    IIF(FIRST()==0, LAST(), 0)

                )

            END

             

            I was able to achieve the same results with the following calculation:

             

            window_min(IF 0.75 * SIZE() = INDEX() THEN

                        (SUM([Sales]) + LOOKUP(SUM([Sales]), 1)) / 2

                    ELSE

                        NULL

                    END)

             

            So it isn’t clear to me what the portions in blue are doing. 

             

            Another question I have is about your calculation for the lower outlier threshold.  Per Few’s description ("A distance of 1.5 times the midspread above the 75th percentile and below the 25th percentile"), this would mean that the lower threshold should also be 1.5x the same $36,557 IQR, which would put it at $4,171 vs. the $22,469 you calculated.  Am I interpreting his description incorrectly?

             

            I still intend to look through the rest of your tabs, but I wanted to reply to the core items and to let you know how much I appreciate your thorough examples.

             

            -Mike

            1 of 1 people found this helpful
            • 3. Re: Has anyone (easily) created bandlines in Tableau?
              Jim Wahl

              Glad it helped. I'd be interested to hear how well bandlines work for you.

               

              The IF FIRST()==0 and IIF(FIRST()==0, LAST(), 0) are part of Richard Leeke's original percentile solution to optimize the performance. He explained it well here (http://goo.gl/Ww0eA); as I understand it, the WINDOW_xxx functions are slow, because they (in this example) repeat the sort for every partition. Since the answer is the same for all partitions, it makes sense to just run it on the first partition (FIRST()=0).

               

              But Tableau apparently evaluates the WINDOW_MIN() part first and, therefore, the FIRST() trick doesn't improve the speed---Tableau calculates, but ignores the subsequent results. So Richard's second trick was to specify a single-value range for all but the first partition using the IFF(FIRST()==0, LAST(), 0)---the resulting window calculation is meaningless, but it's being ignored.

               

              Performance isn't an issue with this example, but FIRST()==0 also solves the overlapping text problem, if you wanted to include the value in a table.

               

              The ELSEIF 0.75 * SIZE() < INDEX() THEN  SUM([Sales]) part is required if the percentile is one of the values in the data set. For example, when calculating the median of a dataset with 11 values.

              • Dataset A with 10 values: 1 2 3 4 5 6 7 8 9 10 --- median is 5.5
              • Dataset B with 11 values: 1 2 3 4 5 6 7 8 9 10 11 --- median is 6

               

              Another question I have is about your calculation for the lower outlier threshold.  Per Few’s description ("A distance of 1.5 times the midspread above the 75th percentile and below the 25th percentile"), this would mean that the lower threshold should also be 1.5x the same $36,557 IQR, which would put it at $4,171 vs. the $22,469 you calculated. 

               

              Good catch, thanks, it should be $4,171.

               

              Jim

              • 4. Re: Has anyone (easily) created bandlines in Tableau?
                Jim Wahl

                I prettied up the workbook above and put the bandlines on Tableau public here:

                http://public.tableausoftware.com/views/bandlines/0-Intro?:embed=y

                 

                Stephen Few also picked up the fact Tableau doesn't allow you to put the median reference line in the background, which causes breaks in the sparkline. His comment is here: http://www.perceptualedge.com/blog/?p=1485

                • 5. Re: Has anyone (easily) created bandlines in Tableau?
                  Michael Mixon

                  Nice work, Jim.

                   

                  I created a variation of this info for internal use (see attached).  Thanks again for your help with this.