1 Reply Latest reply on Aug 30, 2016 4:32 PM by Keshia Rose

    Dynamic Sizing of Histogram Bin Widths

    Graham Landry

      Hello Everyone!

       

      To this post I've attached a packaged workbook with staged data so that you can follow along with my problem description...

       

      I'm in the process of performing a Recency-Frequency-Monetary (RFM) analysis and would like to create dual-axis histograms for each variable showing frequency counts per bin on the right axis and the cumulative percent of total on the left axis. I've created a parameter (called Purchase Attribute) that allows the user to dynamically select one of the three RFM variables, which links to a calculated field (called Select Purchase Attribute) that allows for the parameter control to dictate which variable is plotted.

       

      Instead of using the default binning algorithms in Tableau, I'd like to use the Freedman-Diaconis rule for determining bin width. This rule asserts that the bin width for a continuous variable is equal to [2 * (IQR/n^(1/3))] where IQR is the interquartile range and n is the number of records. I've implement this equation with a calculated field (called FD Bin Size) which manually calculates the IQR and divides by the cube root of the number of records (count of ID field):

       

           2 * ((PERCENTILE([Select Purchase Attribute], 0.75) - PERCENTILE([Select Purchase Attribute], 0.25)) / POWER(COUNT([ID]), 1/3))

       

      My strategy here was to link the Select Purchase Attribute calculated field to the FD Bin Size calculated field so that the bin-width would be automatically calculated and used for plotting once the user chose a variable for viewing. Then (I thought) I'd be able to edit the Select Purchase Attribute (bin) dimension so that the bin size used for plotting was determined by the FD Bin Size calculated value...which is where my dilemma resides. I've verified with my work in R that the FD Bin Size calculation in Tableau is correct, but I've not found a way to link this dynamic calculation to the Select Purchase Attribute (bin) field.

       

      Any Tableau Jedis out there who can help show me the light? It would be much appreciated!

       

      Here's some more info on the parameters and calculated fields:

        • 1. Re: Dynamic Sizing of Histogram Bin Widths
          Keshia Rose

          Hi Graham,

           

          You can create your own custom bins with a calculation. Here is what I've used in the past that should work for your use case:

          INT([Select Purchase Attribute]/[FD Bin Size])*[FD Bin Size]

           

          To make this calculation valid, both measures have to be either disaggregated or aggregated, not a mix of both. In your case since there is just one single number for each RFM measure, I just added LOD syntax around your FD Bin Size calculation:

          { 2 * ( (PERCENTILE([Select Purchase Attribute], 0.75) - PERCENTILE([Select Purchase Attribute], 0.25)) / POWER(COUNT([ID]), 1/3) ) }

           

          This will calculate the bin width for the entire data set and return a disaggregated number. More on LOD's here if you are unfamiliar: Overview: Level of Detail Expressions

           

          Now that we have the bin calculation we can place it onto the column shelf to create the histogram.

           

          One thing I did note, the calculated bin size for frequency was .1366 which when used in this calculation does not return logical bins. I would suggest using the CEILING() around the bin calc to make sure the number is always at least 1. Example:

          INT([Select Purchase Attribute]/CEILING([FD Bin Size]))*CEILING([FD Bin Size])

           

          I've attached a modified version of the workbook with these calculations. Let me know if this works out for you!

           

          Take care,

          Keshia

          1 of 1 people found this helpful