3 Replies Latest reply on Jul 20, 2012 12:54 PM by Jonathan Drummey

    Calculating Averages Across Ranges

    Ben Anderson

      This may be a simple fix, but I have not been able to find any articles relevant to my problem.


      I am comparing ages across organizations and would like to find the average age at each organization. The problem is that the ages are recorded in ranges (i.e. "20-24", "25-29"...) not by actual age. Therefore I have the age ranges set as dimensions and am obviously able to calculate the total number of employees in each range or the percentage of employees in each range by organization. Having the ability to calculate the average, median  or the top three ranges would be extremely helpful in comparing workforces across organizations and job types/levels.


      As a side note I would like to accomplish this solely in Tableau if possible. 


      I have attached an excel crosstab of a basic table made in Tableau. I think it will help. The two row consists of the age ranges.

        • 1. Re: Calculating Averages Across Ranges
          Jonathan Drummey



          There are a couple of different ways to approach this. The easiest for Tableau is to have your data in more of a "tall" format (many values) instead of "wide" (many measures). This is because when in a tall format, you can use the built-in SUM,e etc. aggregation across all values very easily, whereas when each age bucket is its own measure then you have to write calculations that include each measure. (Check out the Non-reshaped Data worksheet for how messy the calc is).


          I used the Tableau data reshaper (see the link in this excellent KB article on setting up data for Tableau: http://kb.tableausoftware.com/articles/knowledgebase/preparing-excel-files-analysis) to reshape your data. I also didn't touch the headers, which have an extra character in them.


          To convert the age range to a number, I created the "Avg within Age Range" discrete dimension with the following formula:


          IF MID([Age Range],3,1) != "-" THEN



              INT(RIGHT([Age Range],3))-2



          Then the (weighted) Average Age can be calculated with the following formula:


          SUM([Avg within Age Range]*[Value])/SUM([Value])


          You can see this in the "Reshaped Avg Age per Org" worksheet.


          Calculating the median is a bit more work because it's a weighted median and Tableau doesn't have a built-in function to do that. After building a messier calc, I found a weighted median calc in this post http://community.tableau.com/thread/114874 and created a simplified version that only requires a single formula.


          IF FIRST()==0 THEN


                  IF RUNNING_SUM(SUM([Value])) - TOTAL(SUM([Value]))/2 >= 0 THEN

                      ATTR([Avg within Age Range])





          The IF FIRST()==0 part makes sure we won't have any overlapping text. The WINDOW_MIN() extracts the median from the next part of the calc, which does the median calculation and returns the proper value of the Avg within Age Range dimension. This calc will have the Compute Using set to Avg within Age Range.


          This requires a view with Org# on Rows and Avg within Age Range on the Level of Detail Shelf. Now, if you also want to show the average age along with the weighted median, then you need a new version of the average age calc that doesn't create overlapping text with the following calc:


          IF FIRST()==0 THEN TOTAL([Average Age]) END


          This table calc also has its Compute Using set to Avg within Age.


          Finally, to show the top 3 ages for each bracket, I created the Index calc using the INDEX() function. This calc has an Advanced... Compute Using set up as follows:




          Note that there are two instances of the calc in the view. The first one is a discrete (blue) pill on the Rows Shelf and gets the sort order working right, the second instance is a continuous (green) pill with the same Compute Using settings and set to filter only 1-3.





          • 2. Re: Calculating Averages Across Ranges
            Ben Anderson



            Thank you! The advice was great, but I should have attached my actual dataset. I think the crosstab was a little misleading and I apologize for the confusion. Your formula to return the midpoint worked after some tweaking. I had to change RIGHT to LEFT and obviously then +2. This I think as my fault, if I would have sent you my actual dataset you may have caught whatever problem I was having. Everything is working perfectly now!



            Thank you!