8 Replies Latest reply on Feb 14, 2012 2:29 AM by Matt Shoemaker

    Weighted Average Calcs

    Loren Pope

      I need to do a variety of weighted average cuts on a large data set.  I’m not sure how to do this.  The ‘weighting’ process requires dividing the an individual quantity by the sum of those quantities (and the sum will change with various groups)- since you can’t  have a function that combines an individual quantity by an aggregate quantity, I don’t know how to do this without creating a multi step process.  Any suggestions are great appreciated.  Thank you! 

        • 1. Re: Weighted Average Calcs
          Alex Blakemore

          I agree this should be easier, and I'd sure like to see a general solution. But there is one special case that works now if it fits your problem.

           

          If the sum of the weights is constant (often they are designed to sum to 1 or 100 or such), then you define a

          calculated_field = [weight] * [value] / 100
          , replacing 100 with whatever the weight sum is in your case. Then AVG[calculated_field] will be the correct weighted average as long as you don't filter out rows that would change the weight sum.

           

          Maybe Tableau could add a new form of table calculation for weighted averages.

          • 2. Re: Weighted Average Calcs
            Alex Blakemore

            Here is a simpler example that works even if the weight sum is not constant. Assume your data looks like:

            Course, Credits, Grade

            Math, 3, 4.0

            Physics, 6, 4.0

            Econ, 3, 2.0

            English, 3, 1.0

             

             

            
             GPA = Sum([Credits]*[Grade])/Sum([Credits])
            
            


             

            AGG(GPA) will be 3.0 while AVG([Grade]) will be 2.75

             

            Still weighted averages might make a nice new kind of table calculation someday

            • 3. Re: Weighted Average Calcs
              Alex Blakemore

              P.S. One reason I suggest Tableau support weighted statistics via a table calc is so that the associated standard deviation, quartiles, confidence intervals could be correctly computed.

              • 4. Re: Weighted Average Calcs
                guest contributor

                I have a slightly more complex calculation need than the credit/grade example noted above.  In the typical weighted average the denominator is the formula sum ([credit]).  This works only if we want the sum([credit]) formula applied at the field/group subtotal level.  But what if we need the denominator to be based on the grand total?  The resulting values are that contributions

                 

                example data set (sales & margin).  The last column (contribution to margin) includes the value I am looking to calculate in Tableau (0.6%, 2.8%, etc).  I've included an excel file with the example of it. 

                 

                                  sales  margin  contribution to margin

                trucks    light  100      10%    0.6%

                          heavy  1000      5%    2.8%

                cars      4-door 500      7%    1.9%

                          2-door 200      12%    1.3%

                grand total      1800    6.6%  6.6%

                • 5. Re: Weighted Average Calcs
                  Richard Leeke

                  This sort of question comes up a lot.  You should be able to achieve what you want with either a calculated field which uses a RAWSQL expression, as per this posting from Erin Easter:

                   

                  http://www.tableausoftware.com/forum/computing-percent-total-using-underlying-data

                   

                  or with a custom SQL connection, like the example I posted here:

                   

                  http://www.tableausoftware.com/forum/indexing-versus-total

                   

                  I think Erin's approach is probably best here.

                  • 6. Re: Weighted Average Calcs
                    Matt Shoemaker

                    It looks like the above links are dead.

                     

                    I was looking for some info on displaying proper weighted averages on a reference line.  Specifically, I'm looking for the running sum of the total cost over the running sum of the land area, as of the last data point.

                    • 7. Re: Weighted Average Calcs
                      Matt Shoemaker

                        I was using the formula:

                       

                           Avg Price per Acre = RUNNING_SUM(AVG([Cost])/RUNNING_SUM(AVG([Land Area]))

                       

                        This gives the average of averages, which is not correct over the whole data set.

                       

                        Instead, I changed it to

                       

                           Avg Price per Acre = TOTAL(AVG([Cost])/TOTAL(AVG([Land Area]))

                       

                        This sums up all of the Cost values in the entire partition and divides by the sum of the Land Area values in the entire partition.  When I put this field onto the Level of Detail, it becomes available in the reference line field list as a table calculation.  When applied to the reference line label, this value will show a proper weighted average (or ratio in my case).

                       

                        I'm not exactly sure why the TOTAL function works here and why RUNNING_SUM does not.

                      • 8. Re: Weighted Average Calcs
                        Matt Shoemaker

                        I'm attaching a workbook showing the difference between RUNNING_SUM and TOTAL on a reference line.  There are also some notes in the .xlsx file.