I agree this should be easier, and I'd sure like to see a general solution. But there is one special case that works now if it fits your problem.
If the sum of the weights is constant (often they are designed to sum to 1 or 100 or such), then you define a
calculated_field = [weight] * [value] / 100, replacing 100 with whatever the weight sum is in your case. Then AVG[calculated_field] will be the correct weighted average as long as you don't filter out rows that would change the weight sum.
Maybe Tableau could add a new form of table calculation for weighted averages.
Here is a simpler example that works even if the weight sum is not constant. Assume your data looks like:
Course, Credits, Grade
Math, 3, 4.0
Physics, 6, 4.0
Econ, 3, 2.0
English, 3, 1.0
GPA = Sum([Credits]*[Grade])/Sum([Credits])
AGG(GPA) will be 3.0 while AVG([Grade]) will be 2.75
Still weighted averages might make a nice new kind of table calculation someday
P.S. One reason I suggest Tableau support weighted statistics via a table calc is so that the associated standard deviation, quartiles, confidence intervals could be correctly computed.
I have a slightly more complex calculation need than the credit/grade example noted above. In the typical weighted average the denominator is the formula sum ([credit]). This works only if we want the sum([credit]) formula applied at the field/group subtotal level. But what if we need the denominator to be based on the grand total? The resulting values are that contributions
example data set (sales & margin). The last column (contribution to margin) includes the value I am looking to calculate in Tableau (0.6%, 2.8%, etc). I've included an excel file with the example of it.
sales margin contribution to margin
trucks light 100 10% 0.6%
heavy 1000 5% 2.8%
cars 4-door 500 7% 1.9%
2-door 200 12% 1.3%
grand total 1800 6.6% 6.6%
tableau calc help.xls 13.5 KB
This sort of question comes up a lot. You should be able to achieve what you want with either a calculated field which uses a RAWSQL expression, as per this posting from Erin Easter:
or with a custom SQL connection, like the example I posted here:
I think Erin's approach is probably best here.
It looks like the above links are dead.
I was looking for some info on displaying proper weighted averages on a reference line. Specifically, I'm looking for the running sum of the total cost over the running sum of the land area, as of the last data point.
I was using the formula:
Avg Price per Acre = RUNNING_SUM(AVG([Cost])/RUNNING_SUM(AVG([Land Area]))
This gives the average of averages, which is not correct over the whole data set.
Instead, I changed it to
Avg Price per Acre = TOTAL(AVG([Cost])/TOTAL(AVG([Land Area]))
This sums up all of the Cost values in the entire partition and divides by the sum of the Land Area values in the entire partition. When I put this field onto the Level of Detail, it becomes available in the reference line field list as a table calculation. When applied to the reference line label, this value will show a proper weighted average (or ratio in my case).
I'm not exactly sure why the TOTAL function works here and why RUNNING_SUM does not.