1 2 Previous Next 15 Replies Latest reply on Jan 18, 2013 11:40 AM by Alex Kerin

    Calculate average of values within nested bins

    Jay Chang

      I'm sure this is a simple question but I cannot seem to figure this out.


      I have hourly pay data grouped into bins of $1 in size.


      I also have years of service grouped into bins of 2 years in size.


      I have a viz that shows hourly pay bins nested within the years of service bins.  Below is a screenshot of a part of the viz.



      Note that what I'm displaying is a histogram of # of EEs in each hourly rate bin within each years of service bin.  But what I want to calculate is the average hourly rate of all EEs who fall into,say, the 0 years of service bin.  When I tried the following table calc:


      sum([hourly rate]*[number of records])/sum([number of records])


      I get what I believe is an average of the averages across hourly bins within a given years of service bin.  Using Excel, I calculate an average hourly rate of 11.39 for the first grouping of bars, but my formula is giving me a value of around $13.


      So I have two questions.  First, is what I'm trying to do possible?  Can I group by years of service and by hourly rate but calculate an average based only on years of service groupings?  And second, how would I display such a calculation on the same viz?  I'm sure this is going to turn out to be a table calculation (and probably a simple one at that) but I'm stumped.

        • 1. Re: Calculate average of values within nested bins
          Alex Kerin

          Where are your bins coming from - do they exist in the datasource or are you calculating them?


          Edit: Doesn't matter - you can use the bins created in Tableau to partition the window_avg calculation. See attached. How you end up showing this is a different matter...

          • 2. Re: Calculate average of values within nested bins
            Jay Chang

            I used the Tableau bin feature to slot my data.  I calculated the years of service within tableau as well.  My years of service calculation basically takes the difference between 2013 and whatever the year value is in an EE's service date.  So they are all integers.  Hourly rate came from an external source that I pulled into Excel.  I created a datasource in Tableau to my Excel data.

            • 3. Re: Calculate average of values within nested bins
              Alex Kerin

              I edited my answer above

              • 4. Re: Calculate average of values within nested bins
                Jay Chang

                I do not think this is working as I would like.  I see you are using only one bin (Sales), but Customer Segment is a Dimension, rather than a bin.  An analogy using your data would be to create bins on Sales and Profit, replace the customer segment pill with the Profit (bin) pill, and then calculate an average profit within each sales bin.


                of course if I'm mis-interpreting your solution, then perhaps I'm not nearly as good with Tableau as I think I am.

                • 5. Re: Calculate average of values within nested bins
                  Jay Chang

                  To clarify:  using your technique, I can get some data to match my expectations.


                  for example, when I use the following formula:


                  window_sum(sum[hourly rate])


                  computed using pane (across then down), I get the sum of all hourly rates within the years of service bin, just as I expect.  but when I change the formula to either:


                  window_avg(sum([hourly rate])    or  window_avg(avg([hourly rate])


                  I don't get answers anywhere close to what I'd expect.

                  • 6. Re: Calculate average of values within nested bins
                    Alex Kerin

                    I can't help any more then without a tbwx (save as) to see your setup. If your data is sensitive, see if you can use the superstore demo database to create an equivalent.

                    • 8. Re: Calculate average of values within nested bins
                      Jay Chang

                      I've attached a packaged workbook using storefront that has a layout somewhat like what I've created in my own data.


                      Note that I've created a Sales bin that segments sales into groups by 15K intervals and that I've created a Profit bin that segments Profits by $500 intervals.  I'm showing counts of how many rows have profits (by bin) within sales (by bin).


                      I'm trying to calculate average profit across the profit bins within each sales bin.  Sheet 2 has the profit value I'm trying to get to within each sales bin.  I'm wondering if there is some way to calculate the values in sheet 2 using the viz showing in sheet 1.


                      Hopefully this makes more sense?

                      • 9. Re: Calculate average of values within nested bins
                        Alex Kerin

                        Great example - thanks.


                        The way to get to your calculation is to use total(avg([Profit]) and set it to only use the sales bins and ignore the profit bin.


                        The easiest way to do this is using a reference line  (as you don't even have to mess with table calcs, just make avg(profit) available on the level of detail, and set the reference line to it, and choose total from the drop down:see the first sheet




                        Or in the second, I've explicitly created this total(avg.. and plotted it on a double axis, synced the axes, set it to compute to Profit (bin) (counter intuitively to me, this effectively means ignore the profit bins when calculating the average). Finally I've used multiple marks to change the avg to a line. see the second sheet



                        EDIT: of course there's no good reason the sync the axes - in this case the numbers work out to be in a similar range. You may want to look at different ways of showing the average on the chart, but at least you have the calc to do it now.

                        • 10. Re: Calculate average of values within nested bins
                          Jay Chang

                          Hi Alex - sorry for the delayed response, I was swamped yesterday.  I implemented the formula as you describe, applying it to my situation and it is not producing the result I expect.  For whatever reason, I am still not getting an average hourly rate that matches what I calculate in Excel.  It's a very confusing thing.  Unfortunately I can't post my data, and perhaps I did not accurately create the analagous sitatuion in storefront, but the expected results do not jibe.


                          In looking over the formula, I had an overall question - why use TOTAL of AVG?  I would have thought that this would give me the average for each sub-bin within Profit and then totaled those averages within each Sales bin.  Am I not understanding that formula correctly?

                          • 11. Re: Calculate average of values within nested bins
                            Alex Kerin

                            You need to look carefully at the way that I have partitioned the table calculation - see the images above.


                            Total does not sum, it completes the aggregation (avg in this case) across whatever window we specify.


                            I can't help any more without seeing you data. If it's sensitive create some mock data with the same structure and show us where the results differ from Excel.

                            • 12. Re: Calculate average of values within nested bins
                              Jay Chang

                              SUCCESS!  THANK YOU!


                              I was using the average from the calculated field, rather than using the Average of my hourly rate field.  When I modified to follow what you had done, I got the results I was expecting.  THANK YOU.


                              Now can you please expalin (in very simple terms, cause apparently I'm a doof) exactly why/how this works?

                              • 13. Re: Calculate average of values within nested bins
                                Alex Kerin

                                No problem. Let's start at the beginning - you understand the general concept of partitioning? You could thing of having 3 discrete dimensions of fruit - type of fruit, color of fruit, shape of fruit.


                                Then I have a bunch of different fruits and I want to show their various attributes on a chart.  But I also want to give an overall average weight (my continuous measure). I may want to do this by the type of fruit (what is the average weight of apples v bananas), color (are green fruits typically heavier than yellow ones?) or shape (are spherical fruits heavier than elongated ones?). These are my three different partitions I could create.


                                In your case your bins (blue, discrete) are my different attributes. I need to set my partitions up. If I want the average across the profit bins, that needs to be on the left side of advanced compute table calculation dialog. Everything else needs to be on the right.


                                Then each member of the attribute on the right (e.g. each bin) will have it's own value of total average.


                                You are not a doofus - understanding the concept and application of table calcs is new thinking for many people. The possibilities, and their complex behavior still eludes me many times.

                                1 of 1 people found this helpful
                                • 14. Re: Calculate average of values within nested bins
                                  Jay Chang

                                  Thank you.  That explanation helped, though I'll still need to work through it in my head to "get it".


                                  Table Calcs and Partitions still remain a total mystery to me at times.  I am really hoping that one day someone will offer a course that focuses exclusively on how these things work.  Even the advanced course kind of skims over the intricacies of table calcs.  Frankly, I see a lot of potential for data processed in Tableau to be mis-interpreted because people are so used to the point-click-drag-drop simplicity of the product that they don't think about the results of the answers Tableau is providing.  They may be correct answers but they may not be the "right" answers.

                                  1 2 Previous Next