6 Replies Latest reply on Apr 19, 2012 6:40 AM by Erling Alfheim

    Different aggregation on columns and rows in a table

    Erling Alfheim

      I'm making a table where I have categories as columns and dates as rows and then show a value for each intersection

       

      Product categories, order date and order quantity for example from the superstore example.

       

      The problem is that I would like to sum the quantity across different categories and subcategories but average them over dates.

       

      This is only a problem when you drill up or down through hierarchies (Year, month, day for dates and Category, sub-category and product name) for categories.

       

      Is there a way to set this up? so that the numbers are summed when aggregating to sub category and category, but averaged when aggregated to month and year?

        • 1. Re: Different aggregation on columns and rows in a table
          sirajmandayippurath

          Hi Erling,

           

          Have you looked at duplicating the measure by creating a calculated field called Average Order Quantity and defining this as equal to Order Quantity? You can then define the aggregation property of the new Average Order Quantity field to Average by right-clicking the measure, selecting Field Properties > Aggregation > Average.

           

          You can use both these measures on the grid now to get two different aggregations of the same base measure. Would that help?

           

          Cheers

          Siraj

          • 2. Re: Different aggregation on columns and rows in a table
            Erling Alfheim

            Possibly, but how would you put both fields on the grid?

             

            Currently I have the Order quantity on the text shelf of the marks card. As far as I can tell I cannot have more than one field there.

            • 3. Re: Different aggregation on columns and rows in a table
              sirajmandayippurath

              Hi Erling,

               

              I have attached a sample workbook to illustrate this.

               

              Is this what you need?

               

              Cheers

              Siraj

              • 4. Re: Different aggregation on columns and rows in a table
                Richard Leeke

                Nice question! I'd never thought about doing this before, but it actually works really well. Or at least I think this is what you want.

                 

                I presume in the Superstore Sales example you are looking for the average of the daily sales by category, at whatever levels of the date and product category hierarchy you are displaying?

                 

                The attached Superstore workbook example has three calculated fields:

                 

                • [Order Date (copy)] - this is on level of detail at the level of DAY() and is needed so that the daily figures are available for the averages, even if the [Order Date] on columns is rolled up to a more summary level.
                • [Daily Average Order Quantity] - this is defined as WINDOW_AVG(SUM[Order Quantity])), with Compute Using set to [Order Date (copy)]. So this gives the average over the days in each partition of the total Order Quantity. The partitions are defined by the level of roll-up of the Order Date and Product Category hierarchies
                • Is First? - this is just used to filter the view down to a single value per row and column - otherwise the values are shown (overlaid on top of each other) for all days in each partition (because of the DAY() on level of detail)

                 

                Is that what you were after?

                 

                Edit: Having posted it I thought I'd better just check if it actually works and it's not looking good.    I'll have another go...

                • 5. Re: Different aggregation on columns and rows in a table
                  Richard Leeke

                  Actually I think it is working "correctly" - just maybe not doing what you would expect. The issue is that the data is sparse - so there are not sales for every type of product on every day. So let's make it really simple and say we had just these 2 orders in January 2012.

                   

                  Date | Category | Product | Quantity

                  1 Jan 2012 | Bookcases | Metal Bookcase | 31

                  2 Jan 2012 | Bookcases | Wooden Bookcase | 31

                   

                  You might think that having sold 31 each of metal and wooden bookcases in January (31 days) the daily average orders for metal bookcases in January is 1, for wooden bookcases is 1 and overall for bookcases is 2 orders per day.

                   

                  But that workbook I just posted would tell you that the daily average for metal bookcases is 31, for wooden bookcases is 31 and overall for all bookcases is also 31. The reason for that is that when broken down by Product there were 31 orders on each of two days and no other records, so the average is 31 orders per day for which there is any data.

                   

                  It would give the answer you expect if there were zero records for every product for every day when there were no orders.

                   

                  I've attached another version with a second sheet which I think now works as expected - but I haven't checked it thoroughly. Note that the date hierarchy is now driven off level of detail.

                   

                  This relies on the way that Tableau pads missing dates and uses some table calculation tricks to turn the padded nulls into zeros for the computation (even though they still show as missing if you expand out the display). As I say, I think it now works, but I only barely understand it and I'm not sure. Definitely safer to pad your data at source if you want to try this approach.

                  1 of 1 people found this helpful
                  • 6. Re: Different aggregation on columns and rows in a table
                    Erling Alfheim

                    Thanks, that's exactly what I wanted.

                    The trick of using the second date at a day level is what did it.

                     

                    I actually wanted the first solution you came up with which ignores the null values, but I see what you did on the second one too and can see where that would be useful.

                     

                    Thank you very much