TOTAL() vs WINDOW_SUM()

Version 7

    [Updated 2019-05-31 by Jonathan Drummey to include all comments to date with some new material. Document originally by Patrick Van Der Hyde with help from Dan Huff. Also thanks to Joe Mako.]

     

    It's fairly common for the terms "total" and "sum" to be used interchangeably, and Tableau has two functions TOTAL() and WINDOW_SUM() that can look like they are doing the same thing:

     

    Screen Shot 2019-05-31 at 9.26.09 AM.png

     

    However there are differences between TOTAL() and WINDOW_SUM(), and one of the more common places we run into it is when we use a non-additive aggregation like count distinct, median, or percentile. Here's an example of that, the TOTAL(COUNTD([City])) is returning 1523 while the WINDOW_SUM(COUNTD([City])) is returning 1727:

     

    Screen Shot 2019-05-31 at 9.35.14 AM.png

     

    There are three parts to explain the difference: the data, the type of calculation (TOTAL vs. WINDOW_SUM), and the compute using (aka addressing) of the calculations. I'll start with the compute using, then go back to the data, then the type of calculation.

     

    The compute using of both of the table calculations is Table (Down) which means that they are computing along (or addressing) all of the dimensions on Rows (State in this case). There are no other dimensions in the view so the calculations won't be partitioned (or restarted) on anything else. Therefore in this view TOTAL() will return the same value for every mark, and the way the WINDOW_SUM() is set up (without any offsets) it will also return the same value for every mark.

     

    Now to talk about the data...in this data set there are 1,523 distinct values of City:

     

    Screen Shot 2019-05-31 at 9.37.28 AM.png

     

    So in the earlier view the TOTAL(COUNTD([City])) is counting the number of distinct values of city across the whole data set and returning that same 1,523 number. Another way to think about what TOTAL() is doing in relation to the dimensions used for the compute using is that TOTAL() is going to "ignore" the dimension(s) used for addressing and instead compute the given measure across all the records inside the particular partition. In the earlier view TOTAL() has a Compute Using of State so it "ignores" State (the only dimension in the viz) and computes the COUNTD([City]) across the entire data source.

     

    Here's another example to make this more clear. The same Total calculation is used in a view with State and City, only the Total calculation has a compute using on the City (therefore partitioning on State). The COUNTD([City]) is now computed for each State while returning the value to each mark:

     

    Screen Shot 2019-05-31 at 10.02.06 AM.png

     

    Now for the WINDOW_SUM(). The way to think about the WINDOW calculations is that they operate on the marks in the view (before any table calculation filters are applied) and they perform the given aggregation defined by the function i.e. WINDOW_SUM, WINDOW_COUNT, WINDOW_MIN, etc. on the inner aggregation. So in this case we've got a WINDOW_SUM(COUNTD([City])). Tableau is going to compute the COUNTD([City]) at the viz level of detail (i.e. based on the dimensions in the view) for each mark and then compute the WINDOW_SUM() across those marks based on the compute using of the calculation. So in this view from earlier the WS (short for Window Sum) CountD of City is adding up 1, 1, 1, 22, 1, 1, 1, 23, etc. values of the COUNTD([City]) and the sum of all of those adds up to 1,727:

     

    Screen Shot 2019-05-31 at 9.35.14 AM.png

     

     

    So here's a shorthand for thinking about the differences between TOTAL() and WINDOW_SUM()

     

    • TOTAL() is going to use the inner aggregation and compute that at the level of detail determined by the dimensions in the view and the compute using of the calculation, i.e. "an aggregate at a different level"
    • WINDOW_SUM(), WINDOW_MEDIAN(), WINDOW_MIN(), etc. are aggregations (indicated by the _SUM, _MEDIAN, etc.) of the inner aggregation, i.e. "an aggregate of an aggregate"

     

    There are a number of other features in Tableau that act like TOTAL() or WINDOW_SUM() or both, here are the main examples:

     

    • Totals: Tableau's default Automatic grand totals and sub totals act like a TOTAL() calculation, whereas the Total Using-> Sum makes the grand total behave like a WINDOW_SUM(). For more information see the posts at Grand Totals and Subtotals | Drawing with Numbers
    • Reference Lines/Bands/Distributions: Tableau's reference lines & bands & distributions support a variety of aggregations. The Total aggregation acts like a TOTAL() calculation, whereas the Sum acts like a WINDOW_SUM(). This is an important distinction for ratio calculations like Profit Ratio SUM([Profit])/SUM([Sales]) or a Positivity Rate SUM([Positive Cases])/SUM([Tested Cases]) where a Total reference line would be the "population result" or whereas an Average reference line would be the average of the displayed values (i.e. not computed on the underlying population).
    • Level of Detail (LOD) Expressions: We can use FIXED or EXCLUDE LOD expressions in calculated fields to get the same numbers as a TOTAL() calculated field, and nest LOD expressions to get the same results as a WINDOW_SUM().
    • Data Blending: Through choosing the dimensions used as linking dimensions we can generate results that are like a TOTAL() calculated field with a regular aggregate coming from a data blend. Before LOD expressions existed this was a useful way to do aggregations at a higher level without having to resort to custom SQL or other pre-aggregation.
    • Title, Caption, Summary Card, Status Bar, and tooltips when multiple marks are selected - In all of these Tableau will aggregate the marks like a WINDOW_ calculation.

     

    Knowing how these features work can really speed up one's use of Tableau. For example when asked to compute the average of the daily sum of sales for Sundays we could write LOD expressions or table calculations, or get the answer in 6 clicks:

     

    foo.png

     

     

    There are a variety of other differences between how the TOTAL() and WINDOW_SUM() functions work:

     

    • WINDOW_ functions can take offsets, like WINDOW_SUM(SUM([Sales]), -2, 0) to create a moving sum of the current and prior 2 values of SUM([Sales]). TOTAL() calculations do not. Therefore the results of WINDOW_ functions can change based on the sort order of the dimension(s) used for the compute using, whereas TOTAL() calculations will not change based on the sort order.
    • Because of the offsets WINDOW_ calculations can be separately computed for every single mark, whereas TOTAL() calcs are computed once for each partition. This can create performance differences where TOTAL() can be faster (sometimes a lot faster) than WINDOW_SUM().
    • Tableau does as much as it can to optimize queries, make use of local caches, etc. for highest performance. WINDOW_ calculations won't generate extra queries to the data source, whereas TOTAL() calculations using non-additive aggregates like median, count distinct, and percentile can generate extra queries in order to return accurate results and therefore potentially affect performance. (This statement about extra queries also applies to non-additive aggregates in both totals using the Automatic behavior and reference lines using the Total aggregation). Here's an example:

      bar.png

      Related to this point, note that in some Tableau materials you'll see "Table calculations are computed locally". There are two interpretations of this statement; the one that is used in this case is that "Tableau is computing table calculations based on the query(ies) to the data source." Another interpretation is "Table calculations don't require any extra queries to the data source", which in the case of TOTAL() can be inaccurate in those situations like the % of Total, 2 queries view above where the % of Total computing across all of the data is actually causing an extra query to be generated.

    • TOTAL() cannot be computed on a blended aggregate calc (for example one that is summing from two different data sources), whereas WINDOW_ calculations can.
    • TOTAL() won't trigger data densification behaviors on its own, whereas WINDOW_ calculations can. Here's an example:
      Densification.png
    • You can nest the results of TOTAL() as a child inside another table calculation, however you can't go the other way. In other words, you can do WINDOW_SUM([table calc using TOTAL()] ) but not TOTAL([table calc using WINDOW_SUM()]).
    • TOTAL() can't be computed on a measure using ATTR(). This is because ATTR() is computed inside Tableau, whereas the measure for TOTAL() is typically being computed in the data source, or at least in the pipeline before ATTR()'ed results are available.

     

    One additional source of confusion that occasionally comes up is when we're using a % of Total inside another calculation on a user-defined aggregation. In this view the first measure is COUNTD([Category]) and each Customer has between 1 and 3 categories. Then using the Quick Table Calculation Running Total with a secondary % of Total to do a % of Total Running Total we end up with numbers between 67 and 500%, instead of the typically expected 0-100% range:

     

    Screen Shot 2019-05-31 at 11.11.48 AM.png

     

    In this case our starting expectation might be that we've got a sum of 15 distinct counts so the first mark should be 2/15 = 13.33%, the second 3/15 = 20.00%, and so on. Here's a view with additional measures to explain what is going on:

     

    Screen Shot 2019-05-31 at 11.12.44 AM.png

     

    The first two measures repeat from the earlier view. The Running Sum of Countd Category and Total Countd of Category are the two measures that Tableau is using inside the % of Total Running Sum quick table calculation. The Running Sum is pretty clearly adding the current mark's CountD to the sum of prior marks and accumulating values throughout the viz. The Total Countd is doing TOTAL(COUNTD([Category])) across all of the data, though, and there are only 3 categories in the data. Therefore the % of Total Running Sum that is the Running Sumof CountD/Total CountD is returning 2/3 = 66.67% for the first mark, 3/3 = 100.00% for the second mark, and so on.

     

    In this case our starting expectation is that the denominator would be the sum of the count distincts, i.e. an aggregation of an aggregation that would add up to 15, which is what the WINDOW functions do. The Window Sum of Countd has the formula WINDOW_SUM(COUNTD([Category])) and returns the desired 15. Then the % of Total RS/WS calculation is the Running Sum of CountD/Window Sum of Countd and that returns the desired 2/15 = 13.33%, 3/15 = 20.00%, and so on.

     

    A Tableau v10.5 workbook is attached with all the screenshots used in this document.