1 Reply Latest reply on Dec 7, 2015 5:34 AM by Ben Neville

    need validation on how i think about calcs

    ben kwok

  for the longest time i have been confused about calculations, despite being able to use them, and recently i found a new way to think about them that made things click for me. my question is: is my logic correct? that's what i'm looking for from the community.


  what's confused me about calculations is that i had a hard time visualizing the connection between the calc and the original data source. that is, until i started thinking of a calculation as adding a new column to my data source that holds the calculation's result - sort of like adding a new column in an excel spreadsheet where the formula i type is copied all the way down. so by this logic, using the superstore set...


      this calc in tableau:

      if [Category] = "Office Supplies" then [Sales] else 0 end


  is equivalent to adding a new column in Excel and copying this formula down for all rows:

  IF(N2="Office Supplies", H2, 0)


      where N2 is Category and H2 is Sales.
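  as a sanity check on that analogy, here's a minimal sketch in Python/pandas with a few made-up rows (the column name OfficeSuppliesSales is just for illustration, not anything from Tableau or Superstore). the new column is evaluated once per row, exactly like a copied-down Excel formula:

  ```python
  import pandas as pd

  # tiny made-up slice of Superstore-style data
  df = pd.DataFrame({
      "Category": ["Office Supplies", "Furniture", "Office Supplies"],
      "Sales": [100.0, 250.0, 50.0],
  })

  # the calc behaves like a new column, evaluated row by row:
  # if [Category] = "Office Supplies" then [Sales] else 0 end
  df["OfficeSuppliesSales"] = df.apply(
      lambda row: row["Sales"] if row["Category"] == "Office Supplies" else 0.0,
      axis=1,
  )
  ```

  every row gets its own result, computed only from that row's values - which is the key property of a non-aggregated calc.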


  i'm probably not doing a good job of explaining why i get confused. it's not something simple like dividing two measures that trips me up, but the logic formulas. since i'm familiar with how calcs are done in Excel, my mind naturally wanted to relate the methods between the two products.

        • 1. Re: need validation on how i think about calcs
          Ben Neville

          Hi Ben - let me see if I can help clarify things a bit.


          Yes - conceptually you are correct (except for aggregate/table/LOD calculations - those can be a different beast). Think of any data manipulation (not visualization/interaction) in Tableau as a data layer between your data and the end user. Tableau reads from your data and calculates results as needed. Sometimes these results are stored in the data source in the event of an extract, and sometimes they sit on top of your data, but either way, they are computed from the source data.


          To help illustrate this, take an extract of your data source. Anything that we can materialize - i.e. anything not aggregated - will be materialized in the row/cell. You can demonstrate this with something like the Superstore dataset, though it takes some effort: it's only around 10,000 rows, so you need to write a complex enough calculation to get a render time of a couple of seconds. Something with a lot of LEFT(), RIGHT(), CONTAINS(), and MID() calls should do the trick. Just make some absurdly complex string that's patched together from all sorts of fields... the goal here is actually slow-ish performance. After extracting, you will notice this loads almost instantly. While the magic of extracts and their inherent performance gains are partially to thank, the other boost is the simple fact that Tableau has materialized a column in your extract containing the output of your ridiculous string calculation. It no longer has to parse these strings on the fly to obtain the result, but merely looks in this column.
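          In pandas terms, the live-vs-extract difference looks something like this sketch (the field names and the PatchedKey calculation are made up purely for illustration; the point is computing once versus re-computing on every read):

          ```python
          import pandas as pd

          # hypothetical Superstore-style fields, invented for this example
          df = pd.DataFrame({
              "Customer Name": ["Aaron Hawkins", "Zuschuss Carroll"],
              "City": ["Seattle", "Houston"],
          })

          # "live" behavior: the string calc is re-evaluated on every query
          def patched_key(row):
              return row["Customer Name"][:3].upper() + "-" + row["City"][-4:]

          # "extract" behavior: evaluate once and materialize the result as a
          # column, so later reads are a plain column lookup instead of
          # re-parsing strings on the fly
          df["PatchedKey"] = df.apply(patched_key, axis=1)
          ```

          After materialization, any query that needs PatchedKey just reads the stored column - which is the performance boost described above, beyond what the extract format itself provides.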


          The same is true of many calculations (not just strings), but only if they are not aggregated. The second you put something like SUM([Sales]) in your calculation, it still behaves like you described, with one difference: it is no longer computed on row-level data. Instead, Tableau aggregates the data to the dimensions in your view - possibly [Category] in your example - creating a sort of temporary table at that level of detail, then computing the measure at that same level. For instance, if the only dimension in your view is [Category], and you have a Profit Ratio calculation of SUM([Profit])/SUM([Sales]), it is calculated at the Category level: Tableau computes the total Profit for the Category in question and the total Sales, then divides the former by the latter.
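          A minimal sketch of that aggregation behavior, with made-up numbers (the values don't matter, only the level of detail): aggregate to the dimensions in the view first, then compute the ratio of the sums - which is not the same as averaging per-row ratios.

          ```python
          import pandas as pd

          # made-up rows; two Office Supplies rows, one Furniture row
          df = pd.DataFrame({
              "Category": ["Office Supplies", "Office Supplies", "Furniture"],
              "Sales": [100.0, 300.0, 500.0],
              "Profit": [20.0, 60.0, 50.0],
          })

          # Tableau-style: roll rows up to the dimension in the view
          # ([Category]), then compute SUM([Profit]) / SUM([Sales]) at
          # that level of detail
          by_cat = df.groupby("Category")[["Profit", "Sales"]].sum()
          by_cat["Profit Ratio"] = by_cat["Profit"] / by_cat["Sales"]
          ```

          Here Office Supplies gets (20 + 60) / (100 + 300) = 0.2, computed from the category totals - the "temporary table" at the view's level of detail described above.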


          If there are still any points of confusion, feel free to let me know, and I will see what I can clarify.
