3 Replies Latest reply on Aug 18, 2016 1:08 PM by Benjamin Greene

    Complicated Tableau Multi-Row Calculation

    Cole Wunderlich

      Hello all,

       

      I am having quite a lot of trouble with a calculation in tableau.  I have two classes in my data, each containing two different sets of data with about 700 sub groups per set.  Each sub group has at least two members.  You could call the classes A and B and the subgroups S1-S700.  Each sub group as at least two members, say M1 and M2, and each class contains two sets of data, say A1 A2 and B1 B2.

       

      The data is formatted with the sample names as column headers (so A1,A2,B1,B2).  The sub groups form the rows, with a identifier for each member (M1,M2,M3,M1,M2,M1,M2,M3,....).  So it looks something like this

      Group MemberTag A1 A2 B1 B2

         1      M1               #    #    #   #

         1      M2               #    #    #   #

         1      M3              #    #    #   #

         3      M1              #    #    #   #

         3      M2             #    #    #   #

         4      M1             #    #    #   #

         4      M2             #    #    #   #

      ......

       

      The calculation I am trying to make is comparing M1_A1 with M2_A2 in each sub group (I also want to make the same calculation for class B).  Specifically, I want to know what % M2_A2 comprises of the two values in question (ie. M2_A2/(M2_A2+M1_A1)*100).   I then want to display this percentage for each group as a bar chart.

       

      I initially tried using LOOKUP(<value>,1) but found that it did not produce accurate results because it was view dependent.  So when I only displayed one group on my graph it worked correctly, but when I filtered the graph so it only showed the calculation for M1 members, the calculations where all off (since the LOOKUP was returning the next M1 in the view and NOT the next entry (M2) in the source data).

      Here is my current code in psuedo form: LOOKUP(MIN([A2]),1)/(LOOKUP(min([A2]),1)+MIN([A1]))*100

       

      Any help would be greatly appreciated.  Ideally, I would like a calculation that is based on the global LOD (ie. source data order) that is not view dependent.

       

      Note, in the attached workbook,  the member tags are 12C, 13C_6, 13C_12  instead of M1, M2, M3.

        • 1. Re: Complicated Tableau Multi-Row Calculation
          Benjamin Greene

          Hey Cole, take a look at the M1_A1, M2_A2, M1_B1, M2_B2, A Solution and B Solution calculated fields I made here and let me know if this at least is headed in the direction you were hoping to go.

          1 of 1 people found this helpful
          • 2. Re: Complicated Tableau Multi-Row Calculation
            Cole Wunderlich

            Thanks, that did the trick!!!  Could you explain the logic behind what you did (and how you knew to do it)?  I am pretty new to tableau and doing more complex calculations has been consistently frustrating so far.

            • 3. Re: Complicated Tableau Multi-Row Calculation
              Benjamin Greene

              Glad it works! And I'd be happy to try to explain what I did. Get comfy and buckle up because I have enough spare time that I am going to provide a pretty comprehensive answer. Or you could just TL;DR your way to the bottom to get a (slightly more) specific answer, but maybe someone else will come across this post someday and find it useful.

               

              So the first thing to understand is which numbers Tableau looks at when you tell it to do a calculation, and where it writes that calculation to. A simple calculation like [A]/[B] is equivalent to making a new column in an Excel spreadsheet whose formula is =A1/B1. That is, every single row in your raw data will get a value for this calculated field and it will be the quotient of [A] and [B] in that row. When you drag this pill into your view, you have the choice to Sum, Average, Count, etc. all of these values within whatever dimensions you have specified in the view.

               

              An aggregate calculation like SUM([A])/SUM([B]) does not work like that. This difference is not immediately obvious (depending on how new you are to Tableau), but understanding it is vital. Rather than writing new values to each row in the data, a calculation like this will take the sum of all the [A] values within whatever dimensions you have specified in the view, and then divide that by the sum of all the [B] values in your dimensions. Notice that when you drag this pill into the view, it automatically has AGG in front of it and does not allow you to Sum, Average or Count, since its formula already is an aggregation.

               

              A firm grasp of everything up to now is 90% of the battle and will allow you to avoid running into those pesky "Can't mix aggregate and non-aggregate" errors, as well as the "Can't aggregate something that is already an aggregate calculation" errors.

               

              Next, we have table calculations, like WINDOW_SUM(SUM([A])/SUM([B])) or any of the grand total calculations. The contents of a table calculation must be aggregate calculations, as Tableau performs these calculations after other calculations and only references the data in the view. In this example, it would sum [A] within your view dimensions, divide that by the sum of [B] within your view dimensions and then sum all of the resulting quotients across all the dimensions.

               

              Finally, we have Level of Detail (LOD) expressions, which were new to Tableau 9. These are by far the most versatile (and awesome) calculations, because they allow you the most control over exactly how and at what specificity you want something to be calculated, AND they return values that can then be further aggregated within the dimensions in your view. To wrap my head around this, I like to think of LOD expressions as performing an aggregate calculation at a specified granularity, then writing that value as a new column into each row of the raw data, like a simple calculation would (this isn't technically true, but I find it's a useful way to think about it). What is important to remember here is that all rows who are grouped together at the specified granularity get the same value for an LOD expression.

               

              Which brings us to your question. You wanted to compare a value in one row of your raw data to a value in a different row of your raw data, which means simple and aggregate calculations won't work. Furthermore, the data in the view was not structured identically to the raw data, which means table calculations (like LOOKUP) won't work. Finally, (whether you realized you were on to something or not) you specifically said "I would like a calculation that is based on the global LOD (ie. source data order)," which was a dead giveaway that an LOD expression was the way to go.

               

              Since the end goal was to divide M1_A1 by M1_A1+M2_A2, we had to figure out a way to get both M1_A1 and M2_A2 into the same row. For M1_A1, I wanted Tableau to find the member tag 12C within each Group Number and return the corresponding A1 Average value. For M2_A2, I wanted Tableau to find the member tag 13C_6 within each Group Number and return the corresponding A2 Average value. That can be done with some simple IF statements:

               

              IF [Member Tag]="12C" THEN [A1 Average] END

              and

              IF [Member Tag]="13C_6" THEN [A2 Average] END

               

              However, this alone will not get M1_A1 and M2_A2 into the same row. This would only write M1_A1 into the 12C rows and be null everywhere else and only write M2_A2 into the 13C_6 rows and be null everywhere else. But since we know that Group Number was the granularity we were focusing on, and we know that Tableau "writes" the same answer for an LOD expression to every row within the specified granularity, we could do the following:

               

              {FIXED [Group Number] : MIN(IF [Member Tag]="12C" THEN [A1 Average] END)}

              and

              {FIXED [Group Number] : MIN(IF [Member Tag]="13C_6" THEN [A2 Average] END)}

              (For details on the syntax check out Level of Detail Expressions.)

               

              Like I said, the contents of LOD expressions need to be aggregate, so I had to throw a MIN out in front of the IF statement. As it turns out, since there was only one 12C and 13C_6 per group, it actually doesn't really matter at all what kind of aggregate function I used.

               

              Once the M1_A1 and M2_A2 values were calculated, all I had to do was the simple [M2_A2]/([M2_A2]+[M1_A1]) calculation (and format the number as a percentage) to arrive at the solution.

               

              Sorry if that was way more detail than you wanted, but I figured I might as well fully flesh out my thought process and hope you found at least some of it helpful. Also, to any Tableau employees or hardcore purists who may read this, I'm sorry if my explanation of any of this is non-canon or misleading.

              1 of 1 people found this helpful