5 Replies Latest reply on Nov 6, 2017 1:37 PM by Jonathan Drummey

    Difference in using Running_Sum vs Sum + Previous Value(0)

    John Sobczak

      I just realized that both of these expressions will give you a running sum.  Both are also Table calculations.  Can someone explain the differences and pros and cons in each approach?

        • 1. Re: Difference in using Running_Sum vs Sum + Previous Value(0)
          Yuriy Fal

          Hi John,

           

          The results (of the calcs) look the same,

          and the comparative performance is similar.

           

          Here are differences (as of Tableau version 10.0):

           

          1) Reported number of Marks on a view differs slightly.

          Here are screenshots of the same Pill arrangement:

           

          RUNNING_SUM()

          RUNNING_SUM().png

           

          ZN(AGG)+PREVIOUS_VALUE(0)

          ZN(AGG)+PREVIOUS_VALUE(0).png

           

          Jonathan Drummey wrote on this recently in his blog:

          When 576 = 567 = 528 = 456: Counting Marks | Drawing with Numbers

           

          2) The second Computing Layout transaction is evidently faster (by 3 second margin)

          in the case of ZN(AGG)+PREVIOUS_VALUE(0) calc as compare to RUNNING_SUM() one.

           

          So the latter (Sheet 2) is rendered faster than the former (Sheet 1) by 4 second margin.

           

          Please find the attached Performance Recording workbook, as well as the original one (both v10).

           

          Caution: views in Book1 are rendered at about 1 minute each (on a descent Macbook Pro).

          Of course, your mileage may vary.

           

          Yours,

          Yuri

          1 of 1 people found this helpful
          • 2. Re: Difference in using Running_Sum vs Sum + Previous Value(0)
            Jonathan Drummey

            Cool testing, Yuri! I'm guessing that the difference in the marks count is due to the ZN() populating some padded marks with 0.

             

            I'm not sure I'd read too much into the 3 second difference unless there were multiple tests under the exact same conditions, that's ~5% difference and that could be due to caching, some other process on the machine that is temporarily grabbing memory/CPU, etc. In my past testing I've regularly found differences of near that magnitude from run to run on longer runs.

             

            John, you'd asked about the differences and pros and cons. Personally I prefer using RUNNING_SUM() for the following reasons:

             

            a) Running Sum is available as a quick table calculation in 3 clicks, vs. several clicks and a bunch of typing for the formula.

            b) I'm very concerned about the Tableau skill level required to maintain what I build. PREVIOUS_VALUE() is a difficult function for most people to comprehend since it works differently from every other function in Tableau (it's Tableau's only self-referencing function).

            c) As an addition to argument b), ZN([agg]) + PREVIOUS_VALUE(0) requires knowing to give PREVIOUS_VALUE() a dummy argument of the proper data type and wrapping the aggregate in a ZN() to stop any Nulls the might creep in from data densification.

            d) RUNNING_SUM() makes clear what the calculation is doing in the function name whereas we need to parse out what the PREVIOUS_VALUE() version is doing.

            e) Given all of the above, even if the 3 second's performance increase for using PREVIOUS_VALUE() that was noted holds that's 3 seconds for 7.3 *million* marks = 0.0000004 seconds per mark in extra rendering time for using RUNNING_SUM()...I think the other advantages are worth the penalty.

             

            FYI, there's also a third option to do a running sum:

             

            WINDOW_SUM([agg], FIRST(), 0)

             

            Since FIRST() provides an offset to the first address aka row in the partition based on the current address then the results of the WINDOW_SUM() are changing for each mark.

             

            Jonathan

            3 of 3 people found this helpful
            • 3. Re: Difference in using Running_Sum vs Sum + Previous Value(0)
              Yuriy Fal

              Testing the 3rd option -- 

              WINDOW_SUM(AGG,FIRST(),0) --

              60 minutes and counting :-)

              Looks like Tableau is doing multi-pass here.

               

              So to conclude, both RUNNING_SUM(SUM(AGG))

              and ZN(SUM(AGG)) + PREVIOUS_VALUE(0)

              are equally performant calcs (single-pass ones).

               

              Use at your own disposal.

               

              Yours,

              Yuri

              3 of 3 people found this helpful
              • 4. Re: Difference in using Running_Sum vs Sum + Previous Value(0)
                John Sobczak

                Jonathan Drummey & Yuri Fal  you guys are awesome.  Thanks for the detailed insight!

                • 5. Re: Difference in using Running_Sum vs Sum + Previous Value(0)
                  Jonathan Drummey

                  Returning to an old thread...yes, WINDOW_SUM(AGG, FIRST(), 0) is slow due to being multi-pass. FIRST() is returning a different result for every mark and therefore the WINDOW_SUM() is being recomputed for each mark.