5 Replies Latest reply on Jun 12, 2012 9:14 AM by Jonathan Drummey

    calculate inverse sequence number for filtering

    Michael Platzer

      Let's say we collect lots of ordered data, but want to report only on the latest x items within each group. E.g. what is the average value of the last 3 records of each group for the following data?

       

      GroupTimeValue
      A01/06/2012 10:5035
      A01/06/2012 11:389
      A01/06/2012 12:121
      A01/06/2012 12:3848
      A01/06/2012 17:0883
      B01/06/2012 10:5075
      B01/06/2012 11:3442
      B01/06/2012 12:0436
      B01/06/2012 13:509

       

      This can be solved by pre-calculating inverse sequence numbers within the data source, and then filter to InvSeq<=3:

       

      GroupTimeValueInvSeq
      A01/06/2012 10:50355
      A01/06/2012 11:3894
      A01/06/2012 12:1213
      A01/06/2012 12:38482
      A01/06/2012 17:08831
      B01/06/2012 10:50754
      B01/06/2012 11:34423
      B01/06/2012 12:04362
      B01/06/2012 13:5091

       

      However, this approach prevents us to use incremental extracts, as with each new data point the InvSeq would need to be re-calculated for the whole data set. With the data volumes at hand, we need to use incremental extracts, that's why I am looking for an (efficient) way to have these inverse sequence numbers calculated in Tableau.

       

      thanks for any advice, my search on the forum so far hasn't revealed anything yet,

      best

        michael

        • 1. Re: calculate inverse sequence number for filtering
          sirajmandayippurath

          Hi Michael,

           

          Have you tried using "SIZE() - INDEX() + 1" to work out a reverse index?

           

          Also, are you likely to have datetime instances which are valid for Group A and not for Group B like the example dataset above?

           

          Regards

          Siraj

          • 2. Re: calculate inverse sequence number for filtering
            Michael Platzer

            Unfortunately that wouldn't work, since I want to have in the end the aggregted table / viz (without individual records on any shelf). This table would be the target output:

            GroupAVG(Value)
            A44
            B29

             

            And yes, we've got different time instances for each group. Note, the above tables are very stripped down versions of this problem: We collect user feedback for our services, and want to (efficiently) report the average given rating for the last 400 responses given for each service & each country.

            • 3. Re: calculate inverse sequence number for filtering
              Jonathan Drummey

              Hi Michael,

               

              Attached is a sample workbook using the data you provided. It makes use of two table calculations. Here's what I did:

               

              Created a Time String with the formula STR([Time]). This is to get around Tableau applying too much padding to dates when using table calculations (much of the time, this is really useful, but not here). The field is set up with a default sort of Time/Min/Ascending, so it will properly sort.

               

              Created a calculated field called Index with the formula INDEX().

               

              Created the Avg of last 3 results calculated field with the following formula:

              IF FIRST()==0 THEN

                  WINDOW_AVG(IF [Index] <= 3 THEN SUM([Value]) END, 0, IIF(FIRST()==0,LAST(),0))

              END

               

              The IF FIRST()... and IIF(FIRST()... statements use a technique developed by Richard Leeke to speed up table calculations and reduce overlapping text by reducing the number of times they are performed. The IF [Index] inside the WINDOW_AVG makes sure that we're only performing the average across the 3 rows.

               

              You then create a view by having Group on Rows, Time String on the Level of Detail, and the Avg of last 3 results on the Text Shelf. The Avg of last 3 results then has very specific Compute using settings (that you get to by right-clicking on the green pill and choosing Edit Table Calculation...) since it is a nested table calcuation:

               

              Avg of last 3 results - Compute Using Time String.

              Index - Compute Using Advanced... with Time String on the right-hand (Compute Using) window, and the Sort set to Time/Min/Descending.

              Screen Shot 2012-06-08 at 11.42.18 AM.PNG

               

              Does this work for you?

               

              Jonathan

              1 of 1 people found this helpful
              • 4. Re: calculate inverse sequence number for filtering
                Michael Platzer

                Thanks Jonathan, that was a real helpful answer. Thanks to your reply, I learned about the trick to split data further by putting variables on 'level of detail' without adding to the view, plus about the speed optimization of redundant table calculations. Many thanks for taking time for trying to help me.

                 

                However, in most of our use cases, this approach is not feasible, since performance degrades too badly. E.g. I've got 650k ratings across 1250 products, and want to calculate average ratings for 'last 400' records for each of the 1250 products. If the inverse rank is pre-calculated in the data source, and as such simply part of the data extract, the calculation of these 1250 scores takes less than 1 second. When using your approach, I need the record ID to be on Level of Details Shelf, thus have 650k marks and calculation takes over 90seconds.

                 

                I guess, what I was after, was some sort of window-functions which Tableau can pre-calculate when pulling the data extract, and that is then used as a filter. Similar to these rank/partition/over constructs in SQL:

                   RANK() OVER (PARTITION BY product ORDER BY datetime DESC)

                Tableau would then still need to update this calculation for all records, when new data comes in, but we could be using incremental refreshs instead.

                 

                To me it seems that this is a missing feature of Tableau.

                 

                But many thanks Jonathan for your help! big kudos!!

                • 5. Re: calculate inverse sequence number for filtering
                  Jonathan Drummey

                  Hi Michael,

                   

                  Yes, table calcs are slow. Part of that is just needing to move lots of data off the disk (and possibly over the network) into memory to run a bunch of calculations that can't be sped up that much, part of that is that table calcs could be optimized. If you look up ranking and percentile rank in particular on the forums there are a number of threads about ways to integrate table calcs and SQL.

                   

                  Given your data volumes, doing the "heavy lifting" in the DB makes sense to me. As for your idea of updating all records in the extract for some calcs, while incrementally adding data, I think it's an interesting one, can you add it to the Ideas section? One thing to know, though, is that Tableau does not materialize the results of table calculations in the extract, only in the view. You can see this in action  if you create a table calc and do View Data on the data source, any table calcs won't be there. So I imagine this kind of feature could be a bigger deal to implement given that the only way to do rank calcs totally within Tableau is to use table calcs, so there'd also need to be a way to usefully materialize table calcs in the extract.

                   

                  Jonathan