1 2 Previous Next 22 Replies Latest reply on Jun 8, 2011 2:05 PM by tobyerkson

    Issue with Top N: Chooses wrong last value

      In the attached workbook in the first tab there are two charts.  The top chart is supposed to show the Top 5 Branches for a selected Region that have the most Requests.  The bottom chart simply shows all the Branches for that Region.  As can be seen, the last item shown in the top chart is Singapore and, as can be seen in the bottom chart, that is not the correct value.  It should be Manila or Melbourne (those two have the same Total Requests).  If you go to the last sheet where this dashboard report resides and play with the Filter for "Branch (copy)" you'll see that Singapore is always present and ultimately ends up as #1 when the filter is set to Top 1.  ???


      What's also interesting is that if you click on the Region EMENA then only FOUR Branches display, not FIVE!  ????


      I've been taking the "View Data..." and examining it, dumping crosstab data into Excel, and I can't figure out why Tableau is doing this.  I'm totally baffled.  I'm hoping a fresh set of eyes can help out, maybe I'm missing something obvious since I'm too close to the data?  TIA!

        • 1. Re: Issue with Top N: Chooses wrong last value
          Joe Mako

          It is an order of operation in filtering. You want the Top N filter to run last, so one option is to make all the other filter pills context filters.

          • 2. Re: Issue with Top N: Chooses wrong last value
            Andy Cotgreave

            Nice tooltips on those worksheets, by the way!


            • 3. Re: Issue with Top N: Chooses wrong last value
              Joe Mako

              In addition, you can use a table calc as well, but it requires some special steps, as described at:

              The specified item was not found.


              I applied both methods in the attached workbook, and you can see in the other regions, where your original kept 5, is was not the correct 5.

              • 4. Re: Issue with Top N: Chooses wrong last value

                Joe (and Richard), I'm looking at the thread for The specified item was not found. and I still can't wrap my knoggin around the workings of the PREVIOUS() function.  I'm looking at the "table calc" sheet you provided and I just don't get it.

                • 5. Re: Issue with Top N: Chooses wrong last value

                  Sun of a gun...Joe, thank you soooo much!  I am obviously ignorant of Tableau's filtering system intricacies.  Can you -- or any one else who understands it -- explain in their own words 1) how a Context filter(s) works and 2) the order that filtering follows.  FOR EXAMPLE, (loosely speaking) we know that the order of mathematical operations in parenthesis occur first, then arithmetic operators, then Boolean (AND, OR, NOT) operators.


                  acotgreave, thank you.  I try to add little extras to help make the data more understandable within the limits of what is requested.  BTW, I used your blog post http://www.thedatastudio.co.uk/blog/the-data-studio-blog/andy-cotgreave/tableau-tooltips-conditional-formatting to help me with some formatting of the tooltips for another dashboard I built -- nothing extrodinary but my manager loved it and I gotta few "atta-boy"s outta it ;-)  Thanks man!

                  • 6. Re: Issue with Top N: Chooses wrong last value
                    Joe Mako

                    Simply, PREVIOUS() is a self referencing function, I think of it like a recursive function. It is like a LOOKUP([field],-1), but the [field] is the formula the PREVIOUS() function is a part of. As for the argument passed to the function, in this case 0, this is the value to use as previous when there is no previous. Does that help?

                    • 7. Re: Issue with Top N: Chooses wrong last value
                      Joe Mako

                      As for the order of filter operations, see The specified item was not found.

                      • 8. Re: Issue with Top N: Chooses wrong last value

                        I'll have to be honest Joe, I do not understand.  How does it "know" when to stop looping (exit the recursion)?  The more I look at LOOKUP and PREVIOUS_VALUE the more I'm unsure how they work in Tableau.

                        • 9. Re: Issue with Top N: Chooses wrong last value
                          Joe Mako

                          Partitioning is what tells the calculation to restart.


                          You can see the resources linked to at http://www.tableausoftware.com/about/blog/2011/03/its-table-calcs-thursday for the basics. Be aware that the exact method used for your situation is not covered in these resources, only in the thread linked to earlier.


                          You are also welcome to contact me and we can setup a time to talk about table calculations. joemako [at] gmail [dot] com

                          • 10. Re: Issue with Top N: Chooses wrong last value

                            Thanks Joe.  Regrettable answer on your question for the filtering thread :-(

                            • 11. Re: Issue with Top N: Chooses wrong last value
                              Richard Leeke

                              I'll have a quick go at explaining LOOKUP() and PREVIOUS_VALUE().


                              First some context (I'm sure you understand this, but I think it helps to be explicit).


                              When you refresh a view, the database returns a set of rows to Tableau, as appropriate for the dimensions and measures you have on all of the shelves.  The result set returned to Tableau contains all requested fields with the exception of table calculations (but including simple calculated fields, which are evaluated by the database).


                              For each table calculation, Tableau performs the following steps.


                              1) It divides up the returned rows into a set of "partitions", which are subsets of the rows returned.  This can be anywhere between a single partition covering all returned rows right down to one partition per row.  The rows are sorted as specified in the table calculation partitioning dialog.


                              2) It works through the rows of the partition, evaluating the result of the table calculation.  I'll come back to this and expand on what LOOKUP() and PREVIOUS_VALUE() are doing in a minute.


                              Note that as table calculations can reference the results of other table calculations, Tableau has to be smart about the order in which it evaluates them.  In particular, it is smart enough to notice if you have set up a circle of dependencies between your fields, and tells you as much.



                              In evaluating the results of a table calculation for a particular row, you can reference other fields in several ways.


                              a) If you just reference another field value, that lets you access the result of that field from the row that you are currently processing.


                              b) LOOKUP() lets you reference the results of an arbitrary expression (with some restrictions) in the current row (offset 0) or any other row in the partition (+ve or -ve offsets from the current row).  Note that the expression must be an aggregate expression - but the special aggregate function ATTR() can be useful here for getting individual values if appropriate).


                              c) PREVIOUS_VALUE() lets you access the result of the calculated field you are currently evaluating, from the previous row in the partition.  The fist row doesn't have a previous value, which is where the parameter comes in.  That lets you "bootstrap" the process by defining what number to start with.


                              An example probably helps for c).  You can define a running sum of sales, like this:


                              [RUNNING_SUM_SALES] = PREVIOUS_VALUE(0) + SUM([Sales])


                              So that evaluates just to the sum of sales for the very first row.  For the next row it adds the sum of sales for the second row to the sum from the first, and so on.


                              Does that help?

                              • 12. Re: Issue with Top N: Chooses wrong last value

                                Okay, the RUNNING_AVG() example on page 15 of this document Table Calculations in Tableau 6: A Primer, along with Richard's explanation, helped me (I think!) figure out the PREVIOUS_VALUE() a little bit.  Please see attached Excel document to let me know if I'm on the right track.


                                ...time passes...looking to generate a visual and step-by-step process...


                                Okay, I've added a sample .twbx to further illustrate (prove?) what I did in Excel.  When building the Calculated Field, FOO, I could not use [Sales] by itself, the editor wanted it to be an aggregated item e.g., SUM().  I'm wondering how Tableau knows at what level to apply the aggregation...  Anyway, when I dropped FOO onto the table it automagically turned it into a QTC (Quick Table Calculation).  Interesting.  So maybe PREVIOUS_VALUE() in a Calculated Field is a trigger for Tableau to make it into a QTC?  That'd make sense, so any function that falls under "Table Calculation" prompts this intelligence.  This then tells me that the calculation won't occur until the table is created and, basically, the FOO column is populated last in this example.


                                I can understand what Joe meant when he said, "...I think of it like a recursive function."  In order for Japan to get its value, it has to get the previous value, in this case, from the UK.  But for the UK to get its value it has to go back and get the USA.  So it's calling itself at each backstep until it hits the first row of the partition whereupon it puts the default parameter, zero, into the calculation.  So a good comprehension of recursion is helpful to figuring this out when you attack this from anywhere within the table.


                                On the flip side, if recursion ain't yer thing, then the solution must be attacked beginning from the first row:

                                - The USA is the first row so we use the parameter given, 0, and the sum of sales for the USA, which is 18.  Add the two numbers, 0 + 18, to get a total of 18.  Write down 18 for the FOO column.

                                - Next row, the UK.  Get the previous value we wrote down, 18, and add the sum of the Sales for the UK, which is 17, (thus 18 + 17) to get 35.  Write down 35 for the FOO column.

                                - Next row, Japan.  Get the previous value we wrote down, 35, and add the sum of the Sales for Japan, which is 10, (thus 35 + 10) to get 45.  Write down 45 for the FOO column.


                                That's what I gather is going on.  Correct?

                                • 13. Re: Issue with Top N: Chooses wrong last value
                                  Richard Leeke

                                  1) Correct.


                                  2) I'm not sure what you mean by a variable for the partition.


                                  But looking at your example I see what you are not understanding.


                                  The SUM() is not a sum over the partition, just the sum over the dimensions on the sheet - i.e. the value of an aggregate measure expression returned for that row.  I've added a second sheet to your spreadsheet which should make everything clear (I hope).

                                  • 14. Re: Issue with Top N: Chooses wrong last value
                                    Richard Leeke

                                    So we cross-posted.


                                    Your last posting is all spot-on.  The reason for the difference between the calculations you added and the sheet I added to the excel workbook is essentially that we were partitioning  and aggregating at a different level.  The sheet I added based on your initial sheet assumed each row was returned, and the calculation was partitioned by Location.


                                    In your subsequent posting you had aggregated down to the Location level (so only had one row per location returned to Tableau) and your partition definition is a single partition spanning all rows.

                                    1 2 Previous Next