5 Replies Latest reply on Aug 4, 2016 3:49 PM by Stoyko Kostov

    Level of Detail in Top N filters

    Nick Zhao

      Hi,

       

      I am very new to Level of Detail related concepts.

       

      Here are some of the questions that have been bothering me. (The following examples use the sample superstore database.)

       

      1. In the first worksheet, I am drawing a text table, with Category and Subcategory as Row Dimensions. And my Text Label is showing SUM(sales). Based on what I've learnt from Tableau OnlineHelp, the level of detail in this view should be Category and Subcategory. Now, I drag a filter on Customer Name to the filter shelf. The filter is Top 10 by SUM(sales). My question is: what is this SUM(sales)'s level of detail? I would assume that it's not Category and Subcategory. Is it something like: {fixed [Customer Name]: SUM(sales)}? Or is it first partitioning to Category and Subcategory, then aggregating to Customer Name level?

       

      2. For the kind of Top 10 filters described in 1, (again from what I've learnt from online help), since it's a TOP N filter, it should gets executed before LOD expressions that involve include and exclude. Am I right?

       

      3. The second worksheet is very much like the first one. The only difference is: I built a calculated field "LOD sales" with the value {include: SUM(sales)}. Now instead of filtering Customer Name using Top 10 by SUM(sales), I am now filtering Customer Name using Top 10 by SUM(LOD sales). Here is a part that confuses me. On one hand, this is a TOP N filter. So it should be executed before any other include/exclude LOD expressions. On the other hand, the filter relies on an include LOD expression. So there is no way it can get executed before this particular LOD expression, namely "LOD sales", gets evaluated. So which one is it?

       

      I am attaching the workbook. Thanks!

       

      Nick

        • 1. Re: Level of Detail in Top N filters
          Stoyko Kostov

          Hello Nick,

           

          You can do the following experiments to get more insights on your questions.

           

          1. Change Top 10 to Top 1. Drag Customer Name to Rows. You will see one customer who has the highest total sales, but not the highest sales in some of the categories.

          2. Drag both Sales and LOD Sales to Rows. Remove the filter. You will see that they are identical. The reason for this is that there is extra aggregation (SUM) on top of the partition when you move LOD Sales to Rows, meaning that the end result will be the same regardless of what you use in your filter - LOD sales or Sales.

          3. If you change your LOD Sales definition to something that will be different from Sales - e.g. {FIXED [Category]: SUM([Sales])} - and still use Top 1, you will see that you get a different customer than the one you get if you use Sales.

          4. If you change your LOD Sales definition to include yet another field - e.g. {FIXED [Category], [Sub-Category]: SUM([Sales])} - and still use Top 1, you will see yet another different customer.

          5. Try experimenting with different aggregations, e.g. AVG, to see how your results change.

           

          So, I would say the conclusion is the following:

          1. If you use Sales in your filter, this is identical to INCLUDE LOD, and it doesn't matter which one comes first, as the end result is the same.

          2. If you use a calculated field as your filter, and it is LOD, partitioning will come first.

           

          Let me know if you are still confused, and why, and I'll try to help.

          • 2. Re: Level of Detail in Top N filters
            Nick Zhao

            Hi Stoyko,

             

            First of all, many thanks for your help!

             

            But frankly, I am still very much confused. I am surprised that Tableau doesn't have how it treats the aggregation in Top N filters (have the same level of detail as the view, or fixed, or include) documented somewhere. Even testing out this very simple scenario is stressful enough and the test only confirms what you guessed. Unless you run every single possibility, there is no way to say other guesses (or the possibility you haven't thought of) are wrong!

             

            In fact, shortly after I posted this thread, I ran into a problem involving this. I had a worksheet that involves heavily on include and exclude LOD expressions. Everything worked flawlessly. The chart was showing what I wanted it to show. UNTIL, when I use the same calculated field (working flawlessly to give the graph I desired) as a criterion of Top 10, things went terribly wrong. It's not picking TOP 10! I've tried other methods, such as using index(). But index() was also producing the wrong result. I duplicated the sheet in question as a cross tab sheet. I was looking at a row sitting on the second row, yet index() is saying 53! This had me gone completely crazy. The final FIX was literally to change include and exclude LOD expressions to fixed LOD expression. This was when I realized, probably, TOP N filter is using FIXED LOD expression. BUT, again, while my hypothesis that TOP N filter is using FIXED LOD expression does offer a good explanation to the error in the first place and why it got fixed after I change LOD expressions to FIXED, there is no way to conclude from logic that this is indeed the case. (This is like hypothesizing there is no fish in the water, and you go outside, fish for an hour, find no fish, and thus concluding there is no fish in the water because what you tested collaborated with your hypothesis.)

             

            As for my second and third question, note that my filter is always about a simple plain old dimension (Customer Name in this case). The only thing that is changing, is the criterion for Top 10. Sometimes, it uses other measures; sometimes, it uses include/exclude LOD expressions; sometimes, it uses fixed LOD expressions. I know in some cases it doesn't matter in which order operations are executed. In fact, I have yet to come up with an example that this could matter. BUT, if I do run into a complex scenario like this, I would very much like it if I know it inside out, rather than test  a very complex scenario against a big database on spot.

             

            If my hypothesis that TOP N filter uses fixed LOD expressions is correct, then my second and third question will be solved.

             

            Many thanks again,

             

            Nick

            • 3. Re: Level of Detail in Top N filters
              Stoyko Kostov

              Hi Nick,

               

              Please check this KB article:

               

              Finding the Top N Within a Category | Tableau Software

               

              Quoting the article:

               

              "This type of filter is computed across the entire data source. You can see that the same 5 categories are shown within each region because those are the top 5 selling categories sold across all regions. Instead of computing across the entire data source, you can calculate the Top N within a category by combining fields (i.e., creating a set) and then manually filtering the values. "

               

              The way I understand this is the following - top N filters work on the entire data source, and if you want them to work on separate categories, the article describes what to do.

               

              To conclude - Top N Customers using SUM(Sales) would group by Customer only (not by categories or sub-categories). If you change the aggregation in your top N definition to use LOD, that LOD will be used.

               

              I think this does equate to FIXED LOD indeed, by definition - FIXED LODs ignore all dimensions except the one specified. Top N does precisely that - groups by Customer Name on the entire data set (thus, ignoring other dimensions).

              • 4. Re: Level of Detail in Top N filters
                Nick Zhao

                Hi Stoyko,

                 

                Many thanks for your thoughts and it is a relief to see that you agree this equates to FIXED LOD. (I feel that the world makes sense again.)

                 

                I see that you are an employee of Tableau's. Can you confirm that this is indeed how Top N filters are implemented in Tableau?

                 

                Best,

                 

                Nick

                • 5. Re: Level of Detail in Top N filters
                  Stoyko Kostov

                  I can confirm that they both reduce to grouping by and aggregating the same dimensions, resp. measures.