8 Replies Latest reply on Oct 29, 2015 8:31 AM by Rody Zakovich

    INCLUDE/EXCLUDE LoDs are not vizLoD aware when used in a Computed Set?

    Rody Zakovich

      Hey Community!

       

      *Edit* I changed the title of the Question to more closely match the context of the thread. The previous "Using LoDs in SETS Don't Actually Do Anything?"....was not my actual question. Hopefully this will help others who are searching for a similar question.

       

      Sorry I haven't been answering away like normal.....busy week coming back from the conference/I have been working on finally getting a blog written.

       

      As a part of this blog, I am trying to really breakdown Tableau. The first part of this is trying to understand Tableau as a series of Levels. We interact with the values at each level differently.

       

      1. The Main Datasourse (Your DB, Excel Sheet, etc)

      2. The Effective Datasource (Your DB/Excel Sheet + Row Level Calcs [Not including LoDs] )

           - If you are using an EXTRACT (Optimized) the Columns created with Row Level Calcs are passed back to the DB as a part of          the EXTRACT Query (This is why LoDs are not included in this layer).

      3. The Tableau Datasource (Your DB/Excel Sheet + Workbook Specific Fields [GROUPS] + Row Level Calcs + FIXED LoDs [FIXED are not vizLoD Dependent])

           - INCLUDE/EXCLUDE LoDs are different as the live in the both The Tableau Datasource AND the Computed Datasource, i.e they      must be computed by the vizLoD, but the values still exist at a Row Level

      4. The Computed Datasource (Aggregations [This would include BLENDS with defined relationships] and Window Calculations)

           - The values generated are vizLoD dependent. The value generated from SUM([Sales]) is dependent on how the viz is      partitioned.

       

      Off in another world but still very much a factor in the Computed Datasource.....Tableau's Range Awareness - The Values generated in accordance with desensification/padding/etc

       

      Where I am stuck.....where do computed SETS live?

       

      When using Computed SETS, LoD functions don't actually do anything. Using SUM({ INCLUDE : SUM([Sales]) }) Will result in the same SET as SUM([Sales]).

       

      10-28-2015 10-49-26 AM.png

       

      10-28-2015 10-48-51 AM.png

       

      My initial thought is COMPUTED SETS are not vizLoD Dependent. They are calculated PRIOR to the vizLoD. Which is why a SET computed from SUM({ INCLUDE : SUM([Sales]) }) is the same as SUM([Sales]). The only thing that can affect a COMPUTED SET are CONTEXT Filters (However, I am still defining how CONTEXT Filters are different than other filters and fields in the vizLoD). That said, even when we use CONTEXT Filters, the LoD and Regular Aggregate still compute the same sets, they are just created based on the values within CONTEXT.

       

      If this is true, then Computed SETS' live within the Tableau Datasource, and not the Computed Datasource (As I previously thought). This makes sense since we can use SETS Row Level...i.e

       

      IF [includeCustSet] THEN 1 ELSE 0 END

       

      10-28-2015 11-18-14 AM.png

       

      Further clarification, or additional details on this subject would be highly appreciated!

       

      Jonathan Drummey or Simon Runc do you have some thoughts on this subject?

       

      Regards,

      Rody

        • 1. Re: Using LoDs in Computed SETS doesn't actually do anything?
          Mark Bradbourne

          I need to go back and check, but I could have sworn in one of the session they used LODs in a Set during TC15. If I find it I'll post the info.

           

          Nope, I was wrong... everything blended together.

          • 2. Re: Using LoDs in Computed SETS doesn't actually do anything?
            Alfredo Pirrone

            Rody

             

            Maybe you developed a bad example to analyze your case. In what context SUM({ INCLUDE : SUM([Sales]) }) is different than SUM([Sales])?

            Adding is an associative operation. No matter how you aggregate your leaf data values, the total sum will always be the same.

             

            I propose you try a different formula: say

            AVG(Sales) vs SUM({ INCLUDE [Category]: AVG([Sales]) })

             

            This formula may not have much logical sense, but will do the work to test functionality.

             

            Using your same data, this is what you get:

            Screen Shot 2015-10-28 at 11.16.56 PM.png

            … and reproducing your image with these new sets:

            Screen Shot 2015-10-28 at 11.41.28 PM.png

            As you can see, using the LOD insight a SET actually seems to work fine.

             

            Your statement: "Using SUM({ INCLUDE : SUM([Sales]) }) Will result in the same SET as SUM([Sales])." recognizes the fact that these expressions are equivalent.

             

            Hope this helps.

            Yours,

            Alfredo

            1 of 1 people found this helpful
            • 3. Re: Using LoDs in Computed SETS doesn't actually do anything?
              Rody Zakovich

              Hello Alfredo,

               

              Thank you for your response, as it did give me more context on what I am seeing.

               

              Alfredo Pirrone wrote:


              In what context SUM({ INCLUDE : SUM([Sales]) }) is different than SUM([Sales])?

               

              Your statement: "Using SUM({ INCLUDE : SUM([Sales]) }) Will result in the same SET as SUM([Sales])." recognizes the fact that these expressions are equivalent.

               

               

              Essentially....nothing, other than what Tableau is doing behind the scenes. When nothing is addressed in an INCLUDE/EXCLUDE statement, you are going to get the same value within the viz. However, those values are returned back to The Tableau Datasource, Row Level. I.E I can use an { INCLUDE : } Statement to create a row level aggregate that is vizLoD aware.

               

              Like Jonathan Drummey demonstrated here

               

              LOD Expression Remix – Finding a Dimension at a Lower Level | Drawing with Numbers

               

              "This uses a little trick in LOD calcs where by not including any specific dimensions we can force an aggregation at the level of detail of the viz (i.e. State) and still return that as a record-level result"

               

              Essentially what I was trying to see is if INCLUDE/EXCLUDE LoDs inside of SETS are vizLoD aware, which they appear not to be.


              Alfredo Pirrone wrote:

               

              I propose you try a different formula: say

              AVG(Sales) vs SUM({ INCLUDE [Category]: AVG([Sales]) })

               

              This, example does give me more insight on how to use LoDs in Sets, but not in the way I previously thought. This example, though, is bound to not create the same set, because these two equations do not result the same.

               

              10-29-2015 7-53-42 AM.png

               

              Which I believe was your point.

               

              When using COMPUTED SETS....

               

              SUM( { INCLUDE [Category] : AVG([Sales]) }) is essentially the same as doing  ATTR([SUM({ FIXED [Customer Name]], [Category]] : AVG([Sales]]) })])  regardless of what other fields are in the vizLoD.

               

              This is because a COMPUTED SET does not take into consideration the vizLoD, but it does take into consideration the fields being ADDRESSED within the INCLUDE/EXCLUDE Statement. So the fields addressed in the INCLUDE/EXCLUDE statement are aggregated to the SET Level ([Customer Name]) regardless of any other field in the vizLoD. Which is why it is equivalent to using a FIXED LoD where we address both [Customer Name]  and [Category]

               

              10-29-2015 8-07-31 AM.png

               

              10-29-2015 8-37-50 AM.png

               

              Adding a lower level dimension......

               

              10-29-2015 8-38-33 AM.png

               

              If SETS were vizLoD aware, the In/Out would recalculate to OUT, because SUM( { INCLUDE [Category] : AVG([Sales])}) < 800 would be $886.2 instead of $508.5.

               

              I have no idea why I thought SETS would recompute using INCLUDE/EXCLUDE LoDs, but it is good to know moving forward.

               

              Essentially, computed SETS happen at the Same level as FIXED LoDs. (Which makes sense since CONTEXT filters affect both the same, but regular filters don't). They leave within the Tableau Datasource, not the Computed Datasource as I had previously assumed.

               

              Anyway, thank you again for your examples! They were very beneficial in helping clarify things.

               

              If anyone else has more examples or if my definitions are completely off, please share!

               

              Best regards,

              Rody

              • 4. Re: Using LoDs in Computed SETS doesn't actually do anything?
                Jonathan Drummey

                Hi Rody,

                 

                You wrote:

                Essentially, computed SETS happen at the Same level as FIXED LoDs. (Which makes sense since CONTEXT filters affect both the same, but regular filters don't). They leave within the Tableau Datasource, not the Computed Datasource as I had previously assumed.

                 

                I haven't tested all the edge cases I can think of but as a general rule this is the case. Top & Conditional filters, computed Sets and FIXED LODs are typically (but not always) computed as subqueries that are inner joined back to the main query. The GROUP BY of the subquery is the dimension used for the Set or the dimension declaration of the FIXED LOD expression, and the only WHERE clauses that affect them are what's coming from context filters & data source filters (and of course extract filters).

                 

                I like your description of data "sources", I've been using a simpler terminology but you rightly point out that LODs make things more complicated. However, I think of it a little differently everything because things like Sets and LODs are returning record-level results that can be used in computing extract filters, etc. so it's hard to put them at a single "place" in the order of operations. In addition, Sets can be used inside LODs and vice versa to make life even more interesting.

                 

                The way I think of it is that Tableau can be computing at a number of grains of data essentially "all at once" from the user point of view, then the order of operations affects when X is computed relative to Y.

                 

                - raw data source - The original raw data (prior to any joins)

                - Tableau data source - includes joins, extract filters, data source filters, record level calculations including Sets and LODs. Essentially this is what we see in the Dimensions & Measures window or when we use the view data option (which will also calculate aggregate measures at a record level, that's not always meaningful).

                - Context filters (which can included computed Sets, LODs, Top & Conditional filters, etc.) create a new context of the Tableau data source, besides reducing the data set this can can effectively change the grain of the data, relationships between fields, and relative sparseness/completeness of the domains.

                - Top & conditional filters each have their own grain.

                - Computed Sets & LODs each have their own grain.

                - Each data blend can introduce an additional grain for computation that is generally invisible to the user except when we're using a non-additive aggregate and the grain of the blend is finer than the grain of the view or some other situations when Tableau will change COUNTD/MEDIAN/etc. pill to red.

                - Since densification occurs before blending the domains of blending dimension(s) can radically change.

                - View/Marks Card record-level - what the data for a view (or a particular Marks Card) "looks like" at a record level, so this is after any dimension filters or other filters higher up in the priority list have been applied

                - View/Marks Card aggregate (primary) - What the view (or a particular Marks Card) displays after aggregation, including aggregate filters. This is only for measures that can be computed entirely in the primary source and when blending on a finer grain than the view is not changing the query that Tableau is issuing.

                - Since densification occurs before blending the domains of blending dimension(s) can radically change.

                - View/Marks Card aggregate (blending) - What the view (or a particular Marks Card) displays after blending, including aggregate filters, blended calcs, replication of data from a blend when the blend is at a coarser grain than the view, etc.

                - Grand totals

                - Table calcs

                - Trend lines

                - Reference lines

                 

                Here's a crazy example: A single LOD expression could be separately computed 6 separate times for one view:

                 

                - inside an extract filter

                - as a data source filter

                - as a context filter

                - to make the list of a quick filter with only relevant values turned on

                - to build the view

                - in the computation for a grand total

                 

                If the view is using a DB2 data blend that could add an additional couple of computations as well.

                 

                Jonathan

                • 5. Re: Using LoDs in Computed SETS doesn't actually do anything?
                  Alexander Mou

                  This is discussion of the order of operations all over again.

                   

                  Can we take LOD out of the main flow? It is a subquery anyway. This way we

                  can have a better view of the main flow. I tend to isolate LOD as a special

                  parameter.

                   

                  On Thursday, October 29, 2015, Jonathan Drummey <

                  2 of 2 people found this helpful
                  • 6. Re: Using LoDs in Computed SETS doesn't actually do anything?
                    Alexander Mou

                    To paraphrase, LOD is a side calculation for the service of the main flow.

                     

                    On Thursday, October 29, 2015, Alexander Mou <alexandermou2000@gmail.com>

                    1 of 1 people found this helpful
                    • 7. Re: Using LoDs in Computed SETS doesn't actually do anything?
                      Jonathan Drummey

                      Hi Alexander,

                       

                      I agree that this thread is very much addressing the order of ops and it's not quite the same thing as the other thread because it's also bringing in the grain that each computation is happening at.

                       

                      As one bit of extra info, LODs are not always subqueries. What I've seen in the logs is that if the vizLOD matches the grain of the LOD then Tableau is smart enough to not use a subquery and will instead compute the LOD as part of the "main" query.

                       

                      Taking a step back, I believe what those of us who are into this kind of thing are trying to do is create mental models that match enough of how and what Tableau does to be useful for us (and maybe others). I'm reminded of the quote from George Box: "All models are wrong. Some are useful." So a model of the order of operations that keeps LODs off to the side may be perfect for some.

                       

                      I've been thinking about how to document this for awhile, I've got a giant graphic I drafted (before LODs and other learning that broke it) that documents over 100 separate steps in multiple paths through Tableau's pipeline. I'm amazed at how much Tableau can do in that instant between dropping a pill and seeing the view refresh. Right now my own take on modeling the order of operations is to use three levels of detail:

                       

                      - Highest level - it's about 10 steps with extract filters, data source filters, context filters, sets & fixed LODs & top & conditional filters, dimension filters, measures, table calc filters, hide, reference lines.

                      - Detailed - breaks that down more to particularly add details about densification, data blending, table calcs & grand totals, etc.

                      - Tableau geek heaven - the 100+ steps that includes mapping, z-order of the display, etc.

                       

                      That's what works for me, that might not work for others.

                       

                      Jonathan

                      1 of 1 people found this helpful
                      • 8. Re: Using LoDs in Computed SETS doesn't actually do anything?
                        Rody Zakovich

                        Hey Jonathan and Alexander,

                         

                        Thank you for providing your input! It is always very much appreciated.

                         

                        Alexander, to you point

                             This is very similar to the Order of Operations, but as Jonathan stated, I am trying to decipher at which step computations are being evaluated within it. I think of the Order of Operations as a separate entity (Like a universal law) as I discuss here...Re: How Do You See Tableau's Canvas?

                         

                        This model, even though I only created it a few weeks ago, has already changed. But the fundamentals behind it are still the same.

                         

                        As Tableau adds new features (Such as LoDs), the model becomes ever more complex. I think Kettan might say this is happening because Tableau is moving away from its' "Butter Zone". I am interested to see IF/HOW Filters (Order of Operations) will work when VIZ in Tooltip gets released. For a later discussion (When it is released)....is the VIZ in Tooltip apart of the Worksheet object? Dashboard Object? Or Something completely separate?

                         

                        Jonathan,

                         

                        Jonathan Drummey wrote:

                         

                        Taking a step back, I believe what those of us who are into this kind of thing are trying to do is create mental models that match enough of how and what Tableau does to be useful for us (and maybe others). I'm reminded of the quote from George Box: "All models are wrong. Some are useful." So a model of the order of operations that keeps LODs off to the side may be perfect for some.

                         

                        I couldn't agree more with this statement! I, by no means, think that my mental model of Tableau IS the correct model (Because all models are wrong). I am just trying to clearly define my model so that I can share it with others. I don't expect it to help everyone, but hopefully it will help some, which is my aim.

                         

                        All this being said, as I breakdown Tableau lower and lower, I am starting to get to this point....

                         

                        Jonathan Drummey wrote:

                         

                        - Tableau geek heaven - the 100+ steps that includes mapping, z-order of the display, etc.

                         

                        Which I personally LOVE! But I am trying to get it a little simpler so that I can share it with others.

                         

                        Sorry for the tangent, but back to the point....

                         

                        Jonathan Drummey wrote:

                         

                        Top & Conditional filters, computed Sets and FIXED LODs are typically (but not always) computed as subqueries that are inner joined back to the main query. The GROUP BY of the subquery is the dimension used for the Set or the dimension declaration of the FIXED LOD expression, and the only WHERE clauses that affect them are what's coming from context filters & data source filters (and of course extract filters).

                         

                        It does appear that  INCLUDE/EXCLUDE LoD within a computed set, OR Top & Conditional Filters are not vizLoD aware (So NCLUDE/EXCLUDE does not ALWAYS include the vizLoD). They only compute based on the fields in the dimension declaration, and of course, the field that we are created the SET on, i.e. from the example above [Customer Name].

                         

                        More interestingly is using a SET within an LoD....Which is going to lead me down a rabbit hole of something like this...just so I can see what happens.

                         

                        SET 2 = SUM({INCLUDE [Set 1] : AVG([measure])}) ------on [Customer]

                        ------

                        SET 1 = SUM({INCLUDE [dim] : SUM([measure])}) --------on [Customer]

                         

                        Thank you again everyone! This has been very helpful.

                         

                        Regards,

                        Rody