1 2 Previous Next 19 Replies Latest reply on Aug 10, 2015 10:43 AM by jonathan.o'grady.0 Go to original post
      • 15. Re: Matrix of product lift
        jonathan.o'grady.0

        Hi Jonathan,

         

         

        The database is a Windows SQL Server.

         

         

        I have about 250,000 receipts across six venues broken down by about 90 categories and subcategories.

         

         

        I'm quite happy to simply compare the lift of one category versus another, for one subcategory versus another.

         

         

        It would be nice to overlay time on this, for example day of the week, or week of the month.

         

         

        All pretty typical grocery retail analysis.

         

         

        Best wishes,

         

         

        Jonathan

        • 16. Re: Matrix of product lift
          Jonathan Drummey

          Hi Jonathan,

           

          I think I came up a with a modification of Dana's solution that that ends up being a slight improvement. When evaluating different options, I was considering the following dimensions:

           

          a. (relative) ease of initial setup for the data, the calculations, and the views.

          b. responsiveness to changes in the view (i.e. adding dimensions to the view such as a date/time dimension)

          b. data volumes

          c. performance (particularly w/regards to how much data has to be processed by Tableau)

           

          On these dimensions, the solution Dana proposed:

           

          a. Uses a self-join that can be built using an unmodified data source and a multiple tables connection in Tableau (i.e. entirely drag&drop except for typing in names). The calculations are built using FIXED LOD expressions which are great for being very clear about the logic used for each aggregation.

          b. Using FIXED LODs makes the view harder to change because introducing a new dimension (such as a date) would require changing most every FIXED LOD. In addition, when using FIXED LODs we have to be more aware of Tableau's order of operations since FIXED LODs are aren't affected by regular dimension filters, we need to use filters that are processed earlier in the pipeline such as context filters.

          c. Increases data volumes quite a bit by creating a data set with a record for every combination of products in each receipt.

          d. This might affect performance, however since the data has all the granularity necessary the calculations can be run directly in the data source with a minimum of overhead.


          By comparison, the blend based solution I linked to earlier where a scaffold source is used has the following characteristics:

           

          a. Setup is more difficult because of the need to use domain completion, data blending, and table caculations.

          b. Changing the view is more difficult because of the need to deal with changing linking dimension(s).

          c. Data volumes are kept to a minimum. This is the only advantage the blend based solution has over the join solution.

          d. Data blends are good for somewhere between 10s of thousands and low (potentially very low) 100s of thousands of blended items (receipt/product combinations in this case). Note that this is dependent on how many blended items there are a for a particular view, not the whole data set (i.e. it's after most filters have been applied).

           

          I explored some other possibilities, but given the need to be counting distinct receipt ID's none of them made really made sense (they mostly made the setting up the data source quite a bit more complicated) except for one that is a variation on what Dana proposed, namely using EXCLUDE LODs instead of FIXED. Here's my logic as to why:

           

          a. The setup of the data connection is the same inner join. The calculations are built using EXCLUDE LODs which takes a little more effort because we have to think about what we *don't* want in the dimension declaration rather than the FIXED where we just think about what we *do* want in dimension declaration.

          b. This is where EXCLUDE LODs have a distinct advantage over FIXED LODs. They are dependent on the dimensions in the view, so given a view we can drag in another dimension like MY(Date) and the view updates without needing any changes to the calculations.

           

          Here's a heatmap of the Exclude-based calcs:

           

          Screen Shot 2015-07-26 at 7.55.12 AM.PNG

           

          And here I've dragged in MY(Date of transaction) in two clicks (right-click & drag, then selecting the MY aggregation):

           

          Screen Shot 2015-07-26 at 7.56.51 AM.PNG

           

          That's way faster than having to edit 4 calculations, plus if I want to filter then I don't have to go through extra steps to use context filters.

           

          c. There's no change here.

          d. Also no change.

           

          I prepared two versions of the ultimate calculation. One uses aggregations, the other (the one used in the screenshots above) takes advantage of two useful attributes of INCLUDE & EXCLUDE LODs:

          1) We can use {EXCLUDE : (some aggregation)} to return the results of an aggregation at the vizLOD to each mark as a record-level result.

          2) We can do record-level computations of the results of EXCLUDE & INCLUDE LODs and then aggregate them in the view. In other words, even though we can only use EXCLUDE & INCLUDE LODs in the view as aggregated measures, when writing calculations we have the option to work with them as either record-level or aggregate.

           

          I've attached the modified workbook. Thanks to Dana for the original solution, and I hope this modification and/or the explanations of my reasoning is helpful to you!

           

          Jonathan

          • 17. Re: Matrix of product lift
            jonathan.o'grady.0

            Jonathan,

             

             

            This is beyond helpful! Bravo! It's proven to be quite the riddle, but you really seem to have found an efficient solution.

             

             

            I will be sure to try it out. I love the way that you have taken into consideration the volume of my data.

             

             

            Hopefully many others will find this interesting and useful.

             

             

            Very grateful yours,

             

             

            Jonathan

            • 18. Re: Matrix of product lift
              Dana Withers

              Wow! Thanks to both Jonathans! This is a great question and a fantastic answer!

              I'm not too familiar with Excludes yet, but I'll be sure to look into that further. Glad I was on the right track though.

              I really enjoyed this one

              Thanks!

              • 19. Re: Matrix of product lift
                jonathan.o'grady.0

                Dana,

                 

                 

                Hope you had a nice holiday?

                 

                 

                Glad you got a kick out of that particularly nasty little problem! I have an easier one that no one is paying any attention to:

                 

                 

                http://community.tableau.com/message/399483#399483

                 

                 

                I think there is a simpler solution… At least I hope so!

                 

                 

                Any ideas?

                 

                 

                Would appreciate any input!

                 

                 

                Best wishes,

                 

                 

                Jonathan

                1 2 Previous Next