11 Replies Latest reply on Feb 4, 2019 7:57 PM by Alex Kerin

    Window sum

    Alex Kerin

      In the attached workbook (Superstore) I've created tables of customer counts of when the customer first ordered something, and last ordered something - it's the beginning of a churn calculation (if a customer hasn't ordered something in the last x months, we consider them churned). So, in March 2017 we had 6 customers order for the first time (Month of First), and 2 customers who never ordered anything again (Month of Last).

       

      What I want to do is have a table like this:

       

      Month First time customer countLast Time Customer Count
      March 201762
      April 201772
      May 201775
      and so on

       

      I can begin to create this via table calculations (see totals sheet), but have to find the correct cell that lies along the diagonal intersection of first month and last month:

       

       

      Any idea how I would collapse this so that I only show one column for each date and has the number that's at the intersection currently?

        • 1. Re: Window sum
          Hari Ankem

          Is this the expected output?

          1.png

           

          If yes, then all I did was to include the distinct customer count only in the Text Marks.

          1.png

          • 2. Re: Window sum
            Alex Kerin

            I don't want the data spread over the columns - I want just one date field in the view - like the table example I posted. Thanks for looking though.

            • 3. Re: Window sum
              Hari Ankem

              You have 2 different dates and they can span different periods of time. Ideally, we would need to pivot it to get the dates into a single column, but in this case pivoting would not be possible. So, you can look at joining the data with a unique list of months and proceed from there.

               

              Here is the output of what I have now:

              1.png

              I have joined the data with a unique list of months as shown below:

              1.png

               

              And created the following calculated fields:

              First Time Customer Count:

              COUNTD(IF [Months]=DATETRUNC("month",[Customer First Order]) THEN

                  [Customer Name]

              END)

               

              Last Time Customer Count:

              COUNTD(IF [Months]=DATETRUNC("month",[Customer Last Order]) THEN

                  [Customer Name]

              END)

               

              Attached the data file I have created for unique months and the updated workbook.

               

              Hope this helps now.

              • 4. Re: Window sum
                Alex Kerin

                Thank you Hari - I suspect a cross-join may be the only answer. I was hoping there was an LOD method around this. As I don't want to increase the row count (your cross-join increases the rows from ~10,000 to ~480,000), I'll probably use custom sql to return the last order date for every customer and join on the customer name.

                 

                EDIT: Though actually I don't think my suggestion will work

                • 5. Re: Window sum
                  Hari Ankem

                  OK. Anyway, since you have a solution now, you may want to close this thread.

                  • 6. Re: Window sum
                    Jonathan Drummey

                    Hi Alex!

                     

                    This kind of problem is one where the data is in a transactional form (one row per transaction per customer) and the "ideal" form to answer our question is something ...different. There are multiple ways we could describe what that different form is, exactly. What I tend to do is think about what I'd want to use as a dimension or measure in my final view, and then figure out how that relates to the data that I presently have, then figure out the transformations to get there.

                     

                    In this case one ideal would be a date (month) dimension that would include all the values of first order date and last order date we could put on Rows as a discrete pill and either a dimension for first/last order with COUNTD(customer) as a measure or a couple of measures.

                     

                    Also a challenge in views like this is what happens when the data is sparse? In the case of superstore there are first order date months with no corresponding last order date or vice versa, so we need to consider those as well.

                     

                    Finally there's thinking about the complexity of the resulting Tableau views (at the intermediate levels(s) of detail needed to get the final displayed marks). Using the intersection of first & last order date fully padded out in Superstore using Show Missing Values and Tableau's date densification creates a view with 1833 marks (47 first order months * 48 last order months) and then in order to interact with the months to identify the intersections we need to work with padded months (which require two levels of nested table calculation), then flag the desired intersection, so I'm counting 7 table calculations at a minimum. Plus the original view was using totals, and totals for table calculations with padding get really hairy-scary really fast and can potentially require reworking *every other calculated field*. So building  the view in this manner requires knowledge of Tableau's order of operations, data densification (which has never been fully documented), custom table calculations, data densification within grand totals, potentially custom grand totals, and all it would take is a single mis-click on one of the compute using settings to blow up the whole thing.

                     

                    So to my mind the easier way is to go back to the ideal data source and build that more directly. In one sense what we need to do is to merge (as in union) the lists of first order dates and the list of last order dates, and we can do that by unioning the data together. In Superstore I did a self-union (which adds a Table Name dimension) and then created the following date calculation that generates a list of first and last order months:

                     

                    CASE [Table Name]

                        WHEN 'Orders' THEN {FIXED [Customer Name] : DATE(DATETRUNC('month',MIN([Order Date])))}

                        WHEN 'Orders$' THEN {FIXED [Customer Name] : DATE(DATETRUNC('month', MAX([Order Date])))}

                    END

                     

                    I also duplicated the Table Name dimension and used aliases as first/last order month.

                     

                    Then building this view that shows every month with either first or last order date and grand totals was a matter of a few clicks:

                     

                    Screen Shot 2019-02-04 at 7.07.04 PM.png

                     

                    So by changing the structure of the data to more accurately reflect the question we are trying to answer the resulting view is a lot simpler, it just uses a single calculated field and a regular aggregate measure. This view does have some possible challenges around sparse data, e.g. showing 0's when there are no first or last customers in a given month or showing a month when there were none of either type of customer in that month, but those are a lot easier to deal with and require less knowledge than the other method.

                     

                    v2018.3 workbook is attached, hope this helps!

                     

                    Jonathan

                    • 7. Re: Window sum
                      Jonathan Drummey

                      PS: I saw that Hari Ankem had responded, he was on to something with the self-pivot idea, which in outcome would be what I proposed with the union - either way we end up with a source with one row per first/last month & customer. However Tableau Desktop doesn't presently support pivoting on calculated fields, though Tableau Prep does. Though in Tableau Prep I'd actually use a different flow - just split the data into two streams to do the two aggregations (first and last, respectively) and then just union those results together.

                      • 8. Re: Window sum
                        Alex Kerin

                        Thank you both - it makes sense that there's no way without a join/union - I had been hoping to avoid row number increases and the issues that can bring, but I'll build a source just for this analysis.

                         

                        Attaching the end result of this with churn/net churn based on lost customers and cumulative balance of customers. The idea is that you choose a period of time that you consider a customer as lost, discard any recent months less than this and build of picture of customer gains and losses.

                        • 9. Re: Window sum
                          Michel Caissie

                          Hey guys,

                           

                          May I jump in the discussion.

                           

                          The first order dates and last order dates  all comes from the order date dimension.  So I don't see a problem getting those numbers without having to do any join.

                           

                          Just create two calcs

                          COUNTD( if [Order Date] = [Customer First Order] then [Customer Name] end )

                          and

                          COUNTD( if [Order Date] = [Customer Last Order] then [Customer Name]  end )

                           

                           

                          Michel

                          • 10. Re: Window sum
                            Jonathan Drummey

                            Ooh, I like it!! Nice work!

                            • 11. Re: Window sum
                              Alex Kerin

                              So I think the lesson here is don't try to tax your brain on Superbowl Monday. Michel - thank you - I had got very tied up with using that Month of Customer First Seen that I overlooked switching back to the order date.