9 Replies Latest reply on Sep 21, 2016 9:53 AM by David Li

    How to count distinct users on a running period with different conditions?

    Betul Ferikoglu

      Hi,

       

      I couldn't come up with a better title. Let me explain my question:

       

      My playground is a marketplace and our users are buyers or sellers.

       

      1- I need to calculate the cumulative number of (distinct) buyers; which is OK.

       

      2- But I also need to calculate the "cumulative number of buyers who purchased only once", "cumulative number of buyers who purchased twice", "cumulative number of buyers who are also sellers" etc.

       

      The problem is RUNNING_SUM(COUNTD([BuyerId])) doesn't give you the actual distinct count. This has been previously discussed here: https://community.tableau.com/message/197406

      So, I've calculated the cumulative count distinct of buyers by using row_number in custom sql.

       

      I have no idea about the calculations in 2. bullet point. Can anyone help on this please?

        • 1. Re: How to count distinct users on a running period with different conditions?
          David Li

          Hi Betul! I think you can do this using LOD calculations. To make all this work, we'll first make a LOD calc that figures out the earliest entry for each buyer. Let's call this calculated field MinDate.

          { FIXED [BuyerId] : MIN([Date]) }

          Now, we add up the distinct buyers per date only for each buyers' inaugural date. Let's call this CountDistinct. This is what you'll do a running sum on.

          { FIXED [BuyerId] : SUM(IIF([Date] = [MinDate], 1, 0)) }

          To get the specific counts you specified, you'll need to do some other things. For instance, to figure out which buyers are also sellers, you might just add a filter that cuts out all the records but the sellers.

           

          To get the number of buyers who purchased only once or twice, you can create LOD calculations that get the total count of sales/days.

          { FIXED [BuyerId] : COUNTD([Date]) }

          Then, you can just use that as a normal filter.

           

          Let me know if this works for you!

          1 of 1 people found this helpful
          • 2. Re: How to count distinct users on a running period with different conditions?
            Betul Ferikoglu

            Hi David,

             

            Thanks for your response.

             

            Looks like I was not perfectly clear with my question.

             

            I was able to resolve cumulative count distinct issue by using row_number function in custom sql.

             

            I've numbered all purchases made by a buyer with this function :

             

            ROW_NUMBER () OVER ( PARTITION BY BuyerId ORDER BY OrderDate ASC) as Row
            

             

            Then simply created a calculated field as : RUNNING_SUM(COUNT ( IF [row] = 1 THEN [BuyerId] ELSE NULL END))

             

            This works fine for my first bullet point.

             

            However, when it comes to the specific calculations like number of purchases made, mentioned in my second bullet point, neither my workaround nor your solution works.

             

            For example, when I use  { FIXED [BuyerId] : COUNTD ([OrderDate]) } to calculate the number of purchases for each buyer; and then filter for number of purchases per buyer, I get incorrect numbers. Because I need the running count of buyers. The correct numbers should look like this:

             

            Buyer TypeJan 2016Feb 2016March 2016
            April 2016
            # of Buyers purchased only once15K20K25K30K
            # of Buyers purchased twice5K7K10K 12K

             

            However, I get something like this:

             

            Buyer TypeJan 2016Feb 2016March 2016
            April 2016
            # of Buyers purchased only once3K2K2.5K2K
            # of Buyers purchased twice1K1K1.5K 1.2K

             

            Because the calculation returns the number of buyers who purchased only once on that specific month, lets say January 2016, while I need the number of buyers who purchased only once until the end of January 2016.

             

            This the same for other calculations like buyers also sellers etc.

             

            I hope this time my question is more clear.

            • 3. Re: How to count distinct users on a running period with different conditions?
              David Li

              Thanks for the clarification! Do you have a packaged workbook you can share so I can see how you're implementing the formulas? For instance, the { FIXED [BuyerId] : COUNTD ([OrderDate]) } is actually supposed to be used as a filter, not as the Text mark itself.

              • 5. Re: How to count distinct users on a running period with different conditions?
                Betul Ferikoglu

                I've prepared a sample. As mentioned, I used custom sql (row_number) in my original workbook for cumulative number of buyers. On this sample, I tried your suggestion but couldn't make that work either. I really appreciate your support.

                • 6. Re: How to count distinct users on a running period with different conditions?
                  Betul Ferikoglu

                  Thanks but that doesn't help. I also use this calculation, but need cumulative numbers.

                  • 7. Re: How to count distinct users on a running period with different conditions?
                    David Li

                    After looking at the sample, I noticed that I forgot one of the dimensions needed in the LOD calc. This gives you the correct value for doing a running sum of distinct buyers without needing to mess with table calcs:

                    { FIXED [BuyerId], [OrderDate] : SUM(IIF([OrderDate] = [MinOrderDate], 1, 0)) }

                    Now, I do need clarification on one thing. Let's say we're calculating the number of buyers who have purchased three times. Do you expect that buyer to be counted starting in the month when he first made any purchase, or when he made the third purchase?

                    • 8. Re: How to count distinct users on a running period with different conditions?
                      Betul Ferikoglu

                      Running sum of distinct buyers works now, thanks!

                       

                      I expect that person to be counted starting in the month.

                       

                      To provide more clarification, let me give an example. Let's say my first month is January 2016  and I have one buyer, who purchased something on Jan, Feb and April. My numbers should be:

                       

                       

                      JanFebMarchApril
                      Cumulative # of Buyers1111
                      Cumulative # of Buyers who purchased only once1000
                      Cumulative # of Buyers who purchased twice0110
                      Cumulative # of Buyers who purchased 3 times0001
                      • 9. Re: How to count distinct users on a running period with different conditions?
                        David Li

                        Okay, I have a solution, but it does require multi-level table calculations. First, here's the output I get:

                        This is in the "Output" tab. To figure out what's going on, see the "Explanation" tab. Let's just look at the unique buyer count and the Purchases = 1 to make it simpler.

                        We can't do what you want to do with the purchase count using LOD calculations because they can't really be used like window functions. So in order to count up the number of buyers who purchased N times from the start through the month in question, we have to look at each buyer independently and then use table calculations to sum up the results.

                         

                        But let's start with the unique buyer count. I bet you already understand this, but for others' sake, this is calculated by using a table calc version of the trick we used earlier. If OrderDate = MinOrderDate, then we count the buyer. Then, we do a window sum along Table Down to sum up the buyers. Then, we do a window sum of the current month and previous months to get the rolling sum.

                         

                        We also use this framework to get the purchase count measures. We start with [Purchases (Rolling)], which is a rolling sum of the total purchase count of each buyer. This must be computed along OrderDate (which is also Table Across here).

                        WINDOW_SUM(SUM([Number of Records]), FIRST(), 0)

                        Then, we use a rolling sum to check whether that buyer has N cumulative purchases as of that month. This also must be computed across.

                        IIF(WINDOW_SUM([Purchases (Rolling)], FIRST(), 0)=1, 1, 0)

                        Finally, we take a window sum along Table Down to add up the buyers.

                        WINDOW_SUM([Purchases = 1])

                        Rinse and repeat for other values of N.

                         

                        The last step is to hide all but one row per measure by using a table calculation filter.

                        LAST() = 0

                        This should compute along Table Down and be TRUE. And then it's just a matter of hiding the headers for BuyerId.