5 Replies Latest reply on Sep 27, 2016 11:17 AM by Parris Khachi

    Partitioning in Tableau - Detecting Returning Users

    Parris Khachi

      Soooo in impala I have the following query:


      select id, count() OVER (PARTITION BY user_id order by id rows between unbounded preceding and current row) AS quantity_created_by_user from articles where type=1

      What this query does is return a pair of values (id, quantity_created_by_user). This pair essentially represents a rolling count for each article of how many times a user has created an article up until that point. What I want to do is try to do is see a line graph that shows me how many users that created articles have a quantity_created_by_user count greater than 1 over time. That is, on January 20th, 2016 I'd see that 20 returning users created articles and I can see that the week after we declined a little bit in our return user numbers.


      My current "Column" calculation reads: "DATE([Published])" and I've configured it to give me weekly partitions.
      My "Row" calculation needs a bit of work. I tried to take the above and convert it to something wrapped by RAWSQLAGG_REAL, but that doesn't seem to work. I also know that impala doesn't like subqueries in the select list. So it doubley won't work.


      Is there a way to do this easily in tableau? Is there another way to approach this issue?

        • 1. Re: Partitioning in Tableau - Detecting Returning Users
          Russell Christopher

          I suspect there are some subtleties here I don't quite understand, but it seems like you're making your life unnecessarily difficult by trying to use rawsql functions.


          If you're writing Custom SQL anyway, can't you simply extend the first SELECT statement to include your date column and do something like:


          SELECT id, datepart("week", Published) as week , count() OVER(.....) as quantity_created_by_user FROM articles WHERE....GROUP BY 2


          Perhaps you can explain the logic in your OVER window function - why are you partitioning this way?


          Never mind, I wasn't paying attention. You are partitioning because you need the user to have visited your site the previous week before you even "care" about them this week. Duh. Just about to run out for a walk, and will type more later, but you might want to consider using Table Calculations or Level of Detail Calculations for this work - they can do the partitioning you want,

          • 2. Re: Partitioning in Tableau - Detecting Returning Users
            Russell Christopher

            Hey, any chance you can post some simple sample data in an excel worksheet? Just enough to show what the "base" records look like. There are a bunch of different ways to solve this, but I'm still not 100% clear on what the data looks like and don't want to waste my/your time by helping you solve the "wrong" problem. Let's not worry about your query for now, just show what the data looks like in Impala, and the RESULT you want to see.

            • 3. Re: Partitioning in Tableau - Detecting Returning Users
              Parris Khachi

              Hey no problem, thanks for the help.



              Here is a piece of the csv:












              I can also add in user_ids and such, but that isn't really needed. Basically trying to answer "how many events were created today by returning users. I'll take a look at the docs for Table Calculations and Level of Detail calculations. Let me know if you have any tips .

              • 4. Re: Partitioning in Tableau - Detecting Returning Users
                Russell Christopher

                Thanks. Questions:


                • Does each record represent a week? Does your system actually associate time with these values? How?
                • I don't see any of the (user) id values repeating at all, which tells me none of these are repeat users? Maybe this sample needs to enriched a little more?
                • 5. Re: Partitioning in Tableau - Detecting Returning Users
                  Parris Khachi

                  Each row in the records above represents an individual occurrence of an article created. The quantity_of_articles_published_previously column is a rolling count I'll post an example below.


                  id, user_id, quantity_of_articles_published_previously, published
                  1, 1, 1, 2016-01-01
                  2, 2, 1, 2016-01-02
                  3, 1, 2, 2016-01-03
                  4, 1, 3, 2016-01-04
                  5, 1, 4, 2016-01-05
                  6, 2, 2, 2016-01-06

                  Nothing about this is weekly. I've been trying to get tableau to handle the "weekly" part. Although I could potentially handle it in query.


                  Does that help? Thanks!