-
1. Re: calculate count distinct according to time
Alex Kerin May 2, 2012 7:21 AM (in response to chandler.zhang)This should be straightforward, but I cannot get my head around it.
I can get part of the way there with: lookup(attr([Install]),first())
but then cannot think of a way to sum the values (you can't partition a table calc by another). Maybe Richard Leeke or someone else can help.
I tried using a filter to get rid of the other appearances of the id after the first date, but to no avail either.
-
Book3.twbx.zip 28.4 KB
-
-
2. Re: calculate count distinct according to time
chandler.zhang May 2, 2012 7:49 AM (in response to Alex Kerin)There's a solution in
http://community.tableau.com/message/168597#168597
I have downloaded the twbx file. But I don't know how he did it. I tried to create the calculated fields as he did. But I got a mass in my result. I am trying to check what I missed
-
3. Re: calculate count distinct according to time
Jonathan DrummeyMay 3, 2012 8:45 AM (in response to chandler.zhang)
Hi Chandler,
Attached is an implementation of Joe Mako's calcs using a variation of Alex's sample data that includes some duplicate records to make sure things work right.
Instead of trying to use a count distinct that would have to be compared, what Joe set up is:
1. Identify the earliest month for each user in the data set - that's the Min Month calc, for each of those return 1 no matter how many records there might be, return 0 for every other month/everyone else. This effectively gets the "distinct" part of the count distinct.
2. Sum up the Min Month for each month - that's the Min Month calc, so now we have an accurate count of new users in the month.
3. Create a running sum Min Month - that's the Running Sum calc.
Since you're wanting a count over months, I took the Install (date) field and used the DATETRUNC function to create a Month of Install field. This makes setting the Compute Using for the table calculations easier.
I'm guessing that the problem you are experiencing in your results is in the addressing and partitioning of the (nested) table calculations. They have to be exact or else the calculation fails. In this case, there are three calculations to be set when you go to the Edit Table Calculation on the Running Sum field, by clicking on the Calculated Field combo box: Running Sum, Month Sum and Min Month.
Starting with the lowest level, Min Month needs the Compute Using set to Month of Install. This calculates using a method I hadn't seen before, I'm going to have to add it to my toolbox. Using TOTAL is very cool because it doesn't care about sorting in the partition, which is something that can drive me batty in nested table calcs.
Then the Month Sum needs the Compute using set to ID. This simply sums up the # of Min Months for each month.
Finally, the Running Sum needs an Advanced... compute using with *both* Month of Install and ID in the Compute Using, with the sort set to Month of Install/Min/Ascending. This then calculates across all combinations of Month of Install/ID, in the correct order.
Jonathan
-
4. Re: calculate count distinct according to time
dhveryoldaccount May 3, 2012 9:33 AM (in response to chandler.zhang)Attached would be my approach. Let me know if it helps.
Basic idea being to just put the "Number of Records" measure on the rows shelf, "install date" on the columns shelf.
-
zhangexample.twbx.zip 17.4 KB
-
-
5. Re: calculate count distinct according to time
Alex Kerin May 3, 2012 9:34 AM (in response to Jonathan Drummey)NICE! That total(min is the key that I had missed.
-
6. Re: calculate count distinct according to time
Alex Kerin May 3, 2012 9:35 AM (in response to dhveryoldaccount)The issue is that once someone has installed it, another entry for them at a later date should not be counted. Hence the complexity with the Table Calcs.
-
7. Re: calculate count distinct according to time
dhveryoldaccount May 3, 2012 9:54 AM (in response to Alex Kerin)Oh, yes, you're right. I was reading it as installs per day regardless of which user installs. So then why wouldn't COUNTD ([user name]) be sufficent?
-
8. Re: calculate count distinct according to time
Jonathan DrummeyMay 3, 2012 10:12 AM (in response to dhveryoldaccount)
Hi Derek,
The goal as I understand it was to count the distinct (new) installs for each day, and since the data doesn't flag which rows belong to a new install or not, regardless of user, we have to generate that from the data. In the sample data that Chandler initially posted, a COUNTD(username) would return 3 on Jan 1 and 3 on Feb 1, and the goal is to show 3 on Jan 1 and 4 for Feb 1 (the 3 from Jan 1 and the single new user on Feb 1).
Jonathan