10 Replies Latest reply on May 1, 2015 7:01 AM by Jonathan Drummey

Charting new distinct values per period?

Basically, I have a list of clients by the month that they were active. Attached is a simplified, mocked-up workbook.

I'd like to be able to chart the number of new clients (those who did not appear in the data before a given period) by period (month, quarter, year)--both absolutely ("there were 5 new clients in Q4 2014"), and compared to specific previous periods ("there are 25% more new clients in Q1 2014 than there were in Q4 2014").

Your help is very much appreciated!

• 1. Re: Charting new distinct values per period?

Daniel,

There are a couple of ways to do this.  With Tableau 8, I would use table calculations.  The basic idea is to use the minimum month for each client as that will be the first time they show up.

The Tableau 8 Way:

Now, I can use the Size() function to count the number of clients in each MIN(Month).  To enable MIN(Month) to be used as the partition of the table calculation, I have to right click the field on Rows and uncheck Ignore in Table Calculations.

Then, I'll create a calculated field called New Clients with the code:

Size()

When I set the Compute Using to Client ID (thus making MIN(Month) the partition), the result is the number of clients who were first seen that month.

Now, I can clean up a little.  I'll edit the code of the function to:

IF FIRST() == 0

THEN Size()

END

That way, we only have to calculate once per partition.

And now I can re-arrange the view:

Notice that I have to keep Client ID in the level of detail for the view.  But I can move the fields around however I'd like and change the mark type however I'd like to get almost any view I want.

I've attached the 8.2 workbook so you can see.  I'll give  a post in a bit to show how this gets to be a lot easier in 9.0.

Regards,

Joshua

The Tableau 9.0 Way

This demonstrates the power of LOD (Level of Detail) calculations, because I can do the same thing I did in 8.0 with table calcs -- but it takes far fewer steps and is far easier to understand.

First, I'll create a FIXED level of detail calcualtion to get he first date per client:

The code is: {FIXED [Client ID] : MIN(Month) }

The fixed level of detail calculation tells tableau to get the minimum date for each Client ID.

Then, we can use a second calculation to compare that result to the Month of the record of data.  If it's true, we'll get the Client ID and NULL otherwise.  That way, we can do a distinct count of these per month and get the number of new clients in a given month:

The code is: IF [First Date for Client] == Month THEN [Client ID] END

And then, we just do a count distinct of those result in the view and we're done.  Way easier than version 8.  Thank you Tableau Developers!

Regards,
Joshua

1 of 1 people found this helpful
• 3. Re: Charting new distinct values per period?

Thank you so much! This was really helpful and has gotten me much further.

Another question: I've made a bunch of date-filtering calculations, with goal of filtering the view to only include specific periods and comparing the number of new clients between those periods.

For instance, to compare this quarter vs the last quarter:

Why is it that when I pull this field onto the columns shelf, the numbers change even though all it seems to adding to the view is additional column headers? How could I mitigate this?

Without the new field:

and with the new field (date parameter is set for Q4 2014):

Attached the workbook with the additions. Thank you again!

• 4. Re: Charting new distinct values per period?

Any new dimension you add to the view will "break" the table calculation because it is set to compute very specifically (along Client ID). There are a couple of options:

1. Readjust how the table calculation is computing.  You'd have to set the advanced settings.
2. Probably Easier: just make This Quarter and Last Quarter some kind of aggregation in the view.  Right-click it and set it to either ATTR (attribute -- a special aggregation) or set it to Measure > Min (or Max).

The reason #2 works is that aggregations, by default, don't impact table calculations.  We had to force that using the Ignore in Table Calculation option for the MIN(Month) field -- but if don't set that option for your new field used as an aggregation, it won't cause the issue you are seeing.

Hope that helps!

Regards,

Joshua

• 5. Re: Charting new distinct values per period?

I'm zippin' along and read: 1. Read just the table calc.... Ha! Never was good at skim reading. As you pointed out before, Josh, this a great example of why LOD expressions are going to make life so much easier!

Cheers,

--Shawn

PS: How's the book coming? Soon?

• 6. Re: Charting new distinct values per period?

Thanks Shawn!  The book is close - I've submitted all final chapters, fixed graphics, etc...  Most of it is out of my hands now (just a little more work on the preface )

-Joshua

• 7. Re: Charting new distinct values per period?

Josh,

When you get a chance, see if you can apply the Size() and LOD to this case

Vizible Difference: Dynamic Histogram Over Time

I tried size() and didn't find a solution yet.

Thanks!

• 8. Re: Charting new distinct values per period?

I worked through it and I can't really think of any major optimizations.  Size() won't work because it would require partitioning by a table calculation (Fill Status) and I'm pretty sure LODs are out because the solution requires data densification (which happens after LODs) and also requires a moving calculation.  LODs can typically replace table calcs that happen to the entire window at once (WINDOW_SUM, WINDOW_AVG, etc...), but not calculations that move within the window (Previous_Value, RUNNING_*, LOOKUP, RANK, etc..)

Hope that helps!

Regards,

Joshua

• 9. Re: Charting new distinct values per period?

Thanks!

On Thursday, April 30, 2015, Joshua Milligan <

• 10. Re: Charting new distinct values per period?

Hi Alexander, I got curious about your post and checked it out, nice work! I came up with a couple of alternatives, one is a simplification and the other uses a different route to get the count. See http://drawingwithnumbers.artisart.org/counting-from-nothing-a-double-remix/ for details.