3 Replies Latest reply on Aug 27, 2018 12:38 PM by swaroop.gantela

# Distinct Count over date period range

DateSales
CustomerId
2018-01-018000cust01
2018-01-017000cust02
2018-01-019000cust02
2018-01-019500cust03
2018-02-013500cust01
2018-02-014500cust02
2018-02-016500cust03
2018-02-014000cust04
2018-02-018500cust03
2018-03-014500cust01
2018-03-019000cust04
2018-03-016000cust02
2018-03-013000cust05

I am trying to get the distinct count over a period of 12 months on a rolling 12.

In this example, I have 3 months of example data. For the example, I would like to get the distinct count of CustomerId on a rolling 2. That means that February's distinct count on a rolling 2 should be 4 (cust01, cust02, cust03, and cust04) and March's distinct count on a rolling 2 should be 5 (cust01, cust02, cust03, cust04, cust05). The final result is the sum of Sales on a rolling 2 divided by the distinct count on a rolling 2 for the respective month.

For the Sales (since I am not dealing with distinct count), I am using window_sum() but I get stuck when calculating the distinct count of the CustomerId since the window_sum(countd([CustomerId]), -1, 0) function will bring up duplicate counts (e.g. 3 + 4 = 7).

• So, the question is how do I avoid these duplicates and do a distinct count over a date period?
• ###### 1. Re: Distinct Count over date period range

Samuel.

Wow. These are deeper waters than I knew:

How to count distinct users on a running period

I employed the method in the workbook posted by Yuriy Fal and excellently explained by him here:

Re: How to count distinct users on a running period

This got the correct rolling customer count.

I duplicated the method to get the rolling sales, changing the [In Window] to be :

WINDOW_SUM(SUM([Sales]),1-[Interval],0)

3 of 3 people found this helpful
• ###### 2. Re: Distinct Count over date period range

Thank you! This is awesome.

But now it gets deeper. I manage to implement this technique in my dashboard by following your advice. But I forgot to mention the size of the real dataset. So, the other challenge is handling a 23,485,632 rows view with this technique. I tried to do reduce the data by grouping before loading to Tableau and following the technique but It did not improve the dashboard substantially.

Any thoughts?

• ###### 3. Re: Distinct Count over date period range

Samuel,

This ocean is deep and wide.

I can imagine that the rolling calculations are bogging down performance.

But I'm not sure if there is another method that can avoid table calculations, and yet still get the desired result.

Some requests:

1. Please describe what each row in your true dataset represents: is it every customer transaction?

Then in your aggregation, does each row look like what you have in your sample:

for each month the total of sales by customer?

2. Are you using Tableau Prep for the aggregation? Can you post a screenshot of your flow scheme?

3.  Would it be permissible for you to post a screenshot of your workspace, including the pills

and their locations on the filter shelf, row/column shelves, etc.? If the fieldnames themselves

are sensitive, you can black them out.

4. You may already be familiar with these, but if not, here are some links to general tips on improving performance: