1 2 Previous Next 16 Replies Latest reply on Jan 31, 2015 4:58 AM by Jonathan Drummey

# Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

There are various cases for analyses of cohorts where we want the earliest, first, least, latest, last, and/or maximum of some field available to other rows in the data for calculations.

In this case, using Superstore Sales the goal is to end up with a # of days since the first Order Date for each subsequent Order Date for each Customer Name. So we need to make the first Order Date for each Customer Name available to other calculations.

There are 4 solutions demonstrated in this workbook:

- Blend Diff. A self-blended data source is used to provide the first order date as an aggregate.

- Subquery #1 - agg Diff. A subquery using a GROUP BY clause is used to get the first order date as a dimension.

- Subquery #2 - JOIN Diff. A subquery using a self-join is used to get the first order date as a dimension, in an indexed data source this can be faster than a GROUP BY.

- TC Diff. A table calculation is used to generate a measure for the first order date, then a datediff calc gets the data.

Can you come up with any additional solutions? I'm particularly interested in solutions that don't require Custom SQL.

• ###### 1. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

Hi Jonathan,

I don't have a suggestion on a different means of computing the cohort group, but I think you'll like this trick I'm about to reveal. One challenge with your calculations for cohorts that are based on data blending or table calculations is that they are not easily used in further analysis at a different level of detail. The reason is that the calculations are affected by the level of detail in the viz, and may not be possible to isolate. So what I like to do is capture cohorts into ad-hoc groups using this technique:  Creating Primary Groups from a Secondary Source Using Data Blending | Tableau Software

Once the ad-hoc group is defined you can use it in other analysis, e.g. to perform a count distinct of the customers in each cohort group. Take a look at the workbook I attached.

I hope this helps,

Robert

2 of 2 people found this helpful
• ###### 2. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

I thought of a curious way to do this with densification. Unfortunately, it is a bit delicate with regard to layout, but I think it qualifies as another approach so I added it to the workbook. If you turn on domain padding, but not domain completion, then you can literally count the days with an index (technically index-1 since the first day is day 0). I further modified my calculation to assign -1 on days when the customer had no sales so that I could easily hide them from the view.

2 of 2 people found this helpful
• ###### 3. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

Hi Robert,

Thanks for reminding me of primary groups! Honestly, I often forget about them because I rarely use them, the reason why is that they don't dynamically update. In my work I need to minimize the number of repetitive tasks, so for me it's worth putting some extra effort in the beginning into in a table calc, a data blend, and/or the underlying query/view to make the analysis/view responsive to changes in the data that occur every month or quarter. Then the extract refresh will automatically pick up the changes without me having to intervene.

In a more general sense, I think this kind of problem is a use case that drives feature requests like the idea for using a table worksheet as a data source http://community.tableau.com/ideas/1604, or multi-pass aggregations. We want to use Tableau to do some set of computations and transformation of the data, and use Tableau on that data to do another set of computations that are either beyond our skill level, just plain impossible (which given Tableau's flexibility usually means "I don't know how to do this"), or too computationally expensive to do in the first pass, *and* we want that computational pipeline to stay up to date and refreshed with a minimum of effort.

Jonathan

• ###### 4. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

Hi Jonathan,

I agree that such operations need to be dynamic to support advanced use cases without requiring repetitive, manual effort. However I have reservations about using an existing worksheet as a data source. That approach would be so general that it could never represent the specific intent of an analytical workflow, making it very challenging for Tableau to optimize the workflow. Similarly, powerful features like data blending and table calculations can be a mixed blessing due to the complexity of their interactions with other features. I'd love to see an elegant solution to multi-pass aggregation and other features our customers have been requesting.

-Robert

• ###### 5. Re: Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

> I'd love to see an elegant solution to multi-pass aggregation and other features our customers have been requesting.

Me too!

I agree the "worksheet as a data source" idea has endless opportunities for confusion, and I imagine that concept fits with what most users know from tools like R, SAS, and Excel that effectively encourage transforming data in a series of steps. With the 8.1 R integration and my work on funnel plots Not a funnel cake, nor funnel charts, it’s funnel plots! | Drawing with Numbers I've been thinking more about how Tableau's "all at once" model requires us to conceive of what's happening at multiple levels of detail at the same time, and where I start to go with my own wishes is to have the interface be amenable to opening up to "teach" the user about what is going on. For example, a user could do a data blend just like they do today, but if they were confused then they could access a sub-view that would show them the dimensions in play and the aggregations used to help them understand how the data lines up. For a table calc, that sub-view would include information on the sorting, addressing, and partitioning, for an R calc it would show the same. Also, this could help users who are new to writing their own calculations and make the row level vs. aggregate problem more apparent and even self-solvable. And then as new functions/features are added such as an advanced multi-pass, they can fit within that model too.

• ###### 6. Re: Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

I agree that these would be invaluable enhancements...  I would love more guidance in the UI; its often a guessing game for me.

• ###### 7. Re: Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

I'd love to see an elegant solution to multi-pass aggregation and other features our customers have been requesting.

I hereby join the choir singing "me too!"

• ###### 8. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

A postscript on this one, this Idea: http://community.tableau.com/ideas/1316 could potentially solve this problem for some use cases. I'm thinking that I could set up a filter to get the first/last date per customer customer and then have that as a Set upon which further calculations could be performed.

• ###### 9. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

Another addendum to this thread. There was another thread Re: YOY Comparision where I was trying to solve a related problem, namely identifying the latest month in the data in order for a YOY calc to only be using the same months from prior year as current year, even when the month wasn't a dimension in the view. I came up with another solution involving nested sets, where one Set on the Year uses a top filter to get the latest year, then that is nested inside another Set based on the Month that uses the Year Set in a record-level calc to identify the active months, then that Set could be used as a filter in the view.

This only works when we want a first/last across the entire data set, and since conditional & top filters are computed as subqueries can run into problems, and it's a solution that can get the desired results without having to do any custom SQL.

Jonathan

• ###### 10. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

The following idea could help a lot with these kind of problems -- basically it calls for allowing more control when creating dynamic sets.

http://community.tableau.com/ideas/3601

• ###### 11. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

http://community.tableau.com/ideas/3601

NOTE: Lightbulb will appear next to idea threads

• ###### 12. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

Hi Alex, I like it, and I'm wondering how is that idea different from http://community.tableau.com/ideas/1316?

• ###### 13. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

I think Idea 3601 is more general that 1316, and could help address more

problems than just the case mentioned in 1316 (top). I think if Tableau

implemented 3601, it would also address the case mentioned in 1316.

To be specific, in idea 3601, I suggest allow creating sets with arbitrary

conditions and grouping by some dimensions, not just adding group

by dimensions to topN filters -- and also allowing custom SQL to define the

members of a dynamic set.

Those would let us define dynamic sets that put almost anything legal in

the IN clause of a SQL query -- i.e. where )

Hopefully, adding group by to set conditions would cover most cases -- the

set condition could then be an aggregate calculation used in a HAVING

clause -- but then we could use custom SQL for the others.

On Thu, Jul 31, 2014 at 1:57 PM, Jonathan Drummey <

• ###### 14. Re: Find Earliest/First/Least and/or Latest/Last/Maximum of a Field and Use that for Other Fields

Bookmarking this thread.  I need to study this in detail when time permits.

1 2 Previous Next