8 Replies Latest reply on Aug 4, 2015 7:39 AM by Tom Thomas

    How do I filter dates on the MIN of the last updated date, in aggregate?

    Andreas Linden

      Hi,

       

      I'd like to filter dates by the MIN of the last updated date. The last updated date is different from one row to the next:

       

      DateColourLast updated dateRevenues
      01/01/2014Red01/01/2014100
      01/01/2014Red02/01/2014200
      02/01/2014Red02/01/2014120
      01/01/2014Blue02/01/2014300
      02/01/2014Blue02/01/2014100
      03/01/2014Blue03/01/2014114

       

      In the above table I would like only the dates in the 'Date' column that are equal to or older than the minimum last updated date to be returned, i.e.

       

      MAX(Date) <= MIN(Last updated date)

       

      The maximum date should end up as 01/01/2014 using the above logic - revenues are 100.

       

      The problem I'm having is that Tableau evaluates the above formula, when used as a date filter, on each row, which doesn't give me quite the result I'm after.

       

      Any idea how I can resolve this?

       

      Many thanks,

      Andy

        • 1. Re: How do I filter dates on the MIN of the last updated date, in aggregate?
          Justin Emerick

          Andreas,

           

          I attempted to build you what youa re looking for and I'll admit, i'm still confused. I think the right direction is to use SETs - create a set on Date that uses a formula like the one you have above. That should help you get close to the solution.

           

          Best,

          Justin

          • 2. Re: How do I filter dates on the MIN of the last updated date, in aggregate?
            Seth Gundy

            This looks like a job for Table Calculations.  I don't have access to Tableau right now, but off the top of my head, you could try MAX([Date]) <= WINDOW_MIN(MIN([Last updated date]))

            • 3. Re: How do I filter dates on the MIN of the last updated date, in aggregate?
              Andreas Linden

              Thanks Justin - when MAX(Date) <= MIN(Last updated date) is used as part of a condition in Sets the formula is evaluated line by line, rather than looking at the data set as a whole

              • 4. Re: How do I filter dates on the MIN of the last updated date, in aggregate?
                Andreas Linden

                Seth, unfortunately WINDOW functions including WINDOW_MIN aren't available when filtering dates, or when creating a set.

                • 5. Re: How do I filter dates on the MIN of the last updated date, in aggregate?
                  Jim Wahl

                  I'm not 100% clear on your goal.

                   

                  I would like only the dates in the 'Date' column that are equal to or older than the minimum last updated date to be returned, i.e.

                  Do you want to find the MIN(Last update date) across all rows or for each Color or for each Color and Date?

                   

                  In your example, what rows should be kept?

                  • 6. Re: How do I filter dates on the MIN of the last updated date, in aggregate?
                    Andreas Linden

                    Hi Jim - across all rows. So the MIN(Last updated date) would be 01/01/2014 in the example above. I then want to evaluate this date against the 'Date' in each row, i.e. MAX(Date) <= 01/01/2014 - in the example I'd just be left with the first row.

                    • 7. Re: How do I filter dates on the MIN of the last updated date, in aggregate?
                      Jim Wahl

                      I'm still not sure I understand completely. You said, "in the example, I'd be left with the first row."  Since there are three rows in your example with the date 1/1/2014, don't you want the result to include three rows?

                       

                       

                      But, regardless, there are three different, common techniques to find the min date.

                       

                      1. Table Calc

                      The first is to use a table calculation TOTAL(MIN([Last updated date])) -- OR WINDOW_MIN(MIN(..)).

                       

                      TOTAL() is similar to WINDOW_MIN() -- with a subtle difference in that TOTAL() is an aggregate at a different level of detail, which is what you want here. WINDOW_XXX are aggregates of aggregates. In your example, I think both will return the same answer, but TOTAL() is I think cleaner.

                       

                      Filter For Dates 1 =

                      ATTR(Date) <= TOTAL(MIN([Last updated date]))

                       

                      Two notes here. First Date needs to be an aggregate, since TOTAL(MIN(...)) is an aggregate. You can't use Date <= ...

                       

                      Second, I'm using ATTR(Date). Since Date is in the view (the view has date level of detail), MIN(Date) == MAX(Date) == AVG(Date) == ATTR(Date).

                       

                      ATTR() is a special Tableau aggregation that returns the value, Date, when there is only one value and returns * if there is more than one value. So there is no difference here between using MAX(Date) and ATTR(Date). However, with ATTR() you're more likely to see an error if you change this view later and forget that Date has to be in the view.

                       

                      Finally, compute using -- the most important and confusing part of table calculations. Based on your view above with multiple dimensions in the Rows shelf, when you add this field to the view, Tableau will set the addressing (aka Compute Using) to Table Down, which means it will calculate the TOTAL(...) over all dimensions on the Rows shelf.

                      2014-10-05_11-35-39.png

                       

                      I find this bit confusing and prefer to click on the pill in the view > Edit Table Calculation > Compute Using > Advanced which brings up a dialog box where you can explicitly set the partitioning and addressing. Since in this case you want to calculate the MIN(Last updated date) over all rows, all of the dimensions should be on the right-hand addressing side of the dialog.

                      2014-10-05_11-00-45.png

                       


                      2. Duplicated Data Source

                      The problem with the above is that if you apply filters to, say the color red, your min date will based on only the blue rows, so 2/1/2014. Maybe this is what you want?

                       

                      If, however, you want the min date across all rows regardless of what's filtered in the view, then using a duplicated data source and a blend is a good solution.

                       

                      Right-click the data source > Duplicate. If you have an existing view open, when you click this data source, you'll see links on all of the dimensions in the view. What you want is to break these links so that the secondary data source is not affected by filters in the primary.

                      2014-10-05_11-24-19.png

                       

                      Drag Last updated date from the secondary data source to the view. You'll get an error that says, "can blend because there is no relationship." Ignore this, as this is exactly what we want (the min date, unaffected by any relationship).

                       

                      Click the date pill > Exact Date. Click again > Discrete. (All the time ignoring the warnings, if you haven't checked the box don't show me this again.)

                       

                      Filter For Dates 2 =

                      ATTR(Date) <= MIN([Clipboard_20141005T104859 (copy)].[Last updated date])

                       

                      Below shows your data with Red rows filtered. The red box is the filter based on table calcuations. The blue is the blend, which is not affected by the filter.

                      2014-10-05_11-31-17.png

                       

                      You can, of course, add connections / links if you want the min last updated date to be filtered by certain dimensions in the primary data source -- it's not all or nothing.

                       

                      The issue with the duplicated data source is that you're duplicating the data source. If this is a live connection, then Tableau will just create another connection to the database. No problem. If it's an extract, then Tableau will create another connection to the extract until you modify the data source (adding another field, ...) at which point it'll duplicate the extract so you have two copies.

                       

                      You can manage this be either re-duplicating the data source after everything is final or you can create a new data connection that has just the fields you need: the last updated date and whatever other dimensions you want to apply as filters.

                       

                       

                      3. Custom SQL

                      The third approach, which is more common for cohort analysis where you want to compare an event date to the first date for that customer or group of customers, is to either use Custom SQL add that last date column to every row. You can also create a view in your database that does this.

                       

                      This solution will perform better than the other two and after you have the view setup and reduces the complexity in the view. On the other hand, it requires some setup and doesn't allow you to then select a different last updated date based on different filters or criteria.

                       

                      I don't think this approach is necessary here, but there a good description of how to set it up in Tableau's cohort analysis example:

                      Tips for Cohort Analysis | Tableau Software

                       

                      I've attached a workbook with you example data and the calculations I've included above. If this still doesn't quite get you what you need, maybe you can attached your own workbook that shows the issue.

                       

                      Jim

                      • 8. Re: How do I filter dates on the MIN of the last updated date, in aggregate?
                        Tom Thomas

                        Thanks Seth! I was looking to show mark labels for the first and last mark of a dynamic line, this works perfectly!