1 2 Previous Next 20 Replies Latest reply on Dec 3, 2014 3:47 AM by juliette.mccormack

    Date Addition over multiple fields

    Ben Evans

      Hello experts!

      I have a problem that has me absolutely stumped. Really hoping someone can help me.

       

      I have a database table with the following columns:

      - Customer Name

      - ... various other dimensions ...

      - Part Type

      - In Date

      - Out Date

      - Ship Date

       

      In, out and ship dates can be quite a way apart from each other.

      What I want is a graph (preferably a line chart) that plots the COUNT of each time a field equals a given date, for a user-defined range of dates.

       

      Eg. On 2013-05-01 there were 120 jobs that came in (had in_date = 2013-05-01), 112 jobs that went out, and 122 jobs that shipped.

       

      I believe there's probably the use of calculated fields somewhere, but other than that I'm a little stumped.

       

      Does anyone have any ideas? Any help would be greatly appreciated.

        • 1. Re: Date Addition over multiple fields
          Jim Wahl

          Hi Ben,

           

          It might be as simple as creating a new field: Count Calc =

          IF [In Date] == #2013-05-01# OR [Out Date] == #2013-05-01# OR [Ship Date] == #2013-05-01
          THEN 1
          ELSE 0
          END
          

           

          This will be evaluated for each row and you can then add SUM(Count Calc) to your view.

           

          #2013-05-01# is a Tableau date format. You'd probably replace this with a date type Parameter (when you enter the formula, select add Parameter), which is a user-selected constant you can use in calculated fields. To select a range, you could use two date parameters and use >=  and <=.

           

          Jim

          1 of 1 people found this helpful
          • 2. Re: Date Addition over multiple fields
            Ben Evans

            Thanks Jim.

            I think that's on the right track, but I don't want a sum of all of the dates - I want them all being displayed separately, and need want the date being adjustable.

             

            I'm wanting something like this:

            Screen Shot 2013-09-06 at 7.44.56 AM.png

            Any ideas?

            • 3. Re: Re: Date Addition over multiple fields
              Jason Scarlett

              I've attached an example using the generic Tableau Superstore Data.

              1. Generate a complete list of possible dates (I use some SQL code to generate 20 years of dates) - Base Table
              2. Create as many duplicate Data Sources as you want to count the dates of (I think you need 3 - my example shows 2)
              3. Set the Data Relationship between the Base Table and the first Data table Ship Dates (I set Year/Quarter/Month/Day)
              4. Set the Data Relationship between the Base Table and the second Data table Order Dates (I set Year/Quarter/Month/Day)
              5. Create chart using a count of the number of records (matches based on date relationship), set dual axis
              6. Done

               

              Jason

               

              Custom SQL:

              SELECT      cal_date,

                          TO_CHAR(cal_date, 'yyyy') AS YEAR,

                          TO_CHAR(cal_date, 'mm') AS MONTH,

                          TO_CHAR(cal_date, 'iw') AS week_of_year_ISO,

                          TO_CHAR(cal_date, 'ww') AS week_of_year,

                          TO_CHAR(cal_date, 'dd') AS day_of_month,

                          TO_CHAR(cal_date, 'd') AS day_of_week,

                          TO_CHAR(cal_date, 'ddd') AS day_of_year,

                          TO_CHAR(cal_date, 'hh24') AS hour_of_day

              FROM        (

                          SELECT     TO_DATE('2008-01-01', 'yyyy-mm-dd hh24') - 1/1 + ROWNUM/1 AS cal_date

                          FROM       all_objects

                          WHERE      TO_DATE('2000-01-01', 'yyyy-mm-dd hh24') - 1/1 + ROWNUM/1 < TO_DATE('2020-01-01', 'yyyy-mm-dd hh24')

                          )

              -- just my way of generating dates ... you could build it in excel just as easily.

              1 of 1 people found this helpful
              • 4. Re: Re: Date Addition over multiple fields
                Jim Wahl

                Another approach is to combine the separate date columns into a single date column with another column to show the date type. This can be done with a union statement and custom SQL.

                 

                Using Jason's example, the data looks like this

                Row ID | Order ID | Order Date  |  Ship Date    | additional cols | ...
                1      | 3        | 2012-01-01  | 2012-01-20
                2      | 5        | 2012-01-02  | 2012-01-15
                

                 

                You want this to look lie

                Row ID | Order ID | Date        |  Date Type
                1      | 3        | 2012-01-1   |  Order
                1      | 3        | 2012-01-20  |  Ship
                2      | 5        | 2012-01-02  |  Order
                2      | 5        | 2012-01-15  |  Ship
                
                

                 

                Here is the custom SQL:

                SELECT
                  [Orders$].[Order ID] AS [Order ID],
                  [Orders$].[Row ID] AS [Row ID],
                  [Orders$].[Order Date] AS [Date],
                  "Order" AS [Date Type]
                FROM [Orders$] 
                UNION
                SELECT
                  [Orders$].[Order ID] AS [Order ID],
                  [Orders$].[Row ID] AS [Row ID],
                  [Orders$].[Ship Date] AS [Date],
                  "Ship" AS [Date Type]
                FROM [Orders$] 
                
                

                 

                Your life will be a bit easier if you add all of the columns you want to filter on.

                 

                Then it's just a matter of dragging Date to the Columns shelf, Number of Records to the Rows shelf and Date Type to the color shelf.

                 

                2013-09-06 19-48-44.png

                 

                 

                Jim

                • 5. Re: Re: Date Addition over multiple fields
                  Jason Scarlett

                  I like Jim's method better than mine if all you want is to compare counts (you don't want to compare any other columns in the data across the sales/orders/etc dates).

                  Jason

                  • 6. Re: Date Addition over multiple fields
                    Ben Evans

                    Still good to know how to be able to do that. Thank you!

                    • 7. Re: Date Addition over multiple fields
                      Ben Evans

                      Perfect.

                      Thank you so much.

                      • 8. Re: Re: Date Addition over multiple fields
                        Jim Wahl

                        Just saw another trick today at Tableau conference that I think might work even better.

                         

                        It's a combination of the two approaches above. Basically you create multiple connections to the data source, and change the relationship so that everything is blended on "Order Date", but Order Date is

                        Order Date in the first connection

                        Ship Date in the second,

                        ...

                         

                        Here's what I did to update the attached example:

                         

                        1. Duplicate the data sources by right clicking the data source and selecting duplicate. If you're using a database connection, this creates another independent connection to the database.

                         

                        If you're using extracts, this will duplicate the extract. This works, but a better option is to select Connect to data and select the same extract. In either case rename this extract with a name specific to the modified field (and to prevent confusion, since you'll be changing the names of fields in this data source). I called it Ship Date.

                         

                        2. Select the primary data source Sample - Superstore Sales (Excel) and drag Order Date to the columns shelf. If you right-click when you drag it, you can also select the date aggregation type. I used Exact Date.

                         

                        3. Drag Number of records to the Rows shelf.

                         

                        4. Select the Ship Date data source. You'll see n orange chain link that means Order Date is linked to the primary data source. Right-click Order Date and Rename it to "Original Order Date". Right-click Ship Date and rename it to Order Date. Right-click gain on Order Date > Default Properties > Comment and add a note to yourself, "This is Ship Date."

                         

                        5. Drag Number of Records from the secondary data source to the Rows Shelf. Since this is a green pill, it will create another axis. Click on the axis and drag it to the left-axis above to put both of these lines on the same graph:

                        2013-09-09 03-29-33.png

                         

                        Jim

                        • 9. Re: Re: Date Addition over multiple fields
                          Shawn Wallwork

                          Yo Jim keep those #TCC13 tips coming! Us forum folks far from the action want to know everything. Thanks,

                           

                          --Shawn

                          • 10. Re: Re: Date Addition over multiple fields
                            Jonathan Drummey

                            Hi Jim,

                             

                            I'm struggling to find the right words to describe the wrongness of this kind of self-blend. (I think I'm more bothered that someone from Tableau would be teaching it than anything else). This needs a big ole' caveat that it will *only* return accurate results if there the primary dimension is domain complete for the secondary dimension. In this case, that means there must be a value of Order Date for every value of Ship Date. In the case of Superstore Sales Extract, there are 41 Ship Dates that have no corresponding Order Date, Superstore Subset has 46.

                             

                            Turning on Show Missing Values to pad out the data prior to blending is of limited use here because the padding will only go to the min/max of the primary (Order Date), thereby missing data that is outside of that boundary. The v8 Superstore Sales data has a max Order Date of 12/31/13, the max Ship Date is later into 2014 so those ship dates would not be included in padding.

                             

                            Rather than depending on the primary data to have a set of values, there are  accurate ways to build a scaffold source for a blend, like union'ing two select statements to get the min of order date/ship date and the max of each date, then use Show Missing Values for that. Or one could use the original solution you'd come up with, to transform the data.

                             

                            Jonathan

                            • 11. Re: Re: Re: Date Addition over multiple fields
                              Jim Wahl

                              Hi Jonathan,

                               

                              Thanks for reading and catching the domain padding error above. There was a little voice in the back of my head saying "domain padding", but the I was a bit too tired to pay attention to it.

                               

                              In fairness to Tableau, they did mention the need to turn on Show missing values. In fact they made it quite obvious with an example (shown below), which was a "Jedi Super Store" backlog analysis of orders entered (Order Date)  minus orders shipped (Ship Date).

                               

                              The backlog formula was

                              // Do RUNNING_SUM twice to avoid issue of nulls
                              RUNNING_SUM(SUM([Number of Records])) - RUNNING_SUM(SUM([Ship Date (Jedi Superstore Sales (Excel).xls) (copy)].[Number of Records]))
                              

                               

                              When aggregating this on the Week level, there is at least one record in each data source for each week, and Show Missing Values wasn't necessary. Note how the backlog ends at 0 on the last day. (Nice to have a squeaky-clean Superstore data set. The first and last days for both Order Date and Ship date are the same, too. )

                              2013-09-10 01-15-17.png

                               

                              Then we switched to daily aggregation and because of the nulls, the backlog grows continuously---and it's obvious (at least once you've see it) that Show Missing Values needs to be enabled.

                              2013-09-10 01-17-41.png

                               

                              And with Show Missing Values enabled:

                              2013-09-10 01-18-07.png

                               

                              I've attached the workbook, if anyone is interested in the specifics (and the other Jedi training tricks).

                               

                              You're right a bigger caveat maybe is necessary, and because of the required renaming (which I'm sure will confuse me latter), I'm not sure I'd add self blends to my regular toolkit.

                               

                              Quick question on the scaffold approach with regards to filtering. It seems like there are two options:

                              1. Have a min/max scaffold with dummy dates (for example, #1900-01-01#), and replace these with user-specified filter parameters. The issue is having good default values (ideally the current range).
                              2. Use SQL to find the min and max values over all dates. The issue here is that when you use a date filter, you'll filter out the min / max values in the scaffold.

                               

                              Am I understanding this correctly? Is there a workaround (other than building a scaffold with all of the possible dates)  --- or is a union approach easier.

                               

                              Jim

                              • 12. Re: Re: Re: Date Addition over multiple fields
                                Jonathan Drummey

                                Hi Jim,

                                 

                                I'm going through some old emails this morning and found this unfinished thread, sorry for the delay!

                                 

                                With regard to your statment: "The first and last days for both Order Date and Ship date are the same, too." That's critical. The underlying data has to have enough detail so that the range of Order Date is larger than the range of Ship Date at whatever level of granularity is used in the particular view, which is probably not the case if we drilled down to a particular Department/Category/Ship Mode/etc.

                                 

                                If the users could filter on an arbitrary range of dates, the two options I can think of would be to use a scaffold that had fully padded the domain, or to use a table calc filter for the date.

                                 

                                Jonathan

                                • 13. Re: Re: Re: Date Addition over multiple fields
                                  Jim Wahl

                                  Thanks for the followup Jonathan. I think I'm finally starting to get my head around domain padding.

                                   

                                  Jim

                                  • 14. Re: Re: Re: Date Addition over multiple fields
                                    Jonathan Drummey

                                    Excellent! Can you explain it to me? Seriously, though, almost every time I think I've gotten my head around it then I learn something new that forces me to rethink what I thought I knew. Most recently was in the last week or two, I realized that there were a number of cases where I was assuming domain padding was happening when it was really domain completion that only looked like domain padding because the domain being completed was a bunch of dates that were non-sparse enough to look like they were domain padded.

                                     

                                    Jonathan

                                    1 2 Previous Next