9 Replies Latest reply on Aug 24, 2016 9:36 AM by Yuriy Fal

    Using two date fields to create trend across time

    Jared Cunningham

      Hi - Tableau newbie here..

       

      I want to create a trend across time of an hourly cost measure.

      I have three data fields in use: 'Unit Cost', 'Hire date' and 'Exit date'

      The Unit cost field is a measure of cost.

      The other two take the format of a specific date sometime in the last few years.

       

      I know how i would do this in excel.. I would have a cell with each month in it and then create a formula under each month to calculate the average unit cost in each month, then i would trend that with a graph.

       

      I'm a bit stuck how i translate this into tableau..

       

      I feel like i need to have a measure or field for each month, but that isn't a viable option within the data.

       

      I can get a trend out of tableau but it only plots the data based on the hire date or exit date fields.. i need it to recognize that because the person doesn't have an exit date, that data point should be plotted in the current month. Or that because the person was exited in February, they shouldn't appear in March.

       

      Is there any way of generating this sort of trend across time, using only the hire and exit date field??

       

      Hope that makes sense..


      i have attached my source excel file with some dummy data as a reference..

        • 1. Re: Using two date fields to create trend across time
          Stoyko Kostov

          Hi Jared,

           

          I did this using a self-join (if I understood your question correctly).

           

          Self-join is needed so you can use Hire Date in one of the tables as your pivot, and the other table to select the employees who have been hired at the time of the pivot.

           

          To get self-join in Tableau, you first need to add a constant dummy field in your data. I added a column called Dummy in your Excel spreadsheet and populated it with 1.

           

          Now you can create the Tableau viz as follows:

           

          1. Connect Tableau to the Excel file and drag Resources to the data source pane twice. Tableau will ask you what to join on - choose Dummy = (Resource 1) Dummy.

           

          2. Go to Sheet 1, and create a calculated field as follows:

          IF([Hire Date (Resources1)] <= [Hire Date] AND ([Exit Date (Resources1)] >= [Hire Date] OR ISNULL([Exit Date (Resources1)])))

          THEN [P&L Cost (Resources1)]

          ELSE 0

          END

           

          I named it AvgPay. You see that I compare employees from the second copy (Resources1) to the hire date in the pivot. Note also how I allow for employees who haven't exited by checking for NULL.

           

          3. Right-click on the calculated field, and choose Default Properties - Aggregation - Average.

           

          4. Drag [Hire Date] from the first table (Resouces) to Columns - this will be your pivot. Defaults to year - right-click to choose Month. You have 2 options; the first one will ignore the year, and the second one will include the year.

           

          5. Drag the new calculated field AvgPay to Rows.

           

          Hope this helps. I'm attaching the packaged workbook I produced using Tableau 9.3.

           

          If I haven't fully understood your question, and the guidance above doesn't help you solve it, don't hesitate to ask again.

          1 of 1 people found this helpful
          • 2. Re: Using two date fields to create trend across time
            Jared Cunningham

            Thanks for the response! It looks like structurally you've done what i wanted I just don't think the numbers are correct.. As a guide, the number should be in the region of 85-95 for each month (approximately).

             

            So a couple of questions:

             

            1. What is the purpose of the dummy column to join the data? If we were joining the data, would we not want a unique identifier to join the data? Which leads into my second question..

             

            2. In the data source, the AvgPay field doesn't appear to be correct. It should be bringing through the P&L Cost number, but instead it's bringing through either zero or 204.375. It doesn't look like the data has been joined correctly?

            • 3. Re: Using two date fields to create trend across time
              Stoyko Kostov

              Hi Jared,

               

              Apologies - forgot to add a filter for AvgPay. Without it, all employees are aggregated - even though the ones outside of the hire date range have their pay calculated as 0. Hence the wrong number.

               

              To correct, drag AvgPay to the filters pane, choose "At Least", and enter 0.01. That way the employees outside the hire date range won't be considered in the calculation by definition - their pay was set to 0.

               

              To your other question - the purpose of the dummy column is as a workaround to the cartesian product. As Tableau requires a field to join, to model a full cartesian product (which doesn't join on any field), you need to do an inner join on a dummy column.

               

              If this still is not clear, consider the following small example: you want the cartesian product of the following small tables: table A with single column "a" and rows 1 and 2, and table B with single column "b" and rows 3 and 4. The cartesian product will be table C with 2 columns "a" and "b" and the following rows: (1, 3), (1, 4), (2, 3), (2, 4).

               

              To achieve this in Tableau, we add column "d" (for dummy) to A and set it to a constant (1), and ditto for B. Therefore, A becomes (1, 1), (2, 1), and B becomes (3, 1), (4, 1). Now we can do an inner join on A.d = B.d and will get precisely the cartesian product described above.

               

              Let me know if you have more questions.

              • 4. Re: Using two date fields to create trend across time
                Jared Cunningham

                That doesn't appear to work? When i add a filter for AvgPay, with an 'at least' value of 0.01, the whole graph disappears?

                • 5. Re: Using two date fields to create trend across time
                  Stoyko Kostov

                  Attaching the workbook I produced.

                  1 of 1 people found this helpful
                  • 6. Re: Using two date fields to create trend across time
                    Jared Cunningham

                    Oh i see... I was bringing AvgPay into the filter, as an attribute..

                     

                    When i select all values and then filter on <0.01 it works.

                     

                    Thanks heaps!!!


                    Howcome setting it up as an attribute doesn't work?

                    • 7. Re: Using two date fields to create trend across time
                      Stoyko Kostov

                      I think attribute requires constant value for all items we are aggregating (averaging), which is not the case here.

                      • 8. Re: Using two date fields to create trend across time
                        Alexander Mou

                        I have written on similar topics

                        Vizible Difference: Taking Stock with Start and End Dates

                        Vizible Difference: Counting Active Orders via Pivot

                         

                        It might help solving your problems.

                        1 of 1 people found this helpful
                        • 9. Re: Using two date fields to create trend across time
                          Yuriy Fal

                          Hi all,

                           

                          Adding a Pivot approach mentioned above.

                          Please find the attached.

                           

                          Yours,

                          Yuri

                           

                          PS Cartesian also works, but two caveats:

                           

                          -- [Hire Date] doesn't contain all dates

                          between MIN & MAX (it's not domain complete).

                          This is not a problem with MONTH aggregation, though.

                           

                          -- The resulting number of records are at the 1M range.

                          This is not a problem with an extract in place, though.

                          1 of 1 people found this helpful