11 Replies Latest reply on Feb 8, 2019 10:58 AM by Dan Cory

    Grouping/visualizing by dates where rows don't have a "date stamp"

    Eric Sammons

      Hi,

       

      I have a dataset that was created by our business intelligence team.  When they pull they capture the data they are not including a datestamp column, instead, they are using start_dttm and end_dttm and applying sql to create views.

       

      I'd like to create a visualization using the data at hand, without having to go to the view; in my specific use case a working view isn't available yet.

       

      The sql is essentially doing the following:

       

      if date >= status.start_dttm
      AND date < status.end_dttm
      AND issue_created_date <= date
      

       

      I have attempted to create a calculated field as follows:

      COUNTD(IF CONTAINS("New", [Status])
      THEN [Bug Id]
      END)
      

       

      Which clearly does not cover all the above conditions, however, when I am able to see, when using JQL (in Jira) to confirm results that my numbers are way off.

      Comparable JQL to validate:

      project in (PROJ1, PROJ2, ... ) and status = New
      

       

      The above will show me the current items in New as of now.  When I compare this to my full dataset (600,000 rows) I end up with ~340 items in January 2019 followed by a December 9999 date with 2,400 rows, which nearly matches the JSQL.    December 9999 reflects those rows that do not have a end_dttm, end_dttm reflects the end date that a row was in a state, i.e. the status changed.

       

      What I hope to accomplish is:

      • Eliminate the December 9999 column, it should not be part of the viz. 
      • Proper numbers for items in New during/by month/year with the ability to filter at a monthly or weekly view.
      • I will add a filter by project option which should have little to no consequence with solving the original problem.

       

      I'm currently struggling with how to leverage the fields provided, start_dttm and end_dttm, to accomplish what the BI team accomplishes with SQL.

        • 1. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
          Don Wise

          Hi Eric,

          I think you will probably need to 'scaffold' a set of continuous dates in order to get the below visualization.  In the attached, I revised your workbook to bring in a set of continuous dates from 1/1/2016 through 12/31/2017 as a separate MS-Excel worksheet.  That data is joined to your data using a simple 1:1 join. The join on the left is a '1' and same on the right '1' (uses 'add new join clause') and set to INNER join to reduce down the number of rows between your data and the continuous dates data.  A True/False filter set to True further brings that down and that's using the following calc:

           

          [Date]>=[Start Dttm]

          AND

          [Date]<=[End Dttm]

           

          Which is essentially doing what the SQL was doing prior. Instead of using either your Start or End Date on Columns, use the new DATE field from continuous dates data to fill in the missing dates and get an accurate count using your New Backlog Calc.  Hope it helps!  Thx, Don

           

          Screen Shot 2019-01-30 at 12.24.22 PM.png

          • 2. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
            Eric Sammons

            Thank you for your response.  This looks promising but I need to translate it to what I have available in our Database.  I do have a table available that appears to be what the BI team uses to do essentially what you are doing.  However, where your example uses one excel sheet with 1 column of dates, the table in question is set up to have calendar_dates, calendar_end_of_month_date, calendar_last_day_of_quarter, etc...

             

            So my question is then, what would my inner join be here.  Not sure a calculated join would work in the same way with this data structure.

            dm_date_did

            calendar_date

            calendar_first_day_of_quarter

            calendar_last_day_of_quarter

            calendar_month_end_date

            calendar_month_start_date

            calendar_week_end_date

            calendar_week_start_date

            calendar_day_of_year_number

            calendar_month_number

            calendar_month_name

            calendar_quarter_number

            calendar_week_of_fiscal_quarter

            calendar_week_of_year_number

            calendar_year_end

            calendar_year_number

            calendar_year_name

            Thanks!

            • 3. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
              Don Wise

              I would think Calendar_Date if it's continuous without any breaks...Thx Don

              • 4. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
                Eric Sammons

                right, but where you have 1 = 1, am I then in this situation going to use end_dttm = calendar_date?  This is what I'm attempting to do now.

                • 5. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
                  Don Wise

                  Hi Eric,

                  Unless I'm not understanding your environment here, you would ADD that Calendar_Date as an additional data source then do the join clause as 1 for your original data and 1 for the calendar_date data to create the scaffold.  Essentially taking two separate tables of data and joining them. Are you not able to use the Drop down and select 'Edit Join Calculation' like below?

                  Screen Shot 2019-01-30 at 1.33.24 PM.png

                   

                  But perhaps what I've described won't work with your environment/DB. Thx, Don

                  • 6. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
                    Eric Sammons

                    So calendar_date is one column in a much larger table.  So to get closer to what I believe you are suggesitng I did a custom sql,

                    select calendar_date from dm_date_d

                    From here I created the inner join and did edit join calculation, 1 = 1. 

                    Screen Shot 2019-01-30 at 1.55.18 PM.png

                    However, I now get a tableau error.

                     

                    ERROR: TEIID30068 The function 'strpos(substring('New', 1), CAST(jira_bug_stts.status AS VARCHAR))' is an unknown form. Check that the function name and number of arguments is correct.; Error while preparing parameters
                    • 7. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
                      Don Wise

                      Hi Eric,

                      Yes, I'm not familiar with that environ or DB. So...perhaps the following might work better if possible?

                       

                      First line of join:

                      Use Start Dttm from Data Source but instead of the = sign use   >=  for the join condition and on the Right Side of the Join use your Calendar_Date

                       

                      Then once more on the next line down:

                      Use End Dttm but instead of the = sign use the <= for the join condition and on the Right Side use your Calendar_Date.

                       

                      That should also mimic the original filter calculation.  So if no error thereafter, then you wouldn't need that calc afterwards.  But instead of using either Start Dttm or End Dttm, you'd use your Calendar_Date on Columns thereafter.

                       

                      That's best I can think of...Thx, Don

                      • 8. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
                        Eric Sammons

                        I appreciate all the help and I believe I understand what we're trying to do here.  I am still running into issues though, with either approach.  I wonder if the issue is related to start_dttm and end_dttm being in another table.

                         

                        Currently, things look a bit like this:

                         

                        SCHEMA_1.jira_bug_mtrcs : SCHEMA_1.jira_bug_kywrds_and_rstln

                                                                    : SCHEMA_1.jira_bug_stts

                                                                    : SCHEMA_1.jira_bug_type

                                                                    : SCHEMA_2.Custom_SQL_QUERY

                         

                        All joins for SCHEMA_1 tables are on bug_source_id while the SCHEMA_2 join is currently set to 1:1 with jira_bug_mtrcs; however, is it possible that the problem may be related to the fact that the *_dttm columns are in jira_bug_stts and not in jira_bug_mtrcs.

                        • 9. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
                          Don Wise

                          Hi Eric,

                          That's the struggle with trying to answer things on the forums where the environment is different with each user.  Conceptually I think you understand where you need to get to, but the challenge is always applying the concept to actual data and structure.  Back to what I recently thought about, is to forget about the 1:1 join and simply join on common date fields, your Calendar_Date in whatever table that field lives in to both Start_Dttm and End_Dttm in bug_stts. Where Start_Dttm is >= Calendar_Date and End-Dttm is =< Calendar_Date.

                           

                          It may take a number of tries before getting there but the data types need to be the same (date).  Thx, Don

                          • 10. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
                            Dan Cory

                            Don is right you should just be able to do the join on Start_Dttm and End_Dttm, both to Calendar_Date.

                            It looks from your error message that you are using TEIID, which is not a Tableau-supported connector. The error message suggests a problem with the STRPOS function, usually FIND in Tableau. I don't think that is caused by anything being described here, but instead by something else in your workbook.

                            The STRPOS function is probably fixable but you'll need to tell us more about how you connected and probably find the full SQL statement that is causing the error from the logs.

                             

                            Dan

                            • 11. Re: Grouping/visualizing by dates where rows don't have a "date stamp"
                              Dan Cory

                              Sorry, it's actually the CONTAINS that you mention that is the problem:

                              CONTAINS("New", [Status])

                               

                              That's getting translated to SQL incorrectly because Tableau doesn't know what version of SQL your database allows.

                               

                              Dan