5 Replies Latest reply on Dec 3, 2019 7:38 PM by Paul Wachtler

    Help with comparing dates between two spreadsheets

    Hayley Nehoff

      Thanks in advance for your help!

       

      I am using Tableau version 18.2. I have made a simplified version of the problem I am working on in the packaged workbook.

       

      I have two very large data sets from hospital data (>500,000 patients) where patients are uniquely identified by an encrypted NHI. My goal with these data sets is to be able to identify patients who have a kidney injury following treatment with a drug (Ibuprofen) that is known to cause kidney injury in some instances. To do this I have access to two data sets.

      -One spreadsheet contains all of the medications prescribed to hospital inpatients over a 6 month period, organized by encrypted NHI. The date that they were prescribed is in dtm format.

      -One spreadsheet contains the results of all of the kidney function tests (eGFR test) ordered over an 18 month period (starting 6 months before the medication prescription data and ending 6 months after the medication prescription data), organized by encrypted NHI. The date that the test was reported is in dtm format.

      I have joined these spreadsheets using an inner join with encrypted NHIs. The spreadsheets themselves are unable to be modified.

       

      I have used a calculated field to split the date that the kidney function test (eGFR test) was reported into before and after the date that medications were prescribed as below

      IF [eGFR test reported dtm]<[(Prescription Created Dtm)]

      THEN "Before Ibuprofen"

      ELSEIF [eGFR test reported dtm]>[Encrypted NHI (Sheet11)]

      THEN "After Ibuprofen"

      END

       

      I used a filter on the graph so that it only displays data from patients that were prescribed ibuprofen and not any other drug. This has worked perfectly to visualize what the before and after values were for individual patients and on average but now it is time for the next step.

       

      I need to exclude patients that don’t have BOTH a “Before Ibuprofen” and an “After Ibuprofen” kidney function test (eGFR test) because without both sets of data I cannot identify a kidney injury accurately.

       

      I tried a range of calculations including

      IF [eGFR test reported dtm]<[Prescription created] AND [eGFR test reported dtm]>[Prescription created]

      THEN "Full data set"

      ELSE "incomplete data set"

      END

      This calculation is valid but it only gives me the option of filtering to “incomplete data set” which is not all that helpful.

       

      How can I filter out patients that don't have both before ibuprofen and after ibuprofen values?

        • 1. Re: Help with comparing dates between two spreadsheets
          Paul Wachtler

          Hi Hayley,

           

          That calculation is only returning "incomplete data set" because for any one [eGFR test reported dtm] value it is either before or after the prescription is created but not both.  Which field do you use as your patient ID?

           

          Also, based on what you have now, what do you expect it to return?  I see four rows (not sure if theyr'e patients) but each row looks like it has before and after ibuprofen values based on the colors.

           

          Screen Shot 2019-12-02 at 8.07.56 PM.png

           

          Best,

          Paul

          • 2. Re: Help with comparing dates between two spreadsheets
            Hayley Nehoff

            Hi Paul,

             

            Sorry for the poor description!

             

            So the four rows that you are seeing (AAG20916, AAS9165, ABJ729, ABZ18972) are the patients encrypted NHIs. AAG20916 only has one eGFR after they were given Ibuprofen (i.e. no eGFR before ibuprofen is available) so I want a calculation that knows to exclude that patient but not the other patients who all have eGFRs before and after they were given the drug.

             

            OK so the calculation isn't working because it can't be simultaneously before and after, that makes sense! thank you!

             

            How might I write a calculation that knows that I want it to look for before and after within the same NHI?

             

            Thanks for your help,

             

            Hayley

            • 3. Re: Help with comparing dates between two spreadsheets
              Paul Wachtler

              Thanks for the clarification Hayley.  That all makes sense.  I got this working using an LOD calculation.  Since I'm using a FIXED LOD, I changed your Prescribable Item Name filter to a context filter.  This makes sure that this is processed before the LOD is calculated.

               

              {FIXED [Encrypted NHI]: MAX([Before/after ibuprofen] = "Before Ibuprofen")}
              AND
              {FIXED [Encrypted NHI]: MAX([Before/after ibuprofen] = "After Ibuprofen")}
              

               

               

              This will return true only if both the "Before Ibuprofen" and "After Ibuprofen" values exist for a given Encrypted NHI.  Again, since the Prescribable Item Name field is a context filter, it will be processed before the LOD.  I add the NHI filter to the viz in the below screenshot and the attached workbook so you can see that it's marking the correct values as True and False.

               

              Screen Shot 2019-12-02 at 11.24.30 PM.png

              1 of 1 people found this helpful
              • 4. Re: Help with comparing dates between two spreadsheets
                Hayley Nehoff

                Thank you so much for this Paul! That works perfectly!

                 

                I am clearly going to have to look into the LOD calculations more, I wasn't trying to approach it from this angle.

                 

                Thanks again, I really appreciate your expertise!

                • 5. Re: Help with comparing dates between two spreadsheets
                  Paul Wachtler

                  You're welcome!  Happy to help!