Thanks in advance for your help!
I am using Tableau version 18.2. I have made a simplified version of the problem I am working on in the packaged workbook.
I have two very large data sets from hospital data (>500,000 patients) where patients are uniquely identified by an encrypted NHI. My goal with these data sets is to be able to identify patients who have a kidney injury following treatment with a drug (Ibuprofen) that is known to cause kidney injury in some instances. To do this I have access to two data sets.
-One spreadsheet contains all of the medications prescribed to hospital inpatients over a 6 month period, organized by encrypted NHI. The date that they were prescribed is in dtm format.
-One spreadsheet contains the results of all of the kidney function tests (eGFR test) ordered over an 18 month period (starting 6 months before the medication prescription data and ending 6 months after the medication prescription data), organized by encrypted NHI. The date that the test was reported is in dtm format.
I have joined these spreadsheets using an inner join with encrypted NHIs. The spreadsheets themselves are unable to be modified.
I have used a calculated field to split the date that the kidney function test (eGFR test) was reported into before and after the date that medications were prescribed as below
IF [eGFR test reported dtm]<[(Prescription Created Dtm)]
THEN "Before Ibuprofen"
ELSEIF [eGFR test reported dtm]>[Encrypted NHI (Sheet11)]
THEN "After Ibuprofen"
I used a filter on the graph so that it only displays data from patients that were prescribed ibuprofen and not any other drug. This has worked perfectly to visualize what the before and after values were for individual patients and on average but now it is time for the next step.
I need to exclude patients that don’t have BOTH a “Before Ibuprofen” and an “After Ibuprofen” kidney function test (eGFR test) because without both sets of data I cannot identify a kidney injury accurately.
I tried a range of calculations including
IF [eGFR test reported dtm]<[Prescription created] AND [eGFR test reported dtm]>[Prescription created]
THEN "Full data set"
ELSE "incomplete data set"
This calculation is valid but it only gives me the option of filtering to “incomplete data set” which is not all that helpful.
How can I filter out patients that don't have both before ibuprofen and after ibuprofen values?
Example question - workbook.twbx 62.6 KB