7 Replies Latest reply on Aug 16, 2018 9:29 AM by Joanna Peng

    Mis-Adding and Multiplying Data Points

    Joanna Peng

      Tableau Community,

       

      Hi! My name is Joanna, and I'm a tableau intern at Fred Hutch Cancer Research Center. I specifically work for HVTN (HIV Vaccine Trials Network.) HVTN runs various clinical trials/studies which we call protocols. There are different clinics around the world in which we perform these studies. These clinics are often called Sites. Each patient is given something called a Ptid (standing for patient ID.) And these patients all have an enroll date (the date they were enrolled (a dimension called Enrolldt)). My project for HVTN is to create tables that give information about the patient enrollment for different sites and protocols. There are two main data points in my tables; the first being the projected/expected number of patients that SHOULD or WILL enroll (the dimension is called Enrollment Projected For Week), and the second is the number of patients actually enrolled (we calculate this by using COUNTD of Patient ID.)

       

      If I create a table including the "Week of  Enrolldt" along with a COUNTD of No. (#) Expected Sun Sat (which is the same thing as "Enrollment Projected For Week") and COUNTD of Ptid... then I get a table containing what looks like accurate numbers (Please view Attached Picture #2.) However, if I remove the Enrolldt Dimension, then the Projected Enrollment numbers become obscured and incorrect... but the Number of actually enrolled Patients (COUNTD Ptid) remains accurate(view attached pictures 3, 4, and 5.) Somehow, Tableau is multiplying the actual Projected Enrollment and adding it up to create a really large "Projected Enrollment" that is clearly inaccurate.

       

      Also, if you notice in the 1st attached photo, The Projected enrollment numbers refuse to total correctly. However, to the right, the Distinct count of Patient ID/Ptid Grand Totals correctly. Why isn't Tableau adding and totaling the column of numbers correctly?

       

      The Workbook attached is Tableau desktop version 2018.2

        • 1. Re: Mis-Adding and Multiplying Data Points
          Shinichiro Murakami

          HI Joanna

           

          What is the formula for "No of Expected Patients"?

          Couldn't find in the workbook.

           

          Regards,

          Shin

          • 2. Re: Mis-Adding and Multiplying Data Points
            Don Wise

            Hi Joanna,

            See if the attached and below works for you?  Thx, Don

             

            Screen Shot 2018-08-14 at 4.52.32 PM.png

            Screen Shot 2018-08-14 at 4.52.35 PM.png

            Screen Shot 2018-08-14 at 4.52.42 PM.png

            • 3. Re: Mis-Adding and Multiplying Data Points
              Joanna Peng

              Not sure what you mean by formula... But, No. Expected Sun Sat is a measure that contains the number of patients we expect to enroll within the week. I'm not the one who creates the database, but I can obtain more details about the measure (No. Expected Sun Sat)  if that would be beneficial.

              • 4. Re: Mis-Adding and Multiplying Data Points
                Joanna Peng

                The No. Expected numbers look more accurate, however, I'm certain that Tableau is still multiplying some data which adds up to an enlarged number. Protocols 704 and 703 are pretty much done, so The number of enrolled patients (COUNTD Ptid) should be relatively close to the projected numbers (No. Expected Sun Sat).

                • 5. Re: Mis-Adding and Multiplying Data Points
                  Joanna Peng

                  Although I am curious as to how your grand total was working and why mine wasn't adding the values up...

                  • 6. Re: Mis-Adding and Multiplying Data Points
                    Don Wise

                    Hi Joanna,

                    To help bring down the numbers in the data set, we can use what are known as Level of Detail calculations.  This is what the calculation looks like and then comparatively, the results to the prior set of data:

                    Screen Shot 2018-08-15 at 9.22.18 AM.png

                    With that LOD, it is looking at each row of data and where it finds an Expected Enrollment value compared to Enrolled Date, PTID, Protocol and Site Name, it then this particular calc COUNTS a result.  In most cases, Automatic is just fine, but sometimes helps to go to and tell Tableau (Analysis - Totals - Total Using) how to Total.  Where there's a mix of Counts and Sums, then Automatic should be used.

                     

                    But, the more that I look at the underlying data, I'm thinking the underlying issue is how the Projected Enrollments are being attributed to each patient?  I'll give you an example.  I dove into the weeds...  This is a view for just one patient in Seattle and one in Cape Town.

                     

                    Seattle:

                    Screen Shot 2018-08-15 at 10.11.27 AM.png

                    Cape Town:

                    Screen Shot 2018-08-15 at 10.15.17 AM.png

                    Please note the column, which is the supplied data of Enrollment Projected for Week vs. my calculation which is COUNTING the number of instances found not SUMMING. So while my calculation may have brought the numbers down to a more reasonable amount, I don't think it's accurate.  The action should be a SUM, which will bring your numbers way back up again.  Why? 

                     

                    If we look at patient in Seattle and the Enrollment Projected for Week values are what I would expect to see for that type of program.  1s and 2's projected per week.  For the patient in Cape Town we see 19's and 25's etc., which are probably data entry errors where that number is probably for ALL patients for that week, not just that single patient...this is not something we can correct very well in Tableau.  Unless there's a possible maximum to artificially limit those numbers, then a calculation could be written to exclude any data attributed to a patient id for Enrollment Projected for Week values > than 4 or 5?  I'm just brainstorming here.  You'll know your data better than me.

                     

                    Let me know your thoughts?  Thx, Don

                    • 7. Re: Mis-Adding and Multiplying Data Points
                      Joanna Peng

                      Don,

                       

                      Thanks for taking the time to fiddle with our data.We suspected that the format of our data base was causing the problems. So, we reformatted and its working great. Thanks again for your input.

                       

                      Joanna