8 Replies Latest reply on Jul 21, 2016 11:22 AM by Sam R

    Comparing values in different columns and excluding those that match

    Sam R

      HI All,

       

      I am looking for some guidance on the below issue:

       

      i have 5 different columns (CED, CENTERSTONE...). i am looking for a way to exclude the entry where the value matches accross the employee no.

       

      Example ; for the 1st employee 000002 : each of columns have SUNNYVALE which matched and i want to exclude it.

       

      The formula i have tried is :

       

      IF ([CED] = [CenterStone]

      and [CenterStone] = [OIM]

      and [OIM] = [PPLSOFT]

      and [PPLSOFT]=[PROWATCH] )

      THEN 'EXCLUDE'

      ELSE 'INCLUDE'

      END

       

      This is not working as the value is coming INCLUDE for all. any suggestions?

       

      Capture.JPG

        • 1. Re: Comparing values in different columns and excluding those that match
          Tom W

          Hi Sumeet,

          1. You need to attach a Tableau Packaged Workbook which includes your data extracted. You've attached a packaged workbook but the source is still live, thus I cannot open it.

          2. I wouldn't use your real data here as it seems sensitive. You'd be better to create a scenario or extract a subset which has been cleansed.

           

          I'm betting your problem is a result of the exact contents not matching. I bet one of those columns has blank / white space after the value. A way I normally troubleshoot these types of issues is to build out the conditional statement in smaller pieces. I.e. put your first comparison in the IF statement for CED = CenterStone only. DOes that give you the result you want? Then continue adding fields until it errors.

          I think you will find the answer is to compare like IF TRIM([CED]) = TRIM([CenterStone]) ..... to fix the whitespace problem.

          1 of 1 people found this helpful
          • 2. Re: Comparing values in different columns and excluding those that match
            Dmitry Chirkov

            +1

            Just decompose your calculation and see which one (or many) of comparisons return FALSE.

            • 3. Re: Comparing values in different columns and excluding those that match
              Sam R

              Hi Tom,

               

              Thanks for the note. i did try the trim but same results. it seems for some reason its not picking up the string value.

              • 4. Re: Comparing values in different columns and excluding those that match
                Dmitry Chirkov

                Where is your data coming from? (seems like you deleted attached workbook)

                 

                It does work with mock data I constructed so it's all about small diffs in the data (extra spacing, collations etc).

                 

                • 5. Re: Comparing values in different columns and excluding those that match
                  Tom W

                  sumeet ruiwale wrote:

                   

                  Hi Tom,

                   

                  Thanks for the note. i did try the trim but same results. it seems for some reason its not picking up the string value.

                  Break your calculation down into pairs to see which pair of fields isn't returning true.

                   

                  Looking again at your screenshot, it doesn't look your report is using individual fields like 'CED' it looks like your report is referencing a field called Source Sys which has the dimension values of CED, Center Stone etc.
                  Can you share the definition for the fields you've setup i.e. [CED]?

                  • 6. Re: Comparing values in different columns and excluding those that match
                    Sam R

                    Hi Tom,

                     

                    Yes your correct it is one field source system which has values for CED, Centerstone etc. i do have individual columns as well but the alignment does not work correctly for each employee when i do that.

                    CED :

                    IF [Source Sys] = 'CED' then [Derived Emp Location] ELSE ' ' END

                    • 7. Re: Comparing values in different columns and excluding those that match
                      Tom W

                      The problem is that your data is 'deep'.

                       

                       

                      You've created calculated fields like CED which give you the appearance that you can compare these items like they are columns in an Excel spreadsheet but the reality is, they aren't. This is best demonstrated by looking at your Source System dimension alongside with each of the calculated fields you created;

                       

                      When you use a calculation like 'CED' alongside the 'Source Sys' dimension, it's only going to return 'SUNNYVALE' for the 'CED' row in the Source Sys dimension.

                       

                      The solve here is to use aggregates to figure out if everything matches. I.e. does the minimum value of the derived employee location field for Employee No 00002 match the maximum value? MIN and MAX will work on strings as well.

                      If you use a Level Of Detail calculation, you can force the calculation to run at the Employee Level while still showing the source system level of detail.

                       

                      Here's an 'all match' calculation:

                      if {FIXED [Employee No]: max([Derived Emp Location])} = {FIXED [Employee No]: min([Derived Emp Location])} then 'ALL MATCH' else 'NOT MATCHING' end

                       

                      If you add that into your report you'll see it's working!

                       

                      However, this won't deal with NULLS. I.e. EMP0056 only has records for OIM and PROWATCH. Does that mean you would include it or exclude it in your report? You could run a second check which is 'Is the Employee in all source systems?'. Similarly, you'll use a Level of Detail calculation to calculate the total number of source systems the Employee is in. If it doesn't match your number of source systems (6 in your case), flag it;

                       

                      IF {FIXED [Employee No]: count([Source Sys])} = 6 then 'AllSources' else 'NotInAllSources' end

                       

                      You can combine the logic together to get the result you're after i.e.;

                       

                       

                      if {FIXED [Employee No]: max([Derived Emp Location])} != {FIXED [Employee No]: min([Derived Emp Location])} || {FIXED [Employee No]: count([Source Sys])}  != 6 then 'INCLUDE' else 'EXCLUDE' end

                       

                      I've also attached a trimmed down version you can take a look at.

                      Wow this got long...!

                      • 8. Re: Comparing values in different columns and excluding those that match
                        Sam R

                        Thank you so much. this works great. appreciate your detail breakdown on the issue and solution.