6 Replies Latest reply on Feb 14, 2018 6:53 AM by Michael Hesser

    How to automatically replace composed surnames

    paula.munoz

      Hello Community,

       

      I was wondering if you have an idea to my current issue... I'm trying to clean a huge file that contains names (The Full name is in one single field)... As part of the analysis I want to do, I want to split it between names and surnames, however before doing that I noticed that there are several surnames that are composed (two words) for example "Da Silva".

       

      So what I want to do is to do a search for all names that contains these type of surnames and when a composed surname is found to replace the space in between with a "-"... For example DA SILVA will become "DA-SILVA"

       

      I wrote a calculated field that is working ok, however it is a very manual process since I have to search  and specify all the composed surnames.

       

      So, here is a small sample of the data:

       

       

      I wrote this calculated field:

       

       

      When I use the calculated field, it it takes care of the corresponding names:

       

       

      I want to be able to achieve the same outcome without having to specify each composed surname within my calculated field... Any ideas???

       

      Attached is my sample workbook in V 10.5

       

      Thanks for your help.

       

      Paula

        • 1. Re: How to automatically replace composed surnames
          Jacob Goffin

          Hi Paula,

           

          I've provided a potential solution below.

           

          My approach was to use Tableau's custom split option, which lets you choose a character (in this case a space " ") to break out a field by. With your data, it creates three new fields: "Name - Split 1", "Name - Split 2", and "Name - Split 3" representing the first, middle, and last parts of a persons name. Using those parts, we can reconstruct the name to include the hyphen. Call this calculated field Reconstructed Name: [Name - Split 1]+" "+[Name - Split 2]+"-"+[Name - Split 3]

           

          See the attached workbook for the implementation.

           

          Cheers,

          Jacob

          • 2. Re: How to automatically replace composed surnames
            paula.munoz

            Hi Jacob,

             

            Thanks for looking at this... I did try something similar, however the problem is that not all the surnames are composed, usually are the only ones preceded by "DA", "DE", "DO", "DOS" .. With your calculation it will add the hyphen to all surnames even if they are not composed... For example in my sample data the last three names do not have composed surnames:

             

            Any other thoughts?

            • 3. Re: How to automatically replace composed surnames
              paula.munoz

              Hi Jacob,

               

              Me again... I just tried to modify your calculation a little bit, and maybe I can do something like this:

               

              My other problem would be that in the real data there are full names with more than three names... The maximum I saw were 9, the most common are 4, so I could not split just in three, and also with the long names, I wouldn't exactly know where the preceded word is, like in this case we know is on Name-split 2.

               

              So, I would need something that could get applied to different lengths of names.

               

              Any thoughts?

              • 4. Re: How to automatically replace composed surnames
                Michael Hesser

                I like your initial approach-- but rather than splits... is it easier to write a formula that will look for " DA ", " DE ", " DO ", and " DOS " (with spaces preceding and following), and replacing just those for with their correct counterparts?

                 

                REPLACE([LONGNAME]," DA "," DA-")

                 

                If you NEST all four, you may be able to tackle some complex names. It won't, however, work on a name like "DA DA DA DA DA DA)

                 

                DOS EQUIS ==>                                   DOS EQUIS (not changed because no leading space)

                SILVANA DA MAIA  ==>                        SILVANA DA-MAIA (changed)

                DAVID DE LUCAS ==>                          DAVID DE-LUCAS (the DA in DAVID is not changed due to lack of spaces, but the DE is modified)

                SILVANA DA DE DO DOS MERA ==>   SILVANA DA-DE-DO-DOS-MERA

                DA DA DA DA DA DA DA ==>               DA DA-DA DA-DA DA-DA (Note how it misses some)

                 

                Would this fulfill your criteria without having to worry about having to list all those last names?

                1 of 1 people found this helpful
                • 5. Re: How to automatically replace composed surnames
                  paula.munoz

                  Hi Michael,

                   

                  YES, I just tried this out and it works... LOL, I'm not sure why my initial thought was to list all the composed last names when I could just write the few preceded words... LOL, that is why I love the Tableau community, more heads think better than one.

                   

                  This is what I juts did, and it works:

                   

                   

                  Thanks so much for bringing this thought to my attention.

                   

                  Have a nice day,

                   

                  Paula M

                  • 6. Re: How to automatically replace composed surnames
                    Michael Hesser

                    You might try NESTING those and just making 4 replaces.

                     

                    Please mark the answer helpful or correct if it worked!

                    1 of 1 people found this helpful