7 Replies Latest reply on Sep 7, 2018 2:39 PM by Jana STUPAVSKY

    Data source joining, matching the rows and hierarchies


      I have workforce-related files from 100+ ZIP codes and thanks to my previous thread I was able to figure out how to prep them to certain extent using Tableau Prep. I am seeking more help. 

      I am attaching two samples to paint a picture for my two challenges - they are files named in the following pattern County_City_ZIP.xls. I was able to pivot them, join them and unionize them with Tableau Prep and Desktop. Here is my next set of challenges:


      If you notice, these two files, under "Occupations" tab, have a column header SOC and each SOC corresponds with an occupation description in the next column. Imagine, after establishing a union, my prepped CSV file (let's call it a Master file) has a long repetitive SOC column (and description column), because they repeat themselves across 100+ ZIP codes. Now I need to bring in the other file I have attached here: soc_structure_2010 to add some more columns to my Master file.

      In this soc_structure_2010  file, the "Detailed Occupation" (D) column should match with the SOC column from the Master, and I would like to include in the Master the corresponding hierarchically higher groups from previous column - A, B and C, where, I also would like that blank cells are filled out all the way through. Once that's done, I would like to be able to establish hierarchies. 


      Please note - the soc_structure_2010 may have more codes available than I have in my Master file, so I need to make sure these are matched properly.


      Anybody who can help me with this? Is this a Prep or just a Desktop joining challenge? In either case, could you provide some steps to follow please? 

        • 2. Re: Data source joining, matching the rows and hierarchies
          Jana STUPAVSKY

          Yes it's my thread. But I am not sure how that can help me bring in the new spreadsheet I am trying to incorporate.

          • 3. Re: Data source joining, matching the rows and hierarchies
            Jana STUPAVSKY

            OK - I think I made some progress but I am very unsure about the result.


            I basically just merged the columns from the soc_201_structure files. That turned out to be the easy part - at least it looked like it. 


            Then I brought it into the attached workbook and that is where my confusion starts:

            1) did I join the data properly?

            My green SOC 2010 column and green Occupation description2010 column should include all of the blue SOC 6-digit column options and blue Occupation Description options respectively, plus more. But I want to make sure that data from the blue columns match up with the corresponding rows of the green columns. Makes sense? Then, if we can figure that out, my next question is:

            2) I have a need to create occupation group hierarchies, feeding from the two green columns. See the excel sheet to understand the groupings. Those green columns include broader groups of the most granular data that are in the corresponding blue columns and I have to figure out how to enable the user the drill up or down the occupation group hierarchies. I started with a formula in SHeet 5 but didn't get far,


            Any help will be much appreciated.

            • 4. Re: Data source joining, matching the rows and hierarchies
              kumar bharat

              Hi Jana,

              Glad to know you made some progress.



              • 5. Re: Data source joining, matching the rows and hierarchies
                Jim Dehner

                Good morning Jana


                see the attached


                not 100% there but something to think about


                Yes you joined the dataset together properly


                there is a tab (JD check) that lest the descriptions and code  - there are 4 values without a match in the data





                don't know why -



                the tab Label Hierarchy produces this



                and it uses a hierarchy with the following


                and this applies the occ des



                you can see what it looks like fully expanded - when you collapse the hierarchy you get this



                I tried a second approach tab Hierarchy 2 using this for applying the occ des

                what it does is this




                Don't know what you wanted to show




                If this posts assists in resolving the question, please mark it helpful or as the 'correct answer' if it resolves the question. This will help other users find the same answer/resolution.  Thank you.

                1 of 1 people found this helpful
                • 6. Re: Data source joining, matching the rows and hierarchies
                  Jim Dehner


                  don't mean to mess up your weekend but its raining here in Nashville so I had time to work on this


                  here is a third approach using a different label formula with the same hierarchy



                  when there are multiple detail occ codes it give the max and min label like this


                  • 7. Re: Data source joining, matching the rows and hierarchies
                    Jana STUPAVSKY

                    Hi Jim - I was finally able to get to this, my apologies. I studied your approach and tried to replicate that. For whatever reason it-s not working for me yet, but I see what you did - you dissected each 6 digit SOC code (the most granular one) into three pieces to create the labels and then you worked back and built major minor and broad categories SOC codes. It helped great deal.


                    However, like I said it's not giving me the same numbers in my worksheet, but more importantly, I think I was looking for something more than the solutions you provided and I think it's very easy, my brain is just on freeze.


                    If you look below - this is from your workbook you sent me, I have the hierarchy drilled all the way down. (Jana 9/7 worksheet in the attached workbook)

                    But if I start collapsing it back, the numbers at the broad, minor and major group remain the same. I am looking for those numbers to be summed up (thus grow) for each respective group as you  hit the minus sign (and vice versa - numbers break down and are distributed among respective occupations as you hit the plus sign). With drilling and collapsing the hierarchy, I am also looking to be able to view the Occupation descriptions for major, minor and broad groups. Right now I am only able to see the detailed description. Those should be pulled from the Excel file that I attached and I have the data joined (hopefully correctly). I feel like the glitch may be somewhere there in my joining but I am unsure.   


                    Just to make sure I am expressing myself correctly, if you follow the Excel sheet I attached,

                    I want to be able to show:

                    11-0000 Management Occupations 500 Jobs (the number is made up)

                    then as you hit the minus sign, the 500 jobs break down to:

                         11-1000 Top Executives 100 jobs

                         11-2000 Advertising, Marketing, Promotions, Public Relations, and Sales Managers 100 jobs

                         11-3000 Operations Specialties Managers 100 jobs

                         11-9000 Other Management Occupations 200 jobs

                    and so on all the way down to detailed occupation 6-digit SOC code. Makes sense?


                    I will be digging further, as I feel like I am close, but any suggestion you may have would be great.