8 Replies Latest reply on Apr 10, 2017 3:37 PM by TJ Lensing

    Best way to structure, store, and calculate hierarchy data?

    TJ Lensing

      Hi, I'm after some advice!


      I am wanting to create a world map where each country in the world displays a value between 0-100. Then users can click on a country and be able to drill down into the underlying data that determines that value (perhaps as a separate table in a dashboard). The country's value is calculated from values (0-100) assigned at the very end of a tree structure and then averaging those values up a series of hierarchies to get to a final value.


      For example, each Country has a Category A with three branches (Weather, Outdoors, Food). Each Category A branch has its own Category B branches, etc... Values are only ever assigned at the very end to B.


      But then B values are weighted and averaged to arrive at an A branch value. Then A branch values are weighted and averaged to arrive at a Country value. So the U.S. would get a value of 45.4.


      Users would see a map with the U.S. showing 45.4 and be able click on it to see the A branch values and B branch values in a table or chart if they wish - with some sort of filtering options available.


      So! I have questions about the feasibility of: this way of approaching structuring the data, visualizing this in Tableau, and whether a spreadsheet or database would be better to house the data. Once built, we'll want to grant access to individuals, as well as be able to embed/display it on outside websites as well. The current thinking is that we'll purchase Tableau Online.


      In the example above there are 3 countries with 7 different values each. Ultimately, the entire data set will be for about 200 countries with about 50 values each, and multiple interim branches - perhaps out to F. So, connection latency as a potential issue is coming to mind.


      The idea of all of this in a single page Excel file seems overwhelming. Especially as the structure could ultimately add or subtract some branches, categories, values, or countries. Some thoughts I'm having:


      • Should the data be broken up across multiple Excel sheets and/or workbooks somehow?
      • Would the calculations in the orange area be better handled by Tableau than Excel?
      • Should the data be organized into a database like a cloud-based service instead of Excel?
      • If using a database, would the calculations in the orange area be better handled by SQL?


      I'm fairly new to Tableau and databases, so have lots of different ideas floating around in my head. I'm reading about hierarchies and calculated fields and data extracts in Tableau. I'm reading about MySQL and considering learning how to do some querying/calcs. I'm reading that in Tableau I could join multiple Excel spreadsheets/pages. I also read the multidimensional hierarchies only work in Windows (http://onlinehelp.tableau.com/current/pro/desktop/en-us/buildmanual_multidimensional.html)? All this leaves me a bit overwhelmed.


      At the very least, I'd love any opinions that could help me eliminate which routes are obviously bad ideas - because I don't know!


      Thanks for entertaining such a nebulous request. I'd be very grateful for any pointers in the right direction.


      Thank you!

        • 1. Re: Best way to structure, store, and calculate hierarchy data?
          Sarah Battersby

          Hi TJ Lensing,


          Wow - this is quite a question!  It sounds like fun though... would you be able to share the Excel file that you used for the images in your question?  Or at least share a subset of it?  I'd like to take a shot at seeing what I can do to help out.  My initial thoughts on the questions are that one Excel sheet is probably fine if the data really looks like the example pictures, the calculations in Orange would probably be good to do in Tableau.  As for putting the data in a cloud-based service instead of Excel - that's probably just a personal preference issue, and if you preferred to use a database, I would still probably do the Orange calculations in Tableau...




          Tableau Research.

          • 2. Re: Best way to structure, store, and calculate hierarchy data?
            TJ Lensing

            Hi Sarah,


            Wow, thanks so much for replying! I know it's a lot, but I just figured I'd throw it all out there and hope for any insights I could, so I appreciate it very much. And...I meant to attach the Excel in the original post. Here it is now. The "all" tab has all of the data, but some of the other tabs have structures I've been experimenting with.


            In my very new experience, I'm agreeing with you that doing the calcs in Tableau seems better, and sticking with Excel as opposed to a database. I mean It probably wouldn't be more than 10,000 rows and more than 15 columns. I read some more about storing hierarchy data like this in a database and it seems like that might get unnecessarily complicated quickly: sql - Database Structure for Tree Data Structure - Stack Overflow However, in Excel, flexibility in updating could get messy, like if I wanted to add another type of item within A, I'd need to do that 200 times. Well, one thing at a time


            One idea that partially worked was to store all of the "Country, A, B, Value" dimensions (is that the right word?) in one sheet ("main data" tab A-D columns), then in a separate sheet just the "A weights" (so as not to have to repeat them every time), and in a separate sheet "B weights", then do a join in Tableau. That worked nicely as an option for reducing redundancy.


            I'm struggling with accomplishing some of the aggregated averages calculations in Tableau, but I'll need to keep experimenting in order to get more specific on what to even ask there.


            Anyway, thanks so much for your reply and willingness to poke around at it. Very appreciative!




            • 3. Re: Best way to structure, store, and calculate hierarchy data?
              Sarah Battersby

              Hi TJ,


              This was a fun project for the morning!  I grabbed your Excel data and threw it into Tableau and mimicked the calculations that you had in the Excel workbook.  Here are the basics of what I did:


              • Since your data was broken up into tabs for the data values (with country, a, and b) I did a join with the "main data" page (I didn't use the page with just the values on it because of the extra value at the bottom of the list - it left me with some 'nulls' that I had to then filter out because Tableau just sees that row as a row that wouldn't join...since it had no country, A, or B).
                • Left join with A weights - the left join made sure that everything on the original sheet is kept, just in case there are rows that don't match to a weight
                • Left join with B weights
              • To get to the final country average calculation you could just do one big level-of-detail (LOD) calculation, but I broke it up into a few calculated fields so that I could duplicate the table in Excel and so that you could see how each calculation is done... To try to make them easier to find in the list I started each with an underscore and tried to roughly match the names of the columns in your Excel file.
              • Once the calculations are done, the fun part is making the dashboard:
                • Separate sheets for an A Branch Value table and B Branch Value table (I may have messed up the calculations for these individually with respect to how you want to report the weighted values, so let me know if it doesn't make sense and you can't figure out how to fix it to return what you want in each individual table)
                • Worksheet for the map, with the three countries
                • Throw them all on a dashboard and make the map work as a filter - select the map sheet in the dashboard, look in the upper right corner of the map, and click on the little funnel (use as filter...I circled it in red below).  The tables should automatically update based on the matching country fields.  Example below shows selection of US with the tables filtered.





              Does this seem to be what you're getting at doing?  When you're working with your full dataset, you should be able to follow the same general pattern.  Just put your main data in and join your weighting fields.  Then a few quick calculations and you're good.


              You'll want to read up on Level of Detail calculations (Overview: Level of Detail Expressions), once you've written a few it starts to make a lot more sense


              I've attached a workbook with my examples for you to use as reference. 



              • 4. Re: Best way to structure, store, and calculate hierarchy data?
                TJ Lensing

                Hi Sarah,


                I thank you so much for taking your time to try to help me out here, I really appreciate it! In general, I believe I follow what you're doing, but I think I'll get a clearer picture once I can access the .twbx file.


                I don't own a Tableau product at the moment (I've been using Public for now as we are researching approaches). It seems I can't open your attachment in Public. I also downloaded and installed Reader, but I was only seeing a bar chart with some tabular text data there. I'm thinking I'll try a free trial of Tableau Desktop next.


                Anyway, once I can get into the file, I'll write back asap. And thanks for the link to the LOD overview. That looks great. Being a newb, some of the terminologies have been confusing me some: Sets, Hierarchy, Groups, Aggregates, LOD, Extracts... But I am getting there slowly but surely


                Well, thank you again, I'll be in touch soon!



                • 5. Re: Best way to structure, store, and calculate hierarchy data?
                  Sarah Battersby

                  Hi TJ,


                  I have put the file up on Tableau Public now so you can at least take a look at it live (dashboard with map and tables, and a few additional worksheets) while you ponder a trial of Tableau Desktop


                  Let me know what questions come up and I'll see how I can help out.



                  • 6. Re: Best way to structure, store, and calculate hierarchy data?
                    TJ Lensing

                    Hi Sarah,


                    Thank you for adding it to Tableau Public! I beat you to it and started a trial of Tableau Desktop


                    I opened your workbook and my jaw dropped as I realized that you got all those calcs to work! I've been studying your workbook and follow everything except the details of those LOD statements (Fixed...etc) in the "_A branch avgs" and "_Country Avg" measures. But I see your comments in the editor there and have the link to the documentation, so studying that is going to be this afternoon's task. It's clearer now that those are the parts doing the heavy lifting.


                    The other big lightbulb that went off was how you put the Measure Values measure on the Text area and then filtered it to generate those additional columns. Brilliant!


                    I can't thank you enough for helping straighten me out. This is amazing. I'll be tinkering around with it some more, but I believe we have a proof of concept here, for which I thank you immensely!!



                    • 7. Re: Best way to structure, store, and calculate hierarchy data?
                      Sarah Battersby

                      Hi TJ,


                      I'm glad the workbook has been helpful!  In terms of working with LOD expressions, I'll share the mental trick that I've used to make sense of them... I think about it in real words and not weird LOD calculation syntax first.  I translate fixed to "group by" defining what I want to aggregate over.  So if I want the sum of sales for every region group, I think about it as  group all of the rows for my regions into one big clump, and then count up the attribute for each region clump.


                      group all rows for my regions would be the fixed [region] on the left side of the LOD

                      sum up the attribute for each region would sum[sales] on the right side


                      And then I string it all together:

                      {fixed [region]: sum([sales])}


                      And if I want one value for every combination of region and subregion it would be fixed [region], [subregion]: sum([sales])


                      because the LOD expression now says - {for every unique combination of region + subregion : sum up all of the rows }


                      Enjoy studying up on LOD expressions



                      • 8. Re: Best way to structure, store, and calculate hierarchy data?
                        TJ Lensing

                        Got it! Wow, that trick/approach makes a lot of sense! I didn't end up getting to study/practice today, but now I'll be even more prepped for when I do! Again, thanks so much Sarah!