9 Replies Latest reply on Apr 2, 2012 1:36 PM by John Mogielnicki

    Table Structure

    John Mogielnicki

      Hello.  I am trying to figure out what the best structure is for our data to make it the most flexible and nimble for use in Tableau.  I have attached a spreadsheet with two potential formats.  The first is our current data format, the second is a proposed new format.  I have read the knowledge base articles on this subject (http://kb.tableausoftware.com/articles/knowledgebase/denormalize-data and http://kb.tableausoftware.com/articles/knowledgebase/preparing-excel-files-analysis), which seem to suggest that the current data format would be more flexible, since it is closer to a "list" format.  However, I have been running into limitations in how I can use that format in Tableau.  I will attach a Tableau packaged workbook to illustrate the issue. 


      The first view was relatively easy to create with the data in the proposed format, because each metric is split out and can be added to the view (or used in a calculated field) separately.  The list format seems to be more complicated.  For instance, I can't figure out a way to have the projected and goal units be a shape while keeping the actuals as a bar.  And I cannot figure out what table calculation would replicate the "units percent to goal" field I created.  With the proposed format, where the metrics are split out, this was simple: sum([Units Actual])/sum([Units Goal]). 


      Is there a way to create the view using the current data format?  Are there limitations to the data in the "proposed data format" that I am missing?  Which format would you prefer if given the choice?


      This decision is going to affect the way we capture and store data from here on out, so I want to make sure we get it right.


      Thank you in advance for any advice or assistance.

        • 1. Re: Table Structure
          Alex Kerin

          You could use the current version and use calculated fields like:


          [Directs Actual]


          if [Budget Type]="Directs" and [Summary Type]="Actual" then [Total Dollars] end


          But I think the proposed will be easier. I would suggest using bullet charts for your actual/forecast display. If you want to use different shapes you could use the multiple marks feature.

          1 of 1 people found this helpful
          • 2. Re: Table Structure
            John Mogielnicki

            Thanks Alex, I appreciate your thoughts.  Do you see any downsides to putting the data into the "proposed" format?


            I would love other opinions if anyone has any to offer up.

            • 3. Re: Table Structure
              Alex Kerin

              I really don't see any downsides, but others may chime in.

              • 4. Re: Table Structure
                Richard Leeke

                There was a good thread on this topic a couple of years back which discussed some of the pros and cons of the two approaches. That was back in version 5 days, so it's possible that some of the issues have changed a bit with the advent of table calculations and other things. I imagine that knowledgebase article may capture the main points, too (I haven't looked). Anyway, here's the thread. http://community.tableau.com/thread/105170


                I also had a quick go at reproducing your chart with your current data structure - a couple of sheets showing where I got to quickly.

                1 of 1 people found this helpful
                • 5. Re: Table Structure
                  Alex Kerin

                  Maybe there's a middle ground that would work where Units is in one column with actuals/forecast, then labor in one...

                  1 of 1 people found this helpful
                  • 6. Re: Table Structure
                    John Mogielnicki

                    Thanks Alex and Richard, I really appreciate the thoughts.  Richard, I really like your reproduction of the view using overlapping bars of different sizes.  I think I may change to that format for my dashboard.  Thanks for the idea. 


                    I still find there to be strange difficulties in trying to produce certain information using the "Table Deep" structure (what I call the "Current Data format" above).  For instance, the ">10 nulls" message that pops up in your workbook when you try to split out forecasts and goals from actuals.  I ran into that same issue, which is very frustrating.  I also can't figure out a way to show actuals as a percentage of forecast.  This is very easy using the "table wide" structure (what I refer to as "proposed data format" above).  I just create a calculated field that is "Sum([Units Actual])/sum([Units Forecast].  This seems to be surprisingly difficult to accomplish in the table deep structure however.


                    I think I'm going to stick with Table Wide due to these limitations.  I'm just worried that I will run up against some unforeseen limitation to this structure in the future. 


                    Thanks again for the help guys.  Any last words of advice or caution would be much appreciated.


                    P.S. - I would also recommend that anyone reading this thread with a similar issue check out the link in Richard's post above.  It has some good food for thought. 

                    • 7. Re: Table Structure
                      Bruce Segal

                      Joe: I looked at your .xls and have an alternative for you to consider. It might be the same as the middle ground proposed above.


                      Because you want to compare Forecasts to Actuals to Projected Amounts, have only one column for each $ amount for a month (i.e. combining all the cols you have in your proposed structure for each diff't "budget type") and keep the program type, record type and budget type columns in your existing structure.


                      Then do the same for Units. That should give you 3 measures of Actual $, Projected $, Goal $, that you can compare against each other. See the attached .jpg. BTW you can use a naming convention to have the measures appear in your .tbw in a useful way. E.g. Naming them Actual $, Actual Units, Goal $, Goal Units will group them by your "Summary Type." Or naming then $ Actuals, $ Goal, Unit Actual, Unit Goal will list them by $'s or Units.


                      Bruce Segals Proposed Data Structure.JPG

                      • 8. Re: Table Structure
                        John Mogielnicki

                        Thanks Bruce.  I really appreciate everyone's ideas on this topic.  I also asked this question of the Tableau technical team.  I thought I would post the replies from them so that anyone interested in this topic could have all of the information possible.

                        • 9. Re: Table Structure
                          John Mogielnicki

                          Initial question for Tableau Tech Team:


                          I am responsible for creating an executive dashboard for our client using Tableau.  This is a proof of concept that will allow us to show that Tableau will work for our needs, which will be the deciding factor in our purchase decision for Tableau Server.  I am trying to figure out what structure our data needs to be in to be the most flexible for use in Tableau.  I have read http://kb.tableausoftware.com/articles/knowledgebase/denormalize-data and http://kb.tableausoftware.com/articles/knowledgebase/preparing-excel-files-analysis, which suggest that the best data format for Tableau is a list. I understand that this provides Tableau with the most flexibility.  However, I have run into a situation in which having the data in strict list format is causing some limitations for me in how I am able to visualize the data.  I have attached an excel spreadsheet that demonstrates the structure that the data is currently in (1st Tab), as well as an alternative structure that I have found more flexible in Tableau.  The issue is that we need to do table calculations on the data.  For instance, we need to figure out our percent to goal for units, which involves dividing our total actual units by our total goal units.  I can't figure out how to do this calculation in Tableau with the data in it's current form.  Please help me figure out the best format for my data so that we can move forward with this proof of concept and purchase Tableau Server.


                          First Response from Tableau:

                          Thank you for contacting Tableau User Support.


                          Accomplishing the calculation with the first data source is possible, but performing calculations on data in this format becomes much more limiting and difficult than the proposed format. While a strict list format is often more helpful for visual displays of data, the same thing can be accomplished by dragging and dropping more fields onto the columns or rows. When performing calculations on those values, your proposed format will generally be easier to work with.


                          In the proposed format, such a calculation is easily accomplished by utilizing a calculated field. This calculation can be created by right-clicking anywhere on the left toolbar and selecting the 'Create Calculated Field' option. Then type the following formula into the Formula box:


                          SUM([Units Actual])/SUM([Units Goal])


                          Once this has been accomplished, click 'OK'. Drag the new calculated field anywhere onto the worksheet to see a bar or numeric output of the value. The data will default to a decimal value. To fix this and see the desired percentage, right-click your calculated field, select 'Field Properties', and click 'Number Format'. Select 'Percentage' from the list, click 'OK', and the number/bar should update accordingly.


                          Attached is a packaged workbook which demonstrates this formula, along with a few views which juxtapose the two data types with different visual formats.


                          Follow up question for Tableau Tech Team:

                          Thank you for your response Ben. 


                          Two follow up questions:


                          1. Are there limitations to the data in the proposed format that have the potential to give us trouble down the line?  This is important because it is time-intensive for us to change the data structure.  Because we are proposing a change we want to make sure we're giving the data the ideal structure to make it the most flexible for Tableau, beyond the example I gave you about percentage of actuals to goal.
                          2. Is there a rule of thumb when deciding between table wide and table deep data structures? Does Tableau provide any guidance that can help us make these decisions on our own in the future?


                          Second response from Tableau:

                          I have discussed the situation with some other consultants, and we have come to the following conclusions.


                          1. While a few limitations exist with the proposed format, most things are easily calculable. An example of the kind of calculations not possible with data in this format is calculations similar to these: http://kb.tableausoftware.com/articles/knowledgebase/survey-data-grid-questions. However, I would actually warn against the new proposed structure simply because this can easily become less manageable ( imagine if the budgets increase by 10x).  Instead, ideally I would suggest a hybrid structure, where goals and actual have separate columns, but where each budget is still defined as separate rows.
                          2. A good rule of thumb is that databases typically work better with long data rather than wide data, and as Tableau is structured more or less as a database tool, we try to replicate that experience. .  However, each column of measures should be defined on its own scale (thus not mixing actual and budgets). One of the demo datasets that comes with Tableau (Sample – Coffee Chain) is actually a good example of this structure. If you are trying to accomplish calculations using counting or running totals, something like the original format is best, but when trying to perform something similar to the Percent Goal to Units, the proposed format is best. It is important to point out that this calculation and the running total/counting calculations can all be accomplished in either row. For the current question, try using the following formula – if it works for you, it may save some time that would be spent manipulating data:


                          SUM(IF [Summary Type]=”Actual” THEN [Total Units] END)/SUM(IF [Summary Type]=”Goal” THEN [Total Units] END)

                          1 of 1 people found this helpful