1 2 Previous Next 16 Replies Latest reply on Jun 3, 2011 2:56 PM by Mihir Desu

    Computing Percent of Total using  Underlying Data

    Erin Easter

      Recently a friend of mine asked me for help with a problem that many of you may have encountered. She was analyzing data from a survey. One of the questions on the survey was, "Out of these 50 movies, which one is your favorite?"

      In her analysis she wanted to look at the top 10 movies based on number of respondents who chose it as their favorite. Then she wanted to see the percent of all respondents who chose each of these 10 as a favorite. For example, Shark Tales may be the number 3 favorite movie with 6% of all respondents choosing it as a favorite. Our first inclination was to create a filter that only includes the top 10 based on number of respondents, and then use a table calculation to compute the percent of total. The problem there is that table calculations are, like the name says, based on the values shown in the table--not the underlying data. The table calculation computed how many people chose each movie out of the people who chose one of those top 10 movies, not all respondents.

      So we had to get a little more tricky by creating a new calculated field.

      Here's how it works



      2. Let's say the view shows Movie Title on the Rows shelf and Count(Number of Records) on the columns shelf.


      4. First filter the move title field to show the top 10 by Count(Number of Records). You can do a top filter by dragging the field to the Filter shelf and then switch to the Top tab.


      6. Now instead of using a table calculation to get the percent of total, create a new calculated field by right-clicking the data window and selecting Create Calculated Field.


      8. Give the calculated field a name (something like Percent of Total).

      10. Type a formula like the one shown below (using your actual field and table names of course):


        COUNT([Number of Records])/RAWSQL_INT('Select COUNT(*) From "SHEET1$"')

        This formula uses a RAWSQL function, which means that the specified SQL query is simply passed to the data source and Tableau doesn't try to interpret it at all. You can probably figure out that the sql query in the above formula returns an integer that is simply a count of all the records. I should mention that the RAWSQL functions are not supported by local file data sources such as Excel and Text files. You can extract your data
        (select Data > Extract) from Excel or Text files so you can use these types of functions.


      12. Click the Check Formula button to make sure the calculation is valid and then click OK.


      14. Now, use the new field you have just created as the measure in your view. So the view would have Movie Title on the Rows shelf and AGG(Percent of Total) on the Columns shelf.


      16. Right-click the axis and select Format.


      18. Make sure the Axis tab is selected and then select Percentage as the Number format.

      I hope you find this useful. I've attached a packaged workbook with an example. You could use the same technique for any time you want to see the percent of total using the underlying data instead of the values shown in the table.

        • 1. Re: Computing Percent of Total using  Underlying Data
          Dan Gerena

          I don't see the packaged workbook?

          • 3. Re: Computing Percent of Total using  Underlying Data
            Andy Cotgreave

            Good post, Erin. We need to report on college surveys quite often, so this kind of thing may turn out to be useful.


            One problem though, I was exploring the workbook, and came up with the following issue/bug:


            1. Open the packaged workbook

            2. Click on Data... Extract...

            3. Tableau shows an error message saying "could not find referenced file" (the MovieSurvey.xls the original extract is based on) "Replace it with another file"

            4. Click No, and I get "An error occured while communicating...."

            5. I can only click OK

            6. Then I'm back at step 3 - error message about missing file.


            That leaves me stuck in a loop of dialogue boxes; I can only exit Tableau by killing it through Task Manager.

            • 4. Re: Computing Percent of Total using  Underlying Data
              Erin Easter

              So the workbook uses a data extract from an excel file. When I saved the workbook as a packaged workbook, only the Tableau Data Extract file gets included. When you select Data > Use Extract you are telling Tableau to use the original excel file instead of the extract file...which obviously cannot be found, thus the error message.

              I've attached the original excel file so you can take a look at it and see how it is formatted. If you connect to it instead of the extract file though, the RAWSQL function won't work because Excel does not support it.

              I hope this helps. Sorry I didn't attach the excel file first off.

              • 5. Re: Computing Percent of Total using  Underlying Data
                Richard Leeke

                I'm interested in the use of "Sheet1$".  It is clear from context that it means the underlying data source for the sheet - but I can't find any reference in the documentation to being able to do that.  Are there any other constructs like that to allow access to the context of the current sheet?


                The reason I ask is that I've been wrestling with trying to work out how to do a calculation which is similar to the problem in your movie example - in that the denominator in the expression needs to be aggregated over a wider set of underlying rows that the numerator - though in my case not over the whole table.  What I think I really need is access to the contents of the various shelves within a RAWSQL expression.


                I'll post a separate question with my specific problem to avoid hijacking this thread...

                • 6. Re: Computing Percent of Total using  Underlying Data
                  Dirk Karis

                  A clarification here: Sheet1$ is not a magic name, and it has nothing to do with the fact that the Tableau worksheet is named "Sheet 1". 


                  The underlying table in this case is an Excel worksheet, and the convention used by the SQL interface to Excel (we use the Jet engine to access Excel workbooks) is that each Excel sheet becomes a table named sheetname$.  Remember that this is Excel sheet name, not the name of the entire workbook, which may contain many sheets.


                  If you were connecting to some other data source, the table names would be different.  If you don't know what they are, you can see them

                  - Data->Connection->Edit...

                  - change the connection type to Custom SQL

                  - copy the contents of the text box to your favorite editor

                  - Be sure to hit Cancel so you don't actually change anything.

                  This won't work on packaged workbooks, unfortunately, because you can't edit the connection.

                  • 7. Re: Computing Percent of Total using  Underlying Data
                    Dirk Karis

                    One more way to get the table names, if you're comfortable reading SQL queries:

                    - Make your best guess as to what they are

                    - create the calculated field

                    - Use the calculated field on the view

                    - If it fails, you'll get an error dialog. Click "Show Details" and then "Show Query".  You should be able to find the tables name Tableau was actually using.

                    • 8. Re: Computing Percent of Total using  Underlying Data
                      Richard Leeke

                      Thanks for the clarification about Sheet1$ - it seems entirely obvious now you say it - I just hadn't noticed the datasource was a spreadsheet.

                      • 9. Re: Computing Percent of Total using  Underlying Data
                        Dirk Karis

                        You don't always need to change to Custom SQL to see the table names.  If it's on Multiple Tables, you will also see the table names.  If it's a SQL database, you see the correct names even on the Single Table page.  But for Excel, the Single Table page shows the sheet name (Sheet1) while the actual table name is slightly different (Sheet1$).

                        • 10. Re: Computing Percent of Total using  Underlying Data
                          guest contributor

                          What if you had to do the same thing for a dynamic date range?  how would you RAWSQL_INT change?

                          • 11. Re: Computing Percent of Total using  Underlying Data
                            Christine Capra

                            I realize this is an old post, but this is what I need.  I have followed the example and I get the error: table name not recognized. The Excel name is 'Master Live', so it should become 'Master Live$" if I go to the SQL statement as suggested, the table name of the Extract is TableauTemp, if I use this name the calculated field runs but I do not get the unfiltered total.

                            Any suggestions?


                            • 12. Re: Computing Percent of Total using  Underlying Data
                              guest contributor

                              This is something we have been looking to use for a while now.  Thank you so much for this post!

                              • 13. Re: Computing Percent of Total using  Underlying Data
                                Joe Mako

                                Sue, Just so you are aware, this is an old thread, and since the release of Tableau version 6, there are additional methods to perform this calculation, such as with custom table calculations like TOTAL().

                                • 14. Re: Computing Percent of Total using  Underlying Data
                                  Mihir Desu

                                  Do you know how to compute this calculation in Tableau 6 over a dynamic date range? I have a set of energy price data ranging from negative values to positive values for each hour in a few years (2007-2011).  I wanted to analyze the positive prices so I put a filter in that removed the negative prices.  I then multiplied the average energy price when the price was positive by the number of hours the price was positive in the given year.  This generates a revenue value for each year.  I want to find out the revenue generated per hour during a given year.  So I want to divide by the unfiltered number of hours in each year.  My problem is the data set for each year is different.  So I can't just divide by an integer like 8760, which is the number of hours in a year.


                                  Any help would be appreciated. Thanks in advance!     

                                  1 2 Previous Next