10 Replies Latest reply on Aug 12, 2013 9:57 AM by Jonathan Drummey

    Adding up only the unique records (text)


      I've searched all over the forum but can't seem to find a solution. Seems simple enough so many I'm missing it. I just need to add up all the unique records of a text column. However, I don't want to use the cntd which requires me to use an extract. I'm connecting to a live excel spreadsheet.


      Any ideas or quick solutions?

        • 1. Re: Adding up only the unique records (text)
          Alex Kerin

          What are you going to do with the number? If it's not being used in a calc somewhere else you could just use the pill on the rows dimension and then look at the "rows" number on the bottom. Otherwise you may have to calculate it in Excel and use a max calc to pull it in as shown.


          Depending on your data you may have to do a little rounding.

          • 2. Re: Adding up only the unique records (text)

            I was trying to put something in the caption that describtes the chart in a different way. So,I'm creating a chart that has systems and downtime on a moving 12 month period. So, in the caption I have dynamic text that says "There has been XXX hours of downtime across X number of systems in the last 12 months. So, It's the X that I'm trying to populate. If I use just count or sum then I get duplicates and not the unique number of systems.


            Now, this is just one way I'm using the data. There are a few others but all related.

            • 3. Re: Adding up only the unique records (text)
              Jonathan Drummey

              You can probably use the SIZE() table calculation to get that number, however the exact compute using would depend on your view.

              • 4. Re: Adding up only the unique records (text)
                Alex Kerin

                I always forget about size() despite the fact I use it fairly often

                • 5. Re: Adding up only the unique records (text)

                  Can someone elaborate on size? I have never used that function. Sounds like it only returns the number of records in that query.



                  • 6. Re: Adding up only the unique records (text)
                    Bryan Sheasby

                    CountD([fieldName]) should work with a live database connection however I see it does not work with a live excel connection.


                    You could do it with custom SQL. It would look kind of like this.


                    SELECT COUNT([Field1]) AS Field1Distinct

                    FROM [Sheet$] GROUP BY [Field1]


                    Then create a calc to count Field1Distinct in Tableau

                    • 7. Re: Re: Adding up only the unique records (text)
                      Alex Kerin

                      size is a table calculation, so the way that you set the partitioning changes the way it works. See some examples attached.

                      • 8. Re: Adding up only the unique records (text)
                        Jonathan Drummey

                        SIZE() is a table calculation. Table calculations are a suite of functions that are performed after data has been aggregated by the dimensions in the view. They let us perform aggregations of the aggregations, and aggregations at different levels, and can avoid the need for much (but not all) SQL hacking.


                        The table calculation documentation is full of references to "rows in the partition", but these are not necessarily rows of data, and that can be confusing. A "row in the partition" refers to an address created by the combination of values of dimension(s) used for addressing a table calculation. The addressing dimension(s) are assigned either by the Compute Using context menu, or within the table calculation dialogs.


                        So, if I have Department from Superstore Sales (Extract) as the only dimension in the view, and the Compute Using is set to Department, there is one partition and three addresses (or rows in the partition), even though the query is returning 16,798 records from the data source:


                        2013-08-12 11_25_14-Tableau - Book5.png


                        So, in your case, you could potentially use SIZE() addressing on the systems, but I can't say exactly without having more details on the configuration, such as a packaged workbook with sample data.



                        • 9. Re: Re: Adding up only the unique records (text)

                          Thanks. Here is a sample file I created. It has only a few records.

                          • 10. Re: Re: Re: Adding up only the unique records (text)
                            Jonathan Drummey



                            See the attached.


                            I'm presuming that for any given "Discovered Date & Time" there could be more than one System Service Name, if that's not the case then this would be a lot easier to set up. Given that assumption, this is one case where using Custom SQL or switching data sources (to a Tableau Data Extract, or something that supports COUNTD()) could be easier.


                            The reason why is that in order for the SIZE() table calc to work, then the System Service Name has to be in the view (on Rows or Columns). Since it's a dimension (that could return multiple results per date & time per the above assumption), that increases the level of detail in the view and all other calculations have to be altered to deal with that, so there are several calculations:


                            - # of Systems with formula SIZE(). This will have an Advanced Compute Using of System Service Name, Discovered Date & Time (in that order) At the Level System Service Name. This gets the total number of systems, and effectively ignores the Discovered Date & Time. (At the Level is more complicated than that, this is a simplification).

                            - Downtime for Display with formula IF FIRST()==0 THEN TOTAL(SUM([Downtime])) END. This has an Advanced Compute Using of Discovered Date & Time, System Service Name (in that order), Restarting Every Discovered Date & Time. Because System Service Name is in the view and increasing the level of detail, Tableau would be generating marks for all combinations of the date/system. To simplify that, we first need to get an accurate total per date (that's the TOTAL part) and reduce the number of results to just one per date (that's the IF FIRST()==0 part).

                            - Total Downtime with formula TOTAL(SUM([Downtime])). This has an Advanced Compute Using on Discovered Date & Time, System Service Name and generates a total downtime.

                            - Index with formula INDEX(). This has an Advanced Compute Using on Discovered Date & Time, System Service Name (in that order), At the Level Discovered Date & Time. Because we've got more detail in the view, we need to tell Tableau what order to draw the line segments in, the Index calc will end up on the Path Shelf.


                            You can see all the calcs in the workout tab. This is normally how I work with table calculations, putting all my dimensions on Rows and then creating a Measure Names/Values table to work out the calcs & the compute usings. Once I have that working, then I can duplicate the worksheet and drag the pills around to create the view.


                            In this case, the View worksheet has a couple of extra steps beyond just moving pills around:


                            - System Service Name moves to the level of detail shelf, so It's still in the view but not showing up. I also turn off the tooltip because this can be confusing.

                            - So does the blue pill (discrete) Discovered Date & Time. This is necessary because of the way At the Level behaves in Tableau, it starts getting wonky if we just use the green pill (continuous) date.

                            - Put a copy of Discovered Date & Time as a green pill on Columns, as in the original view.

                            - A copy of the Downtime for Display pill is on the Filters shelf to remove Null values. If you take it off the Filters Shelf, you will see a "40 nulls" warning in the lower right. This is because of Tableau's densification behavior, where Tableau is trying to think ahead and be ready to generate additional computations, in this case it's not needed.

                            - I put the # of Systems and Total Downtime in the title, they could also go in the caption.

                            Hope this helps!