1 2 Previous Next 27 Replies Latest reply on Oct 26, 2016 4:40 PM by Dipanjan Sen

    Sorting Blended Data

    Greg Paul

      I have a primary data source and secondary the are linked by sap id.  The secondary source has location information (like state) and the primary source has the measure I'm interested in.  It doesn't appear to be possible to sort by the measure in this case (the sort buttons are disabled).  Is there a way to do it while still using a blend? 

       

      Thanks,
      Greg

        • 1. Re: Sorting Blended Data
          Jonathan Drummey

          This is hack-y, but it works: Create a calculated field based on the measure that would return the right alphanumeric sort, such as -SUM([Sales]) for a descending sum of Sales, then put that as a Discrete (blue) pill to the left of the dimension you want to sort, and finally turn off Show Headers for the -SUM([Sales]) header.

           

          Jonathan

          12 of 12 people found this helpful
          • 2. Re: Sorting Blended Data
            Greg Paul

            Creative, nice.  Points for you!

            • 3. Re: Re: Sorting Blended Data
              Zach Leber

              My primary data source contains sales quantities and my blended secondary data source contains a list of prices. I want to sort by total sales value, where value = quantity * price. I had given up until I discovered this thread yesterday, which I extended a bit in the attached example. Thank you Jonathan for the technique which Tableau is now recommending in this quick fix article: Sort Options Not Available from Toolbar When Data Blending | Tableau Software. There are still plenty of limitations, so I've voted to make this a fully supported feature: Secondary datasets: Sets, Rank, Sort, Polygon/Line maps, Lat/Long (generated), and relationships between multiple secondary data connections.

              blended sort.png

              • 4. Re: Re: Re: Sorting Blended Data
                Jonathan Drummey

                Hi Zach,

                 

                I'm glad the technique was useful, and I also wish we didn't need it!

                 

                FYI, in the workbook you posted the Sales (sort) function was not correctly adding the sales for Bill together, so his sort value was coming out as -1000 and not -2000. This is because Model is on the Color Shelf, so it was due to the data set that both of Bill's sales totaled to 1000. To get the -2000 need to use a table calc to deal with the increased level of detail in the view. That can be done by wrapping the Sales calc in a WINDOW_SUM() with a Compute Using of Model. I set this up in the attached 8.1 workbook.

                 

                In some cases, the process of creating a sorted view can be easier in Tableau 8.1 using the RANK function, which can take measures from secondary sources and even table calculations as arguments. However, that doesn't really help in this case because you already need a table calc to get the sort, and nesting table calcs doesn't make life simpler.

                 

                Jonathan

                • 5. Re: Re: Re: Re: Sorting Blended Data
                  althea.chia

                  Hi Jonathan,

                   

                  I've been trying to use the rank function, but I'm having trouble letting it allow me to sort properly. Do you mind taking a look at this sample book that uses dummy data?

                   

                  Basically, I have two sources of data: the primary one has sales and order date, while the secondary has metropolitan area name, and these two sources are linked by the zip code. For the output, I want sales broken down by year and metro area. I was able to figure that out, but I can't seem to sort by sales WITHIN each respective year for the metro areas WITHIN the table. Rather, what happens is that it creates new rows with my data underneath the existing table.

                   

                  I'm a Tableau new-bie, but this part really frustrates me because if this were in Excel I could just sort on each individual year really easily. I've attached a packaged workbook - please take a look!

                   

                  Thanks!

                  Althea

                  • 6. Re: Re: Re: Re: Re: Sorting Blended Data
                    Jonathan Drummey

                    Hi Althea,

                     

                    Excel lets us do all sorts of things that Tableau has a harder time doing, and vice versa, so experiencing some frustration is natural as you're learning Tableau.

                     

                    You'd said in your original post that you wanted to sort sales within each respective year, but didn't say how you would want that view to look, and then in the provided workbook you had a parameter to choose the year and a crosstab. So I went with the workbook.

                     

                    The fundamental issue here is that Tableau won't give us an option to quick sort when the view is using the blended CBSA Name field, so we have to build our own ranking as a discrete field for sorting the view.

                     

                    In a sense, you are correct that Tableau is adding extra data, it's creating a cell for every combination of CBSA Name and Year due to the pill layout, this is known as domain completion. When building our own ranking, we have to make sure the discrete field is returning the same value for every combination of CBSA Name and Year, or else that discrete will mess up the view. Here's an example, using a screenshot of your workbook with Rank and CBSA Name swapped, so we can see that the Rank is returning the correct rank for the chosen year for those CBSA names that have values, and Null for CBSA/Year combinations that don't exist in the data:

                     

                    Screen Shot 2014-01-25 at 8.17.58 AM.PNG.png

                     

                    So how do we get the Ranking to return only 4 for LA-Long Beach? If we just filter out the Null values for Rank, that won't work because we'll get rid of all values except for the selected year:

                     

                    Screen Shot 2014-01-25 at 8.21.41 AM.PNG.png

                     

                    The rank needs to be padded out to make this work, to return the same Rank for every Year for each CBSA. Here's how I went about this:

                     

                    1) Changed the original Sort Field into the following calc that has a compute using on the Order Date, so it partitions for each CBSA.

                     

                    PREVIOUS_VALUE(WINDOW_MAX(SUM(IF YEAR([Order Date]) == [Year Parameter] THEN
                        [Sales]
                    ELSE
                        0
                    END)))
                    

                     

                    The IF statement returns the sales for the selected Year or 0. The 0 is necessary because the RANK functions only work with non-Null values, and I'm imagining that you want to make sure the CBSAs with no sales are at the bottom of the list. With the Compute Using on the Order Date, the WINDOW_MAX will then return that Sum of Sales for the selected Year to every Year for each CBSA. The PREVIOUS_VALUE is an optimization so this calc is only computed once for each CBSA.

                     

                    2) Then, that calc is used as a nested table calc within the Rank calculation. The Rank calculation has an Advanced Compute Using on the CBSA, with the nested Sort Field still having the Compute Using on Order Date:

                     

                    Screen Shot 2014-01-25 at 8.54.11 AM.PNG.png

                     

                    This is one route to get to this result, there are others I can conceive of that wouldn't necessarily require table calculations but would involve duplicated data sources and more complicated filtering.

                     

                    Does this answer your question?

                     

                    Jonathan

                    • 7. Re: Sorting Blended Data
                      Shawn Wallwork

                      Great explanation Jonathan!

                       

                      --Shawn

                       

                      (And a perfect example of what is 'wrong' with table calculations, in my opinion.)

                      • 8. Re: Sorting Blended Data
                        Jonathan Drummey

                        Thanks!

                         

                        Can you describe more about what you see as "wrong"? I'm curious.

                         

                        Jonathan

                        • 9. Re: Sorting Blended Data
                          Shawn Wallwork

                          I suppose it's a matter of complexity and opacity. Guys like you can squeeze almost anything out of Table Calculations using/understanding 'domain completion', 'domain padding', 'nested table calculations', 'performance tuning', etc. But for us mere mortals it is incredibly frustrating to trail & error our way through the almost infinite possible combinations just to get to an acceptable sort. Your detailed explanations over the years have certainly helped me cut down on the trail and error cycles, so thanks for that, but I am constantly feeling like "there has to be a better way".

                           

                          On the one hand Tableau is an incredibly easy tool to use, allowing the non-technical, non-programmers, non-analysis to explore their data and gain useful insights. Then these same people want to do something 'simple' like a sort, and bang they hit the Table Calculation brick wall -- and, as you're fond of saying, I wonder if the "juice is really worth the squeeze".

                           

                          I suspect Tableau has painted themselves into the proverbial corner when it comes to Table Calcs and there's just no way to back out now. Hope I'm wrong about this, a simple-to-use, simple-to-understand user-interface for Table Calcs would jump Tableau light years ahead.

                           

                          That's what I meant by 'wrong'.

                           

                          --Shawn

                          • 10. Re: Re: Re: Re: Re: Sorting Blended Data
                            althea.chia

                            Thank you so much Jonathan!

                             

                            I will try this solution later today and see if it works on my data, but from the screenshots you provided it looks like it will do the trick. Two questions: is there a difference in using = vs. == in your calculated field? And from perusing the forums for a sort solution, I came upon one where someone suggested creating a year parameter to sort on the year. Do you think the parameter was necessary for the solution, or was there a simpler way to go about this?

                             

                            Althea

                            • 11. Re: Re: Re: Re: Re: Sorting Blended Data
                              Jonathan Drummey

                              Hi Althea,

                               

                              In Tableau, == and = are equivalent. In some languages, == is the comparison operator, i.e. "Does a equal b?", and = is the assignment operator, i.e. "Set a equal to b". That's what I first learned, so I just stick with that.

                               

                              As for the parameter, it depends on the desired view. If you want to display all cities/all years and sort cities by a chosen year, then you're going to need some user-selected city and the three ways I know of to get that are to use a parameter, a blended data source with a filter, or a data source with a cross product. If you want to do something like a heatmap ranking, then you don't need the parameter at all.

                               

                              Jonathan

                              • 12. Re: Re: Re: Re: Re: Sorting Blended Data
                                Jonathan Drummey

                                Hi Shawn,

                                 

                                I really do agree with your point about table calcs being too hard. And (and I really mean "and" instead of "but"), the fundamental issue in this case is one of sorting, and we can't lose sight of that.

                                 

                                Speaking of which, one potential solution to this that I initially missed is that if the data sources could be combined into one, then the sorting issue would go away and the table calcs wouldn't need to be used at all because the CBSA Name could be sorted on a regular aggregate calculation like:

                                 

                                SUM(IF YEAR([Order Date])==[Year Parameter] THEN [Sales] ELSE 0 END)

                                 

                                The issue that required the nested table calc is a case of sparse data (partially generated by the original Sort Field calc), the dimensions in the view, and how Tableau lays out crosstabs. This is basic nuts and bolts of how Tableau works that is not well-covered in Tableau training or documentation (yet, I'm trying to write it). What I've been (slowly) realizing, and this thread has helped me to a deeper level of understanding, is just how foundational those "nuts and bolts"*** are. What lets others and me do "wizardry" with table calcs is that (at varying levels) we either consciously or intuititively "get" what Tableau is doing and can then manipulate Tableau's output as we need.


                                I might be overly optimistic, but I do think that "Table Calculation brick wall" that you referred to can be more of a "staircase" if I/we take some time to create ways to transmit the understanding that folks like me have.


                                Jonathan



                                *** By "nuts and bolts" I mean: understanding granularity of the data and of the view, the impact of dimensions and measures on the view, what is meant by "the view", how Tableau lays out crosstabs, sparse data, the basics of Tableau's order of operations, the four pill types (the combinations of dimension/measure and discrete/continuous), and the three levels of calculation (row-level, aggregate, and table calc).

                                • 13. Re: Re: Re: Re: Re: Sorting Blended Data
                                  althea.chia

                                  Thanks again for the clarification Jonathan!

                                   

                                  I finally came around to implementing your suggestions and they worked perfectly! Now I'm trying to rank and sort across year and CBSA, but instead of using the sum of sales, I wanted to do it based on the distinct count of unique sellers in each CBSA. When I replaced the "SUM" in the Sort FIeld formula with COUNTD, I had to remove the "ELSE 0" in order for the calc field to be valid (or it gives the error "Expected Type String, Found Integer. Result types from "IF" statements much match). However, this no longer replaced the NULLs w/ 0 and that means the cells w/ NULLs are now at the top of the table instead of the bottom.

                                   

                                  Here's the original calculation:

                                  PREVIOUS_VALUE(WINDOW_MAX(SUM(IF YEAR([Order Date]) == [Year Parameter] THEN

                                      [Sales]

                                  ELSE

                                      0

                                  END)))

                                   

                                  Do you have any way to fix this? Thanks for all your help!

                                  • 14. Re: Re: Re: Re: Re: Sorting Blended Data
                                    Jonathan Drummey

                                    What dimension are you trying to count, exactly? If you're doing a COUNTD() of Sales, that's only going to count the distinct values of of the Sales measure.

                                    1 2 Previous Next