1 2 Previous Next 18 Replies Latest reply on Jun 9, 2014 2:05 PM by Keith Helfrich

    Brainstorming Multi-Select Filters: INTERSECTION instead of UNION

    Keith Helfrich

      A forum topic for brainstorming the various work-arounds that might be possible until the day that Tableau implements Idea #3375:

       

      Multi-Select Filters: INTERSECTION instead of UNION

       

      My new foray into making this work would hopefully be via R-Integration.  Can you think of a way to capture the vector of user-defined entries from a multi-select quick-filter and send these values into R ?

       

      Here is a sample data set. It simply maps fictitious bocce ball teams to keywords, each with a word count.  INDEX() is there, to show how many teams meet the quick filter criteria.

       

      The Problem:

      1. Begin by filtering for the keyword "3d"

      • Eleven bocce ball teams have a count for this keyword.

      2. Add "awesome" to the multi-select filter

      • Now the list grows to 13 (UNION logic)

       

      Objective:

      • Instead, we want for the list to shrink to 6 (INTERSECTION logic)

       


      I'll be curious to see what you come up with!

      Thanks!

      Keith Helfrich | LinkedIn

      Keith Helfrich | Twitter

        • 1. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
          Jonathan Drummey

          Hi Keith,

           

          This is pretty much what I was expecting. Two questions:

           

          1) Are you looking to have the keyword dimension in the view (i.e. on Rows, Columns, Pages, or the Marks Card)?

          2) Are you going to use Tableau's built-in Grand Totals?

           

          Jonathan

          • 2. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
            Keith Helfrich

            Hi Jonathan,

             

            The answer to both questions is "no":

              - neither is keyword in the view

              - nor will grand-totals be used

             

            One caveat, however, is that the number of keywords will be very large.  We could expect at least 100,000 of them, perhaps many more.  For this reason, among others, the duplicated dimension approach isn't very attractive (memory & performance problems from the Quick Filter).

             

            Any thoughts on how to capture the user provided vector and send these values into R ?

             

            Thanks!

            Keith Helfrich | LinkedIn

            Keith Helfrich | Twitter

            • 3. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
              Jonathan Drummey

              Hi Keith,

               

              I thought I had a brilliant solution using data blending last night, but in the cold light of the morning it's broken. That leads to several more questions, because there are still multiple other possibilities:

               

              3) Is the current performance for a Quick Filter with 100K values acceptable?

               

              4) How many teams at a time could theoretically be selected by this intersection filter? (I'm looking for an order of magnitude here, are you talking 10, 100, 1000, 1000000, etc.?)

               

              5) Could you be ok with a multi-step process, like choosing the filter criteria and then entering a parameter for the number of criteria selected, or choosing the filter criteria, then using those results to drive a Filter Action?

               

              6) What is the R code going to be doing?

              6a) The deeper question is can the R code accept Null or NA values?

               

              7) What is the goal for the final result?

               

              Jonathan

              • 4. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                Robert Sutter

                I found a way to do this if 2 are selected but got stumped after that.

                 

                I'll have to figure out a way to:

                 

                1. ignore this process if only one is selected

                2. repeat this for every combination of keywords selected

                 

                I don't know if this will help get anyone there...

                • 5. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                  Keith Helfrich

                  Hi Johnathan Robert, thanks for digging deeper.  This one is a nutshell to crack.  And I'm happy to have companions helping me to think through the puzzle pieces!

                   

                  To answer Jonathan's questions:

                   

                  3) Is the current performance for a Quick Filter with 100K values acceptable?

                  - not really

                  - if the INTERSECTION logic were driven from a single quick-filter (idea 3375), then perhaps

                   

                  - by duplicating the keyword dimension as a "hack-ey work-around" we're now loading an extra 100k words into memory for each additional keyword in the ven diagram

                  - requiring the intersection at least three keywords, this means 300k+ worth of quick-filter & Tableau chokes on memory

                   

                  An Available Compromise

                  Granted, 100k+ is a LOT of keywords.  And bigger data isn't always better.  So if the size of the keywords dimension were the only blocker, then as a compromise we could trim it down based on value.  e.g. limit the keyword dimension to:

                  - Top N keywords per bocce team (~1000)

                  - Only keywords in the top N quantile(s) / percentile for each team

                  - etc

                   

                  100k of them is nice to have.  It is holistic, but also has a very long tail.  Should the volume of keywords be the only blocker, then I can scale them back by "value-add" until a healthy compromise between performance & functionality is found.

                   

                  4) How many teams at a time could theoretically be selected by this intersection filter? (I'm looking for an order of magnitude here, are you talking 10, 100, 1000, 1000000, etc.?)

                   

                  At the moment, the total number of bocce teams is in the small-hundreds.  So, we're starting from a list of <500 teams and, by way of intersecting keywords, seeking to narrow those down to just a handful that meet the user's criteria.

                   

                  Remember the importance of "Show only Relevant Values".  If the users chooses keyword 1, then we would hope & expect the options for keyword 2 to be "only relevant values".  Wildcard matching is another, less attractive, but perhaps viable option.

                   

                  5) Could you be ok with a multi-step process, like choosing the filter criteria and then entering a parameter for the number of criteria selected, or choosing the filter criteria, then using those results to drive a Filter Action?

                   

                  Sure!  As long as it works & is practical.  Which means: "enter a few words & press the submit button",  Yea! 


                  But of course, anything that's kludgey or irritating will be met with resistance.

                   

                  To note: the very purpose of this dashboard is to narrow the list of bocce teams.  So to that end, a various parameterized filters are already in play.  That is: the dashboard  already has a variety of parameter-driven filters whose purpose is to narrow the list of bocce teams.  The aim now is to integrate this new ability to narrow by keyword, as well.

                   

                  6) What is the R code going to be doing?

                   

                  Compensating for the fact that idea 3375 isn't a reality =).  Without the ability to send the user's vector of keywords into R: I've not thought it through completely yet.  It doesn't make sense yet to invest a ton of time until I know that I can send the user's vector of keywords into R.

                   

                  Robert's blocker is a good example of something I would address with a data manipulation & programming environment like R: the ability to restructure, loop, & perform complex logic on the data.

                   

                  If not R, then the another option is the JavaScript API.  I lean towards R only because I'm familiar and I've not yet worked with the JSAPI.  I do know that some folks are using the JSAPI as a means to escape the QuickSand (quick filter) effect on performance.  So if that's the route, then I'm definitely interested to learn!

                   

                  One advantage of the JSAPI: processing is done client-side in the browser.

                   

                  Loosely, I envision the R-Integration would look something like this:

                    - send the user's vector of keywords into R

                    - perform the INTERSECTION logic

                    - return the list of bocce teams that meet the INTERSECTION criteria

                   

                  Given that the Tableau-R integration works as "vector in vector out", I would likely structure the response from R as a single string which gets repeated in the response for every row in the table.  I imagine that string to be composed of the "concatenated list of bocce teams which meet the user's keywords criteria".  And from here, a boolean filter in the dashboard should doo the trick as follows:

                   

                      [response from R] CONTAINS [bocce name]

                   

                  That would be one way to go about it.

                   

                  6a) The deeper question is can the R code accept Null or NA values?

                   

                  Sure!  If it has to.  Performance wise, it seems wasteful (perhaps prohibitive ?) to pass extremely long vectors into R composed mostly of NA's.  But if a few NA's exist, that's perfectly fine.  The first step in R can be to simply:


                  na.omit([vector received from Tableau])

                   

                  7) What is the goal for the final result?

                   

                  The goal is precisely what we're working with in the attached bocce_teams_INTERSECTION.twbx.  We want to begin with a list of <500 bocce teams and narrow that list down to just a handful, based on the INTERSECTION of keywords they have in common.

                   

                  Beyond performance and practicality, the only one obligatory criteria is to integrate this new functionality into a dashboard / data source that already performs similar "narrowing down" with parameters built upon other, quantitative values.

                   

                  Wow!  That was a lengthy response to your questioning.  And they are good questions!


                  Any & all input is greatly appreciated!

                  THANK YOU!

                   

                  Thanks!

                  Keith Helfrich | LinkedIn

                  Keith Helfrich | Twitter

                  • 6. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                    Jonathan Drummey

                    Thanks for the answers...it sounds like the use of R code is just to do the

                    intersection, correct? If so, I've come up with a solution that does the

                    intersection totally in Tableau, though I won't get a chance to apply it to

                    your code until late tonight or tomorrow.

                     

                     

                    On Wed, Jun 4, 2014 at 1:50 PM, Keith Helfrich <

                    • 7. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                      Keith Helfrich

                      epa!  That's great.  Yes, the call to R would be to filter the intersection (nothing more).  If it can be done in Tableau & performs + works well, then super!

                      • 8. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                        Jonathan Drummey

                        Hi Keith,

                         

                        Had a few minutes before my next meeting, here's a solution with a short

                        explanation, I've got plans to write more text about other paths I

                        explored. Here's how I broke down the problem:

                         

                        1) Select some keywords. The ideal would be something like Tableau's

                        multi-select Quick Filter, so why not use that?

                        2) Return the teams that have any the keywords. This is what Tableau's

                        Quick Filters do. One thing to know here is that the Tableau Quick Filter

                        only select keywords that are actually in the data.

                        3) Count how many keywords the team had (of the selected set of keywords).

                        This can be done using a data source that supports COUNTD using

                        COUNTD(keyword), that's the # of Keywords for Team Measure. I extracted the

                        data for this.

                        4) Count the total number of keywords selected. Because of #2, we know that

                        if we can get the distinct count of all keywords across every team, that

                        will be the total number of keywords selected. We get this using

                        TOTAL(COUNTD(keyword)) with a Compute Using on the bocce_team, the calc is

                        "# of Keywords Selected".

                        5) Create an Intersection Filter calc that is  >=

                        . Here's the workout view for that:

                         

                        6) Put that calc on the Filters Shelf. Here's the view:

                         

                         

                         

                         

                        And, just for fun, I set up visual dive in a dashboard. In this view

                        there's a treemap with a mark for each keyword. Users can click on one or

                        more keywords. This triggers a Filter Action to the other two worksheets,

                        one of them is set up as above, the other is using a variation of the

                        "Making a String List" calculation from

                        http://drawingwithnumbers.artisart.org/the-next-n-table-calculation/ to

                        make a list of all selections. I did a simple coloring of the treemap based

                        on the first letter of each keyword, there are many other things you could

                        do with this.

                         

                         

                         

                        Let me know if you have any questions!

                         

                        Jonathan

                         

                         

                        PS: The workbook was in Tableau 8.0. If you can, I'd suggest upgrading to

                        8.1, besides the announced features there were a bunch of performance

                        enhancements, including around quick filters, that might help you out.

                         

                         

                         

                        On Wed, Jun 4, 2014 at 2:57 PM, Keith Helfrich <

                        • 9. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                          Jonathan Drummey

                          One more thing:

                           

                          The way "Only relevant values" works, it's dependent on other filter

                          selection and application. The only way I can think of have only relevant

                          values would be to use the multiple keyword dimensions that you've ruled

                          out due to performance issues. If you were using a smaller set of keywords,

                          then you might be able to use the multiple keyword dimensions and have

                          acceptable performance. Depending on what you're trying to do, the kind of

                          view that I set up with the treemap might be an alternative, where a set of

                          views could be nested with filter actions going from one to the next.

                           

                          Jonathan

                           

                           

                           

                          On Wed, Jun 4, 2014 at 4:35 PM, Jonathan Drummey <jonathan.drummey@gmail.com

                          • 11. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                            Keith Helfrich

                            Yes.  Very  nice work.  Thank you!  This is a good demonstration of what it takes to be an expert in Tableau: the secret sauce is the ability to dissect the data available, using the tools available to build new data components which themselves can be used as tools, and then building upon those to reach your objective.

                             

                            i.e. "often"

                            - you can do anything in Tableau

                            - when you think like MacGyver =)

                             

                            Thanks, Jonathan!  I'm looking forward to catching up on my reading in the articles you've linked to, and also putting this into play on the larger data set to see how it performs.  Should anything interesting come up, I will certainly let you know.

                             

                            Thanks!

                            Keith Helfrich | LinkedIn

                            Keith Helfrich | Twitter

                            • 12. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                              Jonathan Drummey

                              You're welcome!

                               

                              You touch on a subject that I've been thinking a lot about: A continuum of

                              working with Tableau is exploring vs. composing. On the exploratory side,

                              we’re dragging and dropping pills to see what happens. On the composing

                              side, we have a worksheet or dashboard in mind and are trying to work with

                              Tableau to reach the desired result. When composing, I find that I have a

                              different workflow:

                               

                              1. Understand the goal.

                              2. Understand the data.

                              3. Conceive of how to best arrange the data & view to meet the goal (given

                              my knowledge of Tableau and working with data).

                              4. Do whatever data transformation that needs to be done outside of

                              Tableau, and build whatever calculated fields need to be build in Tableau.

                              5. Drag and drop pills to build the view.

                               

                              So where an exploratory viz often starts with dragging and dropping pills,

                              a composed viz starts with more research and ends with dragging and

                              dropping pills. My forum and blog posts and the book I'm writing are very

                              much about helping Tableau users grow their knowledge of #3.

                               

                              Jonathan

                               

                               

                               

                               

                              On Wed, Jun 4, 2014 at 7:13 PM, Keith Helfrich <

                              • 13. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                                Keith Helfrich

                                Hi Jonathan,

                                 

                                It's true.  The exploring vs. composing continuum is a source of disappointment for many.  They get their hopes up & are later disillusioned by how complex Tableau work can really be.  You've described the scenario very acutely with your post about the letdown.  It's one I've referenced often with clients, and is a scenario I've observed many times (in myself, and in others).  The frustrations are real.

                                 

                                One note about the solution you've provided so far, I do find one small problem.  The logic breaks with no keywords in the filter.  When there's no filter, then we would expect the dashboard to show all teams.

                                 

                                But instead of going to zero, the calculation for [# of Keywords Selected] actually skyrockets upward to the total distinct count of all keywords across the data set.  And this, in turn, breaks the [INTERSECTION] calculation.  Result being: no teams are shown.

                                 

                                I'm only now taking a look to see what can be done, but it occurs to me the answer may be to produce yet another boolean to evaluate whether [# of Keywords Selected] is equal to TOTAL(COUNTD([keyword])) across the entire population (regardless of any filtering).  And to then combine this new BOOLEAN with the INTERSECTION to drive the filtering.

                                 

                                To reach this new boolean I'm thinking of, it is necessary to maintain percent of total whilst filtering, for which I often choose the duplicated data source approach.  It's do-able, but clunky.  So I wonder: can you think of any other way to make the vis work correctly when there's no filter at all ?

                                 

                                Thanks!

                                Keith Helfrich | LinkedIn

                                Keith Helfrich | Twitter

                                • 14. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
                                  Pankaj Chopra

                                  Hi Guys

                                   

                                  First of all thanks a lot for this post. Its an awesome post and solution for what looks like a simple problem.

                                   

                                  But I have a follow up problem. Lets say we also add another field to the original bocce_teams file called Category and each booce_team has a category that is its a one to one relationship. I added a column Category to the excel file. You can consider Category as one of the attributes of a team for instance.

                                   

                                  Now I not only want to select multiple keywords ( get INTERSECTION result) but I also want to add a Category value quick filter (  select one value or multiple values ) - note that in this case I want the filter to behave like an OR - It cannot be an AND anyways since its a one to one relationship with the team.

                                   

                                  So for the solution posted which resulted in 6 teams team for keywords 3d and awesome

                                  it should only show me those 6 categories ( when i hit relevant values) corresponding to the teams which I obtained but it doesnt. This confuses the end user.

                                   

                                  Any ideas on how to incorporate this?

                                  1 2 Previous Next