-
1. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Jonathan DrummeyJun 3, 2014 2:07 PM (in response to Keith Helfrich)
Hi Keith,
This is pretty much what I was expecting. Two questions:
1) Are you looking to have the keyword dimension in the view (i.e. on Rows, Columns, Pages, or the Marks Card)?
2) Are you going to use Tableau's built-in Grand Totals?
Jonathan
-
2. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Keith Helfrich Jun 3, 2014 3:23 PM (in response to Jonathan Drummey)Hi Jonathan,
The answer to both questions is "no":
- neither is keyword in the view
- nor will grand-totals be used
One caveat, however, is that the number of keywords will be very large. We could expect at least 100,000 of them, perhaps many more. For this reason, among others, the duplicated dimension approach isn't very attractive (memory & performance problems from the Quick Filter).
Any thoughts on how to capture the user provided vector and send these values into R ?
Thanks!
-
3. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Jonathan DrummeyJun 4, 2014 4:57 AM (in response to Keith Helfrich)
Hi Keith,
I thought I had a brilliant solution using data blending last night, but in the cold light of the morning it's broken. That leads to several more questions, because there are still multiple other possibilities:
3) Is the current performance for a Quick Filter with 100K values acceptable?
4) How many teams at a time could theoretically be selected by this intersection filter? (I'm looking for an order of magnitude here, are you talking 10, 100, 1000, 1000000, etc.?)
5) Could you be ok with a multi-step process, like choosing the filter criteria and then entering a parameter for the number of criteria selected, or choosing the filter criteria, then using those results to drive a Filter Action?
6) What is the R code going to be doing?
6a) The deeper question is can the R code accept Null or NA values?
7) What is the goal for the final result?
Jonathan
-
4. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Robert Sutter Jun 4, 2014 5:43 AM (in response to Keith Helfrich)I found a way to do this if 2 are selected but got stumped after that.
I'll have to figure out a way to:
1. ignore this process if only one is selected
2. repeat this for every combination of keywords selected
I don't know if this will help get anyone there...
-
bocce_teams_INTERSECTION.twbx 54.3 KB
-
-
5. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Keith Helfrich Jun 4, 2014 10:56 AM (in response to Jonathan Drummey)Hi Johnathan Robert, thanks for digging deeper. This one is a nutshell to crack. And I'm happy to have companions helping me to think through the puzzle pieces!
To answer Jonathan's questions:
3) Is the current performance for a Quick Filter with 100K values acceptable?
- not really
- if the INTERSECTION logic were driven from a single quick-filter (idea 3375), then perhaps
- by duplicating the keyword dimension as a "hack-ey work-around" we're now loading an extra 100k words into memory for each additional keyword in the ven diagram
- requiring the intersection at least three keywords, this means 300k+ worth of quick-filter & Tableau chokes on memory
An Available Compromise
Granted, 100k+ is a LOT of keywords. And bigger data isn't always better. So if the size of the keywords dimension were the only blocker, then as a compromise we could trim it down based on value. e.g. limit the keyword dimension to:
- Top N keywords per bocce team (~1000)
- Only keywords in the top N quantile(s) / percentile for each team
- etc
100k of them is nice to have. It is holistic, but also has a very long tail. Should the volume of keywords be the only blocker, then I can scale them back by "value-add" until a healthy compromise between performance & functionality is found.
4) How many teams at a time could theoretically be selected by this intersection filter? (I'm looking for an order of magnitude here, are you talking 10, 100, 1000, 1000000, etc.?)
At the moment, the total number of bocce teams is in the small-hundreds. So, we're starting from a list of <500 teams and, by way of intersecting keywords, seeking to narrow those down to just a handful that meet the user's criteria.
Remember the importance of "Show only Relevant Values". If the users chooses keyword 1, then we would hope & expect the options for keyword 2 to be "only relevant values". Wildcard matching is another, less attractive, but perhaps viable option.
5) Could you be ok with a multi-step process, like choosing the filter criteria and then entering a parameter for the number of criteria selected, or choosing the filter criteria, then using those results to drive a Filter Action?
Sure! As long as it works & is practical. Which means: "enter a few words & press the submit button", Yea!
But of course, anything that's kludgey or irritating will be met with resistance.
To note: the very purpose of this dashboard is to narrow the list of bocce teams. So to that end, a various parameterized filters are already in play. That is: the dashboard already has a variety of parameter-driven filters whose purpose is to narrow the list of bocce teams. The aim now is to integrate this new ability to narrow by keyword, as well.
6) What is the R code going to be doing?
Compensating for the fact that idea 3375 isn't a reality =). Without the ability to send the user's vector of keywords into R: I've not thought it through completely yet. It doesn't make sense yet to invest a ton of time until I know that I can send the user's vector of keywords into R.
Robert's blocker is a good example of something I would address with a data manipulation & programming environment like R: the ability to restructure, loop, & perform complex logic on the data.
If not R, then the another option is the JavaScript API. I lean towards R only because I'm familiar and I've not yet worked with the JSAPI. I do know that some folks are using the JSAPI as a means to escape the QuickSand (quick filter) effect on performance. So if that's the route, then I'm definitely interested to learn!
One advantage of the JSAPI: processing is done client-side in the browser.
Loosely, I envision the R-Integration would look something like this:
- send the user's vector of keywords into R
- perform the INTERSECTION logic
- return the list of bocce teams that meet the INTERSECTION criteria
Given that the Tableau-R integration works as "vector in vector out", I would likely structure the response from R as a single string which gets repeated in the response for every row in the table. I imagine that string to be composed of the "concatenated list of bocce teams which meet the user's keywords criteria". And from here, a boolean filter in the dashboard should doo the trick as follows:
[response from R] CONTAINS [bocce name]
That would be one way to go about it.
6a) The deeper question is can the R code accept Null or NA values?
Sure! If it has to. Performance wise, it seems wasteful (perhaps prohibitive ?) to pass extremely long vectors into R composed mostly of NA's. But if a few NA's exist, that's perfectly fine. The first step in R can be to simply:
na.omit([vector received from Tableau])
7) What is the goal for the final result?
The goal is precisely what we're working with in the attached bocce_teams_INTERSECTION.twbx. We want to begin with a list of <500 bocce teams and narrow that list down to just a handful, based on the INTERSECTION of keywords they have in common.
Beyond performance and practicality, the only one obligatory criteria is to integrate this new functionality into a dashboard / data source that already performs similar "narrowing down" with parameters built upon other, quantitative values.
Wow! That was a lengthy response to your questioning. And they are good questions!
Any & all input is greatly appreciated!
THANK YOU!
Thanks!
-
6. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Jonathan DrummeyJun 4, 2014 11:07 AM (in response to Keith Helfrich)
Thanks for the answers...it sounds like the use of R code is just to do the
intersection, correct? If so, I've come up with a solution that does the
intersection totally in Tableau, though I won't get a chance to apply it to
your code until late tonight or tomorrow.
On Wed, Jun 4, 2014 at 1:50 PM, Keith Helfrich <
-
7. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Keith Helfrich Jun 4, 2014 11:57 AM (in response to Jonathan Drummey)epa! That's great. Yes, the call to R would be to filter the intersection (nothing more). If it can be done in Tableau & performs + works well, then super!
-
8. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Jonathan DrummeyJun 4, 2014 1:35 PM (in response to Keith Helfrich)
Hi Keith,
Had a few minutes before my next meeting, here's a solution with a short
explanation, I've got plans to write more text about other paths I
explored. Here's how I broke down the problem:
1) Select some keywords. The ideal would be something like Tableau's
multi-select Quick Filter, so why not use that?
2) Return the teams that have any the keywords. This is what Tableau's
Quick Filters do. One thing to know here is that the Tableau Quick Filter
only select keywords that are actually in the data.
3) Count how many keywords the team had (of the selected set of keywords).
This can be done using a data source that supports COUNTD using
COUNTD(keyword), that's the # of Keywords for Team Measure. I extracted the
data for this.
4) Count the total number of keywords selected. Because of #2, we know that
if we can get the distinct count of all keywords across every team, that
will be the total number of keywords selected. We get this using
TOTAL(COUNTD(keyword)) with a Compute Using on the bocce_team, the calc is
"# of Keywords Selected".
5) Create an Intersection Filter calc that is >=
. Here's the workout view for that:
6) Put that calc on the Filters Shelf. Here's the view:
And, just for fun, I set up visual dive in a dashboard. In this view
there's a treemap with a mark for each keyword. Users can click on one or
more keywords. This triggers a Filter Action to the other two worksheets,
one of them is set up as above, the other is using a variation of the
"Making a String List" calculation from
http://drawingwithnumbers.artisart.org/the-next-n-table-calculation/ to
make a list of all selections. I did a simple coloring of the treemap based
on the first letter of each keyword, there are many other things you could
do with this.
Let me know if you have any questions!
Jonathan
PS: The workbook was in Tableau 8.0. If you can, I'd suggest upgrading to
8.1, besides the announced features there were a bunch of performance
enhancements, including around quick filters, that might help you out.
On Wed, Jun 4, 2014 at 2:57 PM, Keith Helfrich <
-
bocce_teams_INTERSECTION jtd.twbx 153.7 KB
-
image.png 47.7 KB
-
image.png 23.1 KB
-
image.png 95.2 KB
-
-
9. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Jonathan DrummeyJun 4, 2014 1:43 PM (in response to Keith Helfrich)
One more thing:
The way "Only relevant values" works, it's dependent on other filter
selection and application. The only way I can think of have only relevant
values would be to use the multiple keyword dimensions that you've ruled
out due to performance issues. If you were using a smaller set of keywords,
then you might be able to use the multiple keyword dimensions and have
acceptable performance. Depending on what you're trying to do, the kind of
view that I set up with the treemap might be an alternative, where a set of
views could be nested with filter actions going from one to the next.
Jonathan
On Wed, Jun 4, 2014 at 4:35 PM, Jonathan Drummey <jonathan.drummey@gmail.com
-
10. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Ville Tyrväinen Jun 4, 2014 1:43 PM (in response to Jonathan Drummey)Very nice Jonathan
-
11. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Keith Helfrich Jun 4, 2014 4:13 PM (in response to Ville Tyrväinen)Yes. Very nice work. Thank you! This is a good demonstration of what it takes to be an expert in Tableau: the secret sauce is the ability to dissect the data available, using the tools available to build new data components which themselves can be used as tools, and then building upon those to reach your objective.
i.e. "often"
- you can do anything in Tableau
- when you think like MacGyver =)
Thanks, Jonathan! I'm looking forward to catching up on my reading in the articles you've linked to, and also putting this into play on the larger data set to see how it performs. Should anything interesting come up, I will certainly let you know.
Thanks!
-
12. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Jonathan DrummeyJun 5, 2014 6:10 AM (in response to Keith Helfrich)
You're welcome!
You touch on a subject that I've been thinking a lot about: A continuum of
working with Tableau is exploring vs. composing. On the exploratory side,
we’re dragging and dropping pills to see what happens. On the composing
side, we have a worksheet or dashboard in mind and are trying to work with
Tableau to reach the desired result. When composing, I find that I have a
different workflow:
1. Understand the goal.
2. Understand the data.
3. Conceive of how to best arrange the data & view to meet the goal (given
my knowledge of Tableau and working with data).
4. Do whatever data transformation that needs to be done outside of
Tableau, and build whatever calculated fields need to be build in Tableau.
5. Drag and drop pills to build the view.
So where an exploratory viz often starts with dragging and dropping pills,
a composed viz starts with more research and ends with dragging and
dropping pills. My forum and blog posts and the book I'm writing are very
much about helping Tableau users grow their knowledge of #3.
Jonathan
On Wed, Jun 4, 2014 at 7:13 PM, Keith Helfrich <
-
13. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Keith Helfrich Jun 5, 2014 5:05 PM (in response to Jonathan Drummey)Hi Jonathan,
It's true. The exploring vs. composing continuum is a source of disappointment for many. They get their hopes up & are later disillusioned by how complex Tableau work can really be. You've described the scenario very acutely with your post about the letdown. It's one I've referenced often with clients, and is a scenario I've observed many times (in myself, and in others). The frustrations are real.
One note about the solution you've provided so far, I do find one small problem. The logic breaks with no keywords in the filter. When there's no filter, then we would expect the dashboard to show all teams.
But instead of going to zero, the calculation for [# of Keywords Selected] actually skyrockets upward to the total distinct count of all keywords across the data set. And this, in turn, breaks the [INTERSECTION] calculation. Result being: no teams are shown.
I'm only now taking a look to see what can be done, but it occurs to me the answer may be to produce yet another boolean to evaluate whether [# of Keywords Selected] is equal to TOTAL(COUNTD([keyword])) across the entire population (regardless of any filtering). And to then combine this new BOOLEAN with the INTERSECTION to drive the filtering.
To reach this new boolean I'm thinking of, it is necessary to maintain percent of total whilst filtering, for which I often choose the duplicated data source approach. It's do-able, but clunky. So I wonder: can you think of any other way to make the vis work correctly when there's no filter at all ?
Thanks!
-
14. Re: Brainstorming Multi-Select Filters: INTERSECTION instead of UNION
Pankaj Chopra Jun 6, 2014 8:58 AM (in response to Keith Helfrich)Hi Guys
First of all thanks a lot for this post. Its an awesome post and solution for what looks like a simple problem.
But I have a follow up problem. Lets say we also add another field to the original bocce_teams file called Category and each booce_team has a category that is its a one to one relationship. I added a column Category to the excel file. You can consider Category as one of the attributes of a team for instance.
Now I not only want to select multiple keywords ( get INTERSECTION result) but I also want to add a Category value quick filter ( select one value or multiple values ) - note that in this case I want the filter to behave like an OR - It cannot be an AND anyways since its a one to one relationship with the team.
So for the solution posted which resulted in 6 teams team for keywords 3d and awesome
it should only show me those 6 categories ( when i hit relevant values) corresponding to the teams which I obtained but it doesnt. This confuses the end user.
Any ideas on how to incorporate this?
-
bocce_teams_INTERSECTION jtd.twbx 137.0 KB
-
Test file.csv.zip 55.5 KB
-
bocce_teams_INTERSECTION jtd.twb 162.1 KB
-