This is really interesting, and I hope someone comes along who knows more about this than I do so I can learn something.
Unfortunately, my guess is that Tableau can't automatically identify which words show up most frequently in comments. However, if you already know a few words you want to monitor, you could make a calculated field like:
IF CONTAINS([Comments], "great") THEN "Great"
ELSEIF CONTAINS([Comments], "fast") THEN "Fast"
ELSEIF CONTAINS([Comments], "slow") THEN "Slow"
END
Then you could drag this calculated field to Label, drag Number of Records to Size, and set the mark type to Text. The only downside to this approach is that a comment like "Great service, very fast" would only be counted as "Great" and not as "Fast," because the calculation stops at the first condition that matches.
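To make the first-match behavior concrete, here is a minimal Python sketch of the same logic outside Tableau. The function name first_match and the lower-casing step are assumptions for illustration, not anything Tableau does for you.

```python
# Mirrors an IF / ELSEIF chain: the first keyword found wins, later ones are ignored.
KEYWORDS = ["great", "fast", "slow"]  # the example keywords from this thread


def first_match(comment):
    """Return the label of the first keyword found in the comment, or None."""
    text = comment.lower()  # assumption: match case-insensitively
    for word in KEYWORDS:
        if word in text:
            return word.capitalize()
    return None


label = first_match("Great service, very fast")
print(label)  # "fast" is never reached because "great" matched first
```

A comment that mentions several keywords is attributed to only one of them, which is the undercounting described above.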
To get around this, you could make one calculated field for each of the words of interest and make the titles of these calculations the words themselves:
IF CONTAINS([Comments], "great") THEN 1 END
IF CONTAINS([Comments], "fast") THEN 1 END
IF CONTAINS([Comments], "slow") THEN 1 END
Then drag Measure Names to Filters and select all of these fields, drag Measure Values to Size, and drag Measure Names to Text. When I tried this myself the formatting was odd, but it may be a good jumping-off point to build from.
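The one-field-per-word idea amounts to counting each keyword independently. A rough Python sketch of that counting, assuming case-insensitive substring matching and a hypothetical helper named keyword_counts:

```python
from collections import Counter

KEYWORDS = ["great", "fast", "slow"]  # the example keywords from this thread


def keyword_counts(comments):
    """Count, per keyword, how many comments contain it (each checked independently)."""
    counts = Counter()
    for comment in comments:
        text = comment.lower()  # assumption: match case-insensitively
        for word in KEYWORDS:
            if word in text:
                counts[word] += 1
    return counts


counts = keyword_counts(["Great service, very fast", "Too slow", "Fast delivery"])
print(counts)
```

Here the first comment contributes to both "great" and "fast", unlike the single IF/ELSEIF field.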
This is also something I find quite interesting. As far as I am aware there is no way to do this well in Tableau.
I would be cautious when approaching this as you may pick up the word "great" from a sentence and include that in your word cloud, but not pick up a "not" before it. They may actually be saying "The service was not great". They may also be using sarcasm in their response. The point is that you may be including these words in your word cloud without understanding the actual sentiment.
If you are going to extract words from their context, you may want to consider using a sentiment analysis tool; otherwise, you may be reporting incorrect results.
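As a small illustration of the "not great" problem, the sketch below flags keyword hits that directly follow a negating word. This is a crude heuristic with a made-up negator list, not a sentiment analysis tool, and it will miss sarcasm, contractions like "wasn't", and negations further away in the sentence.

```python
NEGATORS = {"not", "never", "no"}  # hypothetical and far from complete


def keyword_hits(comment, keyword):
    """Return one boolean per occurrence of keyword: True if the previous word negates it."""
    words = [w.strip(".,!?\"'").lower() for w in comment.split()]
    hits = []
    for i, word in enumerate(words):
        if word == keyword:
            negated = i > 0 and words[i - 1] in NEGATORS
            hits.append(negated)
    return hits


print(keyword_hits("The service was not great", "great"))
print(keyword_hits("Great service", "great"))
```

Both comments would add "great" to a word cloud, but only the second one means it.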
Thanks Benjamin, for the guidance.
Thanks Ben. Yes, it is tricky, and I doubt it can be done.
Because the comments field contains free-form user input, each comment can have any number of words, which makes this more challenging.
This is quite different from using a dimension field that contains predefined values.
Also, common words like "the", "was", and "is" have to be ignored.
It is doable through a hack/workaround, but I would highly recommend processing the data first in software like R or Alteryx. There are predefined algorithms that will automatically remove stop words, combine words by their root form, etc.
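For anyone who wants to see what that preprocessing looks like, here is a minimal sketch in plain Python rather than R or Alteryx. The stop-word set is a tiny hypothetical sample, the function name word_frequencies is made up, and combining words by their root form is not attempted.

```python
import re
from collections import Counter

# Hypothetical sample; a real stop-word list would be much longer.
STOP_WORDS = {"the", "was", "is", "a", "an", "and", "of", "you", "your"}


def word_frequencies(comments):
    """Count words across all comments, skipping stop words."""
    counts = Counter()
    for comment in comments:
        # Lower-case, then keep runs of letters and apostrophes as words.
        for word in re.findall(r"[a-z']+", comment.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts


freq = word_frequencies(["The service was great", "Great price and the delivery was fast"])
print(freq.most_common(3))
```

The resulting word and count pairs could be written to a file and used as a Tableau data source, with the word on Text and the count on Size.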
However, in the spirit of "anything is possible in Tableau":
Since you want each item in your view to be a single word, you need some way to create a dimension/pill that lists out each individual word. You can do this using data densification: join the original table to another table of word indexes (e.g. 1-100), so that every comment is repeated once per index. Then feed that index through a calculated field to parse out the individual words:
TRIM(MID([Text], FINDNTH([Text]," ",[Word]-1), FINDNTH([Text]," ",[Word])-FINDNTH([Text]," ",[Word]-1)))
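The index-join idea can be sketched outside Tableau as follows. This is only an illustration of the logic, assuming words are separated by single spaces; nth_word and densify are hypothetical helper names, and max_words plays the role of the index table's upper bound.

```python
def nth_word(text, n):
    """Return the n-th space-delimited word (1-based), or None if there is none."""
    words = text.split(" ")
    if 1 <= n <= len(words) and words[n - 1]:
        return words[n - 1]
    return None


def densify(comments, max_words=20):
    """Pair every comment with indexes 1..max_words and keep the words that exist."""
    rows = []
    for comment in comments:
        for index in range(1, max_words + 1):
            word = nth_word(comment, index)
            if word is not None:
                rows.append((index, word))
    return rows


rows = densify(["good fast service", "slow"], max_words=2)
print(rows)  # "service" is dropped because only indexes 1-2 were joined
```

Any word beyond the highest index is silently lost, so the index table needs to be at least as long as the longest comment.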
I would also recommend joining this up to a list of stop words to be able to remove the non-descriptive words like You, your, an, the, etc. Here is one list that could be used:
Also, I used placeholder text from the Tableau documentation website, so the word cloud is not terribly interesting, and I only indexed words 1-20 for each comment. You would likely need to increase this if you have lengthy comments or reviews.
Mason, that's a neat solution when preprocessing the data isn't a viable option. Unfortunately, FINDNTH is not available for all data source types.
Here's an easy solution with only 2 easy steps, all in Tableau and no clunky calculated fields:
When you import your data, select the Comment column, right-click, and select "Custom Split". Use a space as the separator and split off "All" columns. Each word in the comment will be pulled out and put in a separate column. Tableau will automatically create enough columns to cover the longest comment in your data set, so you aren't limited to the first N words like some other solutions.
Next, pivot the data**. Select all the newly created word columns, right-click, and select "Pivot". Now you have a single column with every individual word from each comment, and you can start transforming it. Best of all, if there are other dimensions in your data (product, date, region, etc.), you can use those in your visualization. What were the comment trends over time? Were your efforts to address "quality" showing up in user feedback? What are people saying about product X over time, by customer segment?
You will still need to manually filter out 'the' / 'has' / 'of' / etc. But this isn't too difficult, and can be done with a simple filter.
**Note: When I've done this, Tableau doesn't always let you pivot the split data. If you don't see the pivot option after you split the comment field, export the data to CSV (Data -> export to CSV), re-import it, and then pivot the columns.
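For reference, the shape of the data after the split and pivot can be sketched in Python. The column names Comment, Product, and Word are assumptions for illustration, and split_and_pivot is a hypothetical helper, not a Tableau feature.

```python
def split_and_pivot(rows, text_key="Comment"):
    """Turn one row per comment into one row per word, keeping the other columns."""
    long_rows = []
    for row in rows:
        other = {key: value for key, value in row.items() if key != text_key}
        for word in row[text_key].split(" "):
            if word:  # skip empty strings produced by repeated spaces
                long_rows.append({**other, "Word": word})
    return long_rows


rows = [
    {"Product": "X", "Comment": "great quality"},
    {"Product": "Y", "Comment": "slow"},
]
long_rows = split_and_pivot(rows)
for long_row in long_rows:
    print(long_row)
```

Because each word keeps its original Product value, the word column can be sliced by any other dimension in the view.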