3 Replies Latest reply on Jun 11, 2018 7:08 AM by Shinichiro Murakami

    Calculated field counting unknown keywords

    Lisa Bostrom Einarsson

      I have an issue that I have been trying to solve for the better part of a week now. I have a large database (in Google sheets) representing casestudies. I have some columns with multiple categories listed (in this example 'species', 'genera', and 'morphologies'), and I want to be able to tally how many times each category occurs in the data set. If a casestudy contained multiple categories (species/genus/morphologies) the data was entered in a single cell, separated by commas. These are the grey fiels in my google sheets, and columns wihtout numbers (E.g "Species", "Genera"). I've also split these into mulitple columns (Numbered fields e.g. Species 1, Species 2...). It doesn't really matter weather the analysis deals with the CSV column, or the split columns.

      The final output will be a large publc tableau. I know I can do a "find" based on the specific string, but I'd like the dataset to be dynamic and be able to handle new data being added without having to update calculated fields? Is there a way of finding uniqe terms (either from a single column of comma separated values, or from multiple columns), and tallying them?

      Things I have tried so far:

      1 - A pivot table in Tableau (pivoting on the split columns for each variable separately). Works well, but messes with all the other data, since it repeats lines.
      2 - A pivot table on its own data source in Tableau. Also works well, and avoids the problem of messing with the other data. However, now each figure (e.g species/genus/morphologies) is disconnected from the others so I can't do a large dashboard where everything is filtered by each other (ie filtering species and genera by country at the same time).
      3 - An SQL query() in google sheets, which finds all unique terms and queries them, which can then be plotted in Tableau. Also works well, but similar problem of the data being disconnected from all the other terms in the dataset.

      Any ideas of a field calculation that will find, list and tally unique terms in a single comma separated column (or across multiple columns), without changing the data structure?

      I have placed a sample data set here (google sheets. EDIT: link might not work, it's also attached!), which is a smaller version of what I'm actually working on. In it I have marked comma separated columns in grey, and they're followed by a bunch of columns with the values split into columns. I only need to analyse either of those (ie either a calculation to separate comma separate values or from multiple columns).

      I've also added a sample Tableau workbook here, which illustrates my issue with perfoming a pivot table (compare numbers in the two countries sheets.


      Edit: I should also mention that I'm on Tbaleau desktop for Mac, so custom SQL options are not available to me


      Message was edited by: Lisa Bostrom Einarsson [added .xlsx file to question]