8 Replies Latest reply on Jul 26, 2019 11:09 AM by Nathan Mannheimer

    R Integration With Tableau - R Script to Calculated Field

    Erika Pan

      I’ve been using the R script below and I was thinking whether there’s a way to implement it in Tableau:

       

      Build a Simple Cosine Similarity Search Engine in R

       

      I’ve seen a lot of r integration examples with tableau but most that I’ve seen were simple examples/ applications. I’m not too sure if tableau can handle functions similar to the example in the link.

       

      I envision the program to work as such:

       

      I import data to tableau with a field  that contains thousands of book titles

       

      I have a blank search bar in tableau where I can enter a phrase similar to what is shown in the link.

       

      once entered, tableau will filter the top (50, 100, etc..) book titles and show these  records in tableau.

       

      Tableau will serve as the visualization tool for the characteristics of the generated data. (e.g, pareto charts, cross tab showing actual book titles, etc)

       

      Is this possible? in short - create a search engine type of program in tableau and harness its visualization capability.

        • 1. Re: R Integration With Tableau - R Script to Calculated Field
          Nathan Mannheimer

          Hi Erika,

           

          Since your script looks like it would return a single value, ie the similarity score, this should be possible with Tableau's R integration.

           

          Tableau doesn't have any limitations on the complexity of the R script you run, it only requires that you return either a constant result, or one result for each row of data passed. In this case, you would be returning a similarity score for each title so it would work well.

           

          I would imagine your script would look something like this:

           

          SCRIPT_REAL("<your R code>", ATTR([Book Title]), [String Parameter])

           

          Where your R code takes in a vector of book titles as well as the string constant from a parameter and then your algorithm assigns a score to each book and returns them as a vector. You could then use a rank table calculation to filter the results.

           

          My video from TC18, while mostly focusing on Python has R examples as well, and should be a good guide to how to write your script for Tableau and how to integrate parameters: Data science applications with TabPy/R - YouTube

           

          Happy to chat further as well.

           

          -Nathan

          • 2. Re: R Integration With Tableau - R Script to Calculated Field
            Erika Pan

            Thank you! I attempted to replicate and I was unsuccessful. It's probably because I'm not there yet to fully understand the integration. I am getting this error when I run the script:

             

            Error in base::parse(text = .cmd) : <text>:83:1: unexpected symbol 82: 83: Return ^

             

            It appears tableau/r is not recognizing the return function near the end of my script.

             

            I created a calculated field named Score and entered the script below. My expectation is the score would be generated once I add it as a row in Tableau together with Unique Id, etc... and would change values when I enter a text in the string parameter field I created.

             

            Lastly, for the parameter, I followed the convention through this link: Is it possible to use parameters within R scripts in calculated fileds? It says that the argument needs to have a [1] after the .arg

             

            For some reason, there's something I'm not doing right causing the error:

             

             

             

            SCRIPT_REAL("

            library(tidytext)

            library(dplyr)

            library(tm)

            library(SnowballC)

            library(dplyr)

            library(stringr)

            library(shiny)

             

            # Description and Issue Titles are characters

            LongText <- as.character(.arg1 )

            IssueTitle <-as.character(.arg2)

             

             

            #combine title and long text

            Combined.Text <- (paste(IssueTitle, LongText, sep = ' '))

             

             

            #ensures all text characters are readable

            Combined.Text <- sapply(Combined.Text, function(x) iconv(enc2utf8(x), sub = 'byte'))

             

             

             

             

            data <- data.frame(Combined.Text) %>%

              mutate(Combined.Text=gsub('(http|https).+$|\\n|&amp|[[:punct:]]',' ',Combined.Text),

                     rowIndex=as.numeric(row.names(.))) %>%

              select(Combined.Text,rowIndex)

             

             

            docList <- as.list(data$Combined.Text)

            N.docs <- length(docList)

             

             

            QrySearch <- function(queryTerm)

             

             

             

             

            {

             

              # store docs in Corpus class which is a fundamental data structure in text mining

              my.docs <- VectorSource(c(docList, queryTerm))

             

             

              # Transform/standaridze docs to get ready for analysis

              my.corpus <- VCorpus(my.docs) %>%

                tm_map(stemDocument) %>%

                tm_map(removeNumbers) %>%

                tm_map(content_transformer(tolower)) %>%

                tm_map(removeWords,stopwords('en')) %>%

                tm_map(stripWhitespace)

               

             

              # Store docs into a term document matrix where rows=terms and cols=docs

              # Normalize term counts by applying TDiDF weightings

              term.doc.matrix.stm <- TermDocumentMatrix(my.corpus,

                                                        control=list(

                                                          weighting=function(x) weightSMART(x,spec='ltc'),

                                                          wordLengths=c(1,Inf)))

             

             

             

              # Transform term document matrix into a dataframe

              term.doc.matrix <- tidy(term.doc.matrix.stm) %>%

                group_by(document) %>%

                mutate(vtrLen=sqrt(sum(count^2))) %>%

                mutate(count=count/vtrLen) %>%

                ungroup() %>%

                select(term:count)

              docMatrix <- term.doc.matrix %>%

                mutate(document=as.numeric(document)) %>%

                filter(document<N.docs+1)

              qryMatrix <- term.doc.matrix %>%

                mutate(document=as.numeric(document)) %>%

                filter(document>=N.docs+1)

             

             

             

              # Calcualte top  results by cosine similarity

              searchRes <- docMatrix %>%

                inner_join(qryMatrix,by=c('term'='term'),

                           suffix=c('.doc','.query')) %>%

                mutate(termScore=round(count.doc*count.query,4)) %>%

                group_by(document.query,document.doc) %>%

                summarise(Score=(sum(termScore)) %>%

                arrange(desc(Score)) %>%

                left_join(data,by=c('document.doc'='rowIndex')) %>%

                ungroup() %>%

                select(Score) %>%

                data.frame()

             

             

            Return(searchRes)

             

             

             

             

             

             

             

             

            QrySearch('.arg3[1]')",

             

             

            ATTR([Long Text]), ATTR([Issue Title]), [String Parameter])

            • 3. Re: R Integration With Tableau - R Script to Calculated Field
              Nathan Mannheimer

              Hi Erika,

               

              For R, you don't need to include Return(), that's only needed for Python. I think the error is throwing becaust Return() isn't recognized in R.

               

              If you end your scrip with searchRes it should work.

               

              You will also need to have [Long Text] and [Issue Title] as dimensions in your view and set the table calculation to address those dimensions as in the video, so that it passes each field as one long vector.

              • 4. Re: R Integration With Tableau - R Script to Calculated Field
                Erika Pan

                Hi, Nathan

                 

                I tried it but it still didn't work. I feel I'm doing something wrong but couldn't pinpoint the error. Another thing I did was rather than do a read.csv in tableau e.g.,

                 

                books <- read.csv("")

                 

                books$long.text

                book$issue.title

                 

                I made modifications in the first few lines of the script (refer to my prior post) as my assumption is that the data is already directly getting called from tableau and the above codes were now unnecessary. I replaced with what I have posted prior.

                 

                I could send my .twb file (replaced with sample data) if that would work. However, I don't see an option to attach the file.

                 

                Unfortunately, I'm still using Tableau 10.2

                 

                Thanks,

                • 5. Re: R Integration With Tableau - R Script to Calculated Field
                  Nathan Mannheimer

                  Hi Erika,

                   

                  Can you send your workbook and some sample data to nmannheimer@tableau.com?

                   

                  Thanks,

                  Nathan

                  • 6. Re: R Integration With Tableau - R Script to Calculated Field
                    Erika Pan

                    Hi, Nathan

                     

                    I did some tweaking with the formula and now this is what I'm getting:

                     

                    The result returned by the SCRIPT function is of an unexpected type.

                     

                    I found out how to attach the workbook to the original thread.

                     

                     

                     

                    Thanks in advance,

                    • 7. Re: R Integration With Tableau - R Script to Calculated Field
                      Erika Pan

                      Nathan Mannheimer I've attached the file to my previous reply. Thank you!

                      • 8. Re: R Integration With Tableau - R Script to Calculated Field
                        Nathan Mannheimer

                        Hi Erika,

                         

                        Thanks, I'll take a look at this ASAP.

                         

                        -Nathan