10 Replies Latest reply on Oct 14, 2018 10:30 PM by Prabhu H

    Real-time sentiment analysis

    Nathan Graham

      Hello, I am working on a Tableau project that will analyze the sentiment of survey comments. The comments are updated daily on our SQL server. I would like to set up an automated process that pulls the comment data from SQL, runs it through a sentiment analysis tool such as R or Microsoft Cognitive Services, and then makes the results available on our Tableau Server for reporting.

       

      Could anyone please help explain how to set up a connection like this?

       

      Thank you.

        • 1. Re: Real-time sentiment analysis
          Bora Beran

          You can use a script calculation in Tableau like the one below, assuming the field that contains the text is called Comment Text:

           

          SCRIPT_STR('library(syuzhet);

          len <- length(.arg1);

          result<-numeric(len);

          for (i in 1:len){

          token <- get_tokens(.arg1[i]);

          sentiment <- get_sentiment(token,method = "syuzhet")

          sentiment <- sentiment[sentiment!=0]

          result[i]<-mean(sentiment)

          }

          ifelse(result>0,"positive","negative")',

          ATTR([Comment Text]))

           

          If you want the sentiment score itself rather than a positive/negative label, you can use the following instead:

           

          SCRIPT_REAL('library(syuzhet);

          len <- length(.arg1);

          result<-numeric(len);

          for (i in 1:len){

          token <- get_tokens(.arg1[i]);

          sentiment <- get_sentiment(token,method = "syuzhet")

          sentiment <- sentiment[sentiment!=0]

          result[i]<-mean(sentiment)

          }

          result',

          ATTR([Comment Text]))

           

          Please make sure all the dimensions in your view are used for addressing (table calculation settings) for best performance.

           

          Another option to consider is to do the preprocessing outside Tableau, write the results into the database, and read the sentiment scores into Tableau like any other number.
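
As a rough sketch of that preprocessing route (not a tested pipeline): the helper names `score_comments`, `write_scores` and `syuzhet_scorer`, the column names, and the table name `comment_sentiment` are all made up for illustration, and `con` is assumed to be an open DBI connection to your SQL server. The zero-length guard is my own addition so that comments with no scored words get 0.

```r
# Score each comment with any function that maps one text to one number.
score_comments <- function(ids, comments, scorer) {
  scores <- vapply(comments, scorer, numeric(1), USE.NAMES = FALSE)
  data.frame(comment_id = ids, sentiment_score = scores)
}

# Per-comment logic from the Tableau calculation, using syuzhet.
syuzhet_scorer <- function(text) {
  s <- syuzhet::get_sentiment(syuzhet::get_tokens(text), method = "syuzhet")
  s <- s[s != 0]                 # keep only words the lexicon scored
  if (length(s) == 0) return(0)  # assumption: no scored words counts as 0
  mean(s)
}

# Hypothetical writer: `con` is an open DBI connection,
# "comment_sentiment" is an assumed table name.
write_scores <- function(con, scored) {
  DBI::dbWriteTable(con, "comment_sentiment", scored, overwrite = TRUE)
}
```

A script like this could be run on a schedule (for example with Rscript), after which Tableau only needs an ordinary connection to the results table.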

          • 2. Re: Real-time sentiment analysis
            Nathan Graham

            Hello Bora, Thanks for your help. I used your blog post to set up R within Tableau but had a couple of questions.

             

            1. How can I add a "neutral" category to the syuzhet script calculation you provided above?

             

            2. Can I summarize or count positive/negative results? I tried to set up a simple table like below, but was unable to use the data this way.

             

            Sentiment       Count

            Positive           15

            Negative          7

            Neutral             5

            • 3. Re: Real-time sentiment analysis
              Bora Beran

              Hi Nathan,

              The most flexible way I can think of for doing that is the following:

               

              SCRIPT_STR('library(syuzhet);

              len <- length(.arg1);

              result<-numeric(len);

              for (i in 1:len){

              token <- get_tokens(.arg1[i]);

              sentiment <- get_sentiment(token,method = "syuzhet")

              sentiment <- sentiment[sentiment!=0]

              result[i]<-mean(sentiment)

              }

              as.character(cut(result,breaks=c(-Inf, 0.3, 0.5, Inf),labels=c("negative","neutral","positive")))',

              ATTR([Comment Text]))

               

              With these breaks, a score of 0.3 or less is negative, a score above 0.3 and up to 0.5 (inclusive) is neutral, and a score above 0.5 is positive.
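
To see how `cut` treats the boundaries, here is a small base R check you can run outside Tableau. The scores are made-up values chosen to sit on and around the breaks.

```r
# Illustrative scores only; 0.3 and 0.5 sit exactly on the breaks.
scores <- c(-0.2, 0.3, 0.31, 0.5, 0.51)

# cut() uses right-closed intervals by default: (-Inf,0.3], (0.3,0.5], (0.5,Inf)
labels <- as.character(cut(scores,
                           breaks = c(-Inf, 0.3, 0.5, Inf),
                           labels = c("negative", "neutral", "positive")))

print(data.frame(scores, labels))
```

The break values are just a starting point; change them to move the neutral band.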

               

              I hope this helps.

               

              Bora

              • 4. Re: Real-time sentiment analysis
                Bora Beran

                If you would like to get counts of negatives or positives, the best option is to preprocess in the database: run the same R script, update the database with the results, and then use an aggregate such as COUNT on the sentiment category in Tableau.

                 

                If you want to do all of this in Tableau, it gets a bit more complicated after you get the counts.

                 

                Getting the counts is easy

                 

                SCRIPT_INT('library(syuzhet);

                library(plyr);

                len <- length(.arg1);

                result<-numeric(len);

                for (i in 1:len){

                token <- get_tokens(.arg1[i]);

                sentiment <- get_sentiment(token,method = "syuzhet")

                sentiment <- sentiment[sentiment!=0]

                result[i]<-mean(sentiment)

                }

                c<-as.character(cut(result,breaks=c(-Inf, 0.3, 0.5, Inf),labels=c("negative","neutral","positive")))

                count(c)$freq',

                ATTR([Comment Text]))

                 

                But this returns only 3 rows (one per category, if you have comments in each of the 3 categories), while your sheet has as many rows as you have comments. To make sure you always return the right number of rows, you can create a vector of NAs that is as long as the input and fill its first rows with the counts for negative, neutral and positive.

                 

                You can do this the following way.

                 

                SCRIPT_INT('library(syuzhet);

                library(plyr);

                len <- length(.arg1);

                result<-numeric(len);

                for (i in 1:len){

                token <- get_tokens(.arg1[i]);

                sentiment <- get_sentiment(token,method = "syuzhet")

                sentiment <- sentiment[sentiment!=0]

                result[i]<-mean(sentiment)

                }

                c<-as.character(cut(result,breaks=c(-Inf, 0.3, 0.5, Inf),labels=c("negative","neutral","positive")))

                f<-count(c)$freq;

                result <- rep(NA, length(.arg1));

                result[1:length(f)]<-f;

                result',

                ATTR([Comment Text]))

                 

                A little more massaging makes sure we always get 3 categories back, even if no comments fall into one of them.

                 

                SCRIPT_INT('library(syuzhet);

                library(plyr);

                len <- length(.arg1);

                result<-numeric(len);

                for (i in 1:len){

                token <- get_tokens(.arg1[i]);

                sentiment <- get_sentiment(token,method = "syuzhet")

                sentiment <- sentiment[sentiment!=0]

                result[i]<-mean(sentiment)

                }

                c<-as.character(cut(result,breaks=c(-Inf, 0.3, 0.5, Inf),labels=c("negative","neutral","positive")))

                counts<-count(c)

                result <- rep(NA, length(.arg1));

                cats<-data.frame(x=c("negative","neutral","positive"))

                f<-merge(x = cats, y = counts, by = "x", all.x = TRUE)$freq

                result[1:length(f)]<-f;

                result',

                ATTR([Comment Text]))

                 

                Now you have to add labels for negative, positive etc. and get rid of the NAs in your view, which is easier to show in a sample workbook.

                • 5. Re: Real-time sentiment analysis
                  Nathan Graham

                  Thanks Bora,  one other question,

                   

                  I'm getting a "Negative" score of -0.386 on a comment with the text: "Thank You! You all are doing a great job".

                  There are many others like this. Is there something inherent in the syuzhet package that causes this?

                   

                  When I use the other R package from your blog post ('Sentiment') I get a "positive" sentiment for the same comment.

                  • 6. Re: Real-time sentiment analysis
                    Bora Beran

                    Hi Nathan,

                    I updated the script and workbook in the above message. Can you give it another try? The issue was that syuzhet computed a score for each sentence individually and then added them up. The total therefore depended on the number of sentences as well as on their sentiment, so after scaling, a short comment with a perfectly positive sentence could look bad next to a long comment with a large total. I updated it to average the scores within a given message instead of summing them.
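
A small base R illustration of why summing favours long comments. The per-sentence scores below are invented for the example, not output from syuzhet.

```r
# Made-up per-sentence scores, for illustration only.
long_comment  <- c(0.5, 0.75, 0.25)  # three mildly positive sentences
short_comment <- c(0.75)             # one clearly positive sentence

# Summing rewards length: the long comment gets the higher score.
sum_long  <- sum(long_comment)
sum_short <- sum(short_comment)

# Averaging keeps both on the same per-sentence scale.
mean_long  <- mean(long_comment)
mean_short <- mean(short_comment)

print(c(sum_long = sum_long, sum_short = sum_short,
        mean_long = mean_long, mean_short = mean_short))
```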

                     

                    Thanks,

                     

                    Bora

                    • 7. Re: Real-time sentiment analysis
                      Nathan Graham

                      Thanks Bora, that calculation worked, but for some rows, I am receiving a score of '-1.#IND'

                       

                      For example, here is the comment text for a row that has a score of '-1.#IND':

                       

                            "I have not had a reply from MSC regarding this matter. It is very concerning to me. I know the matter has to be reviewed, I am not sure in what time frame I will receive a response."

                       

                      Any idea why this row would not receive a numeric score?

                       

                      Thanks for the help.

                      • 8. Re: Real-time sentiment analysis
                        Bora Beran

                        Hi Nathan,

                        For this comment the library is not finding any words that it recognizes as negative or positive, so once the zero scores are filtered out the vector is empty. The mean of an empty vector is not a number (NaN), which is what causes this result.

                         

                        If you add the following

                         

                        if(length(sentiment) == 0) sentiment <- 0

                         

                        in the line following this

                        sentiment <- sentiment[sentiment!=0]

                         

                        The issue should go away.
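
You can reproduce the problem and the fix in plain R without syuzhet. The all-zero vector below stands in for a comment where no word matched the lexicon, and NaN appears to be what Tableau shows as '-1.#IND'.

```r
# Stand-in for a comment where every word scored 0.
sentiment <- c(0, 0, 0)
sentiment <- sentiment[sentiment != 0]  # nothing left: numeric(0)

unguarded <- mean(sentiment)            # mean of an empty vector is NaN

# The guard: treat "no scored words" as a score of 0.
if (length(sentiment) == 0) sentiment <- 0
guarded <- mean(sentiment)

print(c(unguarded = unguarded, guarded = guarded))
```

Note that a score of exactly 0 falls into the "negative" bucket with the breaks used earlier, so you may want to adjust the breaks if these rows should count as neutral.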

                         

                        Thank you,

                         

                        Bora

                        • 9. Re: Real-time sentiment analysis
                          Nathan Graham

                          Perfect! Thank you Bora for all your help.

                           

                          One more question. Is there a Rserve package/calculation I could use to find key words or topics?

                          • 10. Re: Real-time sentiment analysis
                            Prabhu H

                            Hi Bora,

                             

                            I have used your calculation, but the output for some of the texts is blank; an example is below. Could you please take a look?

                             

                             

                            Likewise, I found that more than 200 texts were showing blank.

                             

                            I would really appreciate your answer.

                             

                            Regards,

                            Prabhu