2 Replies Latest reply on Oct 24, 2016 6:22 PM by mark schukas

    Text Mining with LDA method

    mark schukas

      I'd like to run an LDA (latent Dirichlet allocation) model inside Tableau.

       

      Here's how I do it in R:

       

      #tda
      library(topicmodels)  # latent Dirichlet allocation

      k <- 2     # number of topics
      SEED <- 1  # fixed seed for reproducibility
      my_TM <-
        list(VEM = LDA(dtm, k = k, control = list(seed = SEED)),
             VEM_fixed = LDA(dtm, k = k,
                             control = list(estimate.alpha = FALSE, seed = SEED)),
             Gibbs = LDA(dtm, k = k, method = "Gibbs",
                         control = list(seed = SEED, burnin = 1000,
                                        thin = 100, iter = 1000)),
             CTM = CTM(dtm, k = k,
                       control = list(seed = SEED,
                                      var = list(tol = 10^-4), em = list(tol = 10^-3))))

       

      How can I run this from within Tableau?

       

      thank you.
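
      For readers following along, here is a minimal sketch of how one fitted model of this kind can be inspected in plain R. It is not part of the original post: it assumes a small slice of the AssociatedPress data that ships with topicmodels as a stand-in for `dtm`, and the names `fit`, `top_terms`, `doc_topics`, and `doc_probs` are illustrative only.

```r
library(topicmodels)

# Assumption: a 20-document slice of the bundled AssociatedPress
# document-term matrix stands in for the poster's own dtm.
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:20, ]

k <- 2
SEED <- 1

# Same call shape as the VEM entry in the list above
fit <- LDA(dtm, k = k, control = list(seed = SEED))

top_terms  <- terms(fit, 5)          # 5 most likely terms per topic (5 x k matrix)
doc_topics <- topics(fit)            # most likely topic for each document
doc_probs  <- posterior(fit)$topics  # per-document topic probabilities (documents x k)
```

      `doc_topics` holds one value per document, which is the shape of result that could later be handed back to a tool expecting one value per row.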

        • 1. Re: Text Mining with LDA method
          mark schukas

          I've tried to pare down the code and just create the corpus:

          (Answer is a Dimension)

           

          Error:  "Error in paste(Answer, collapse = " ") : object 'Answer' not found "

           

           

           

          library(tm)  # Corpus, tm_map, DocumentTermMatrix; stemDocument also needs the SnowballC package installed

          review_text <- paste(Answer, collapse = " ")  # paste every review together, separated by a space
          review_source <- VectorSource(review_text)    # set up the source
          corpus <- Corpus(review_source)               # create the corpus
          corpus <- tm_map(corpus, content_transformer(tolower))  # clean the text
          corpus <- tm_map(corpus, content_transformer(removePunctuation))
          corpus <- tm_map(corpus, stripWhitespace)
          corpus <- tm_map(corpus, removeWords, stopwords("english"))
          corpus <- tm_map(corpus, stemDocument)
          dtm <- DocumentTermMatrix(corpus)             # create the document-term matrix

           

          thank you.

          • 2. Re: Text Mining with LDA method
            mark schukas

            any thoughts...?
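
            One likely cause of the "object 'Answer' not found" error: R code that Tableau sends to Rserve through the SCRIPT_* calculated-field functions cannot see Tableau fields by name. Fields are passed as arguments and appear inside the script as `.arg1`, `.arg2`, and so on, and the script is expected to return either a single value or one value per input row. The sketch below is a rough illustration under several assumptions, not a tested Tableau setup: `assign_topics` is a hypothetical helper, each answer is treated as its own document (rather than pasted into one string) so that LDA has several documents to work with, and stemming is left out for brevity.

```r
library(tm)
library(topicmodels)

# Hypothetical helper: returns one topic number per input answer,
# or NA for answers that have no terms left after cleaning.
assign_topics <- function(answers, k = 2, seed = 1) {
  corpus <- VCorpus(VectorSource(answers))  # one document per answer
  corpus <- tm_map(corpus, content_transformer(tolower))
  corpus <- tm_map(corpus, removePunctuation)
  corpus <- tm_map(corpus, removeWords, stopwords("english"))
  corpus <- tm_map(corpus, stripWhitespace)
  dtm <- DocumentTermMatrix(corpus)

  # LDA() rejects documents with no terms, so fit only on non-empty rows
  keep <- slam::row_sums(dtm) > 0
  fit <- LDA(dtm[keep, ], k = k, control = list(seed = seed))

  out <- rep(NA_integer_, length(answers))
  out[keep] <- as.integer(topics(fit))
  out
}

# Assumed Tableau calculated field (Tableau syntax, shown as a comment),
# supposing assign_topics is already defined in the Rserve session:
#   SCRIPT_INT("assign_topics(.arg1, k = 2)", ATTR([Answer]))
```

            In plain R the helper can be tried with a character vector, for example `assign_topics(c("great battery life", "slow shipping and a damaged box"))`. In Tableau, the view would presumably need Answer at its level of detail so that each answer reaches R as a separate element of `.arg1`.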