0 Replies Latest reply on Sep 23, 2019 12:18 AM by Curtis Frye

    Different Values Returned from R Script in Console vs. Calculated Field

    Curtis Frye

      Hi all,

       

      I'm trying to implement a random forest model in a calculated field using the randomForest package. The following R code returns a usable set of predictions when run in the console:

       

      library(randomForest)

      # Train the forest on the labelled glass data
      train <- read.table('F:\\GlassTrain.csv', sep = ',', header = TRUE)
      model <- randomForest(as.factor(Class) ~ Attrib1 + Attrib2 + Attrib3 + Attrib4 + Attrib5 + Attrib6 + Attrib7 + Attrib8 + Attrib9, data = train)

      # Predict the class of each test row from columns 2 to 10 (the nine attributes)
      test <- read.table('F:\\GlassTest.csv', sep = ',', header = TRUE)
      predictions <- predict(model, newdata = test[2:10])

       

      predictions

       1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41
       1  2  1  1  2  1  1  1  1  1  1  1  1  1  1  1  1  2  2  2  2  2  2  2  2  2  2  1  2  2  2  2  2  2  2  2  2  2  2  2  1
      42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
       3  1  1  3  2  7  5  2  5  6  2  6  5  2  6  7  7  2  7  7  7  7
      Levels: 1 2 3 5 6 7

       

      When I run the same script in the RForest calculated field of the attached .twbx workbook, I get a different set of values, and those values happen to match the actual values in the Class field that I'm trying to predict. I've attached the .twbx and the two .csv files so you can test it if you like. Any idea why the calculated field returns different values than the console?
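
      One possible cause worth ruling out (an assumption, since the calculated field's script is only in the attachment) is that the field ends up training and predicting on the same rows. A random forest scored on its own training rows usually reproduces the training labels almost exactly, while out-of-bag or held-out predictions do not. The following is a minimal sketch of that effect, using R's built-in iris data as a hypothetical stand-in for the Glass files; the variable names and the 100/50 split are illustrative only:

```r
library(randomForest)

set.seed(42)  # fix the RNG so repeated runs build the same forest

# Hypothetical stand-in for GlassTrain.csv / GlassTest.csv:
# split the built-in iris data into a training part and a held-out part
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

model <- randomForest(Species ~ ., data = train)

# Scoring the training rows themselves via newdata:
# agreement with the known labels is typically at or near 100%
acc_train <- mean(predict(model, newdata = train) == train$Species)

# Out-of-bag predictions (no newdata argument): each row is scored only
# by the trees that did not see it during training
acc_oob <- mean(predict(model) == train$Species)

# Scoring rows the model has never seen
acc_test <- mean(predict(model, newdata = test) == test$Species)

print(c(train = acc_train, oob = acc_oob, test = acc_test))
```

      If the calculated field shows the same pattern, with predictions that agree with Class on every row, the data passed to R from the workbook is probably the same data the model was fitted on. Note also that without a fixed seed, two runs of randomForest can give slightly different predictions, though that alone would not explain a perfect match.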

       

      Thanks!

       

      Curt