5 Replies Latest reply on Sep 7, 2018 11:41 AM by Vivek Shekar

    Accessing R's mvoutlier but error while executing in Tableau

    Vivek Shekar

      Dear Community,

       

      I am using a sample superstore data to create some basic outlier calculation using SCRIPT_INT as shown below:

      SCRIPT_INT( "library(mvoutlier); sign1(data.frame(.arg1,.arg2))$wfinal01", SUM([Sales]), SUM([Profit]))

       

      It says calculation is valid, but when I use that in the view, it gives me the following error

      An error occurred while communicating with the RServe service.

      Error in sign1(data.frame(.arg1, .arg2)) : More than 50% equal values in one or more variables!

       

      I am attaching the tableau packaged workbook as well as the error screen shot.

      Can someone help me on this ASAP please?

        • 1. Re: Accessing R's mvoutlier but error while executing in Tableau
          Nathan Mannheimer

          Hi Vivek,

           

          There seem to be two issues here: the first error you are seeing is the result of how the table calculation in Tableau is computing the R script. Table calculations will aggregate data to the level of detail in the visualization, in your case Customer Name. They can also be set to compute over different windows, which can cause the calculation to restart when a new window begins. In this case, the window is being set to restart for each Customer Name, meaning that the vectors being passed to R are of length 1, ie all values in the list are identical. For more information on this, check out Rachel's blog post on this topic.

           

          This can be resolved by right clicking on the Outlier function and selecting edit table calculation:

          Table Calc R 1.png

          You can the select Specific Dimensions. Any dimension checked in this menu will not cause the calculation to restart. When Customer Name is checked, we will pass all sales and profit values to R in vectors of length 793:

          Table Calc R 2.png

          This led to a second issue with the code: the sign1 function seems to expect at least 3 different variables. When run with 2 variables it throws a linear algebra matrix multiplication error. The sign2 function calculates the distances slightly differently, but should produce essentially the same results and does not have this error when calculating outliers. I used the following code:

           

          SCRIPT_INT(

          "library(mvoutlier);

          sign2(data.frame(.arg1,.arg2),makeplot=FALSE)$wfinal01",

          SUM([Sales]), SUM([Profit]))

           

          You can see the results in the attached workbook!

           

          Sincerely,

          Nathan

          • 2. Re: Accessing R's mvoutlier but error while executing in Tableau
            Vivek Shekar

            Thanks a lot for your reply on this issue. I really appreciate it.

            • 3. Re: Accessing R's mvoutlier but error while executing in Tableau
              Vivek Shekar

              Just one more question, according to Rachel's blog post on this topic:

              It says when you do Table across, then Tableau will send all the Data points on the view as a vector to R server and gets back the results.

              However if I do Table Across with this example then it gives me the following error:

              Error in sign2(data.frame(.arg1, .arg2), makeplot = FALSE) : More than 50% equal values in one or more variables!

               

              So it is contrary to what Rachel's post say. Could you please explain a little bit on that?

              • 4. Re: Accessing R's mvoutlier but error while executing in Tableau
                Nathan Mannheimer

                Hey Vivek,

                 

                Using the table down or table across commands will essentially build the vector in the direction across or down the table, one row or column at a time. In Rachel's example, she is using a time series which creates a very wide table as dates move forward in time, one column for each unique date member. Going table across creates a time-series vector with a length of the number of date members.

                 

                In your case, the 'table' underlying the data has only two columns, Sales and Profit, and a number of rows equal to the number of Customer Names. Going table across tries to make a unique call to R for each customer with a vector that has length 2: the Sales and Profit values for that customer. In my experience, using the Specified Dimensions option can be a great way to ensure the calculation works correctly if you aren't sure exactly what the 'table' structure may look like.

                 

                For any viz, if you do want to see what the table that Tableau uses for table calculations looks like, right click on the sheet name in the bottom of the view and select Duplicate as Crosstab:

                Crosstab1.png

                Crosstab2.png

                1 of 1 people found this helpful
                • 5. Re: Accessing R's mvoutlier but error while executing in Tableau
                  Vivek Shekar

                  Dear Nathan,

                   

                  Thanks a lot:) it was very helpful.