R Implementation Notes

Version 1

    R is an open source programming language and software environment for statistical computing and graphics. If Tableau Desktop (version 8.1 and later) has a connection to an Rserve server, which is a server that allows applications to access R functionality, you can use a set of four functions (SCRIPT_REAL, SCRIPT_INT, SCRIPT_STR, and SCRIPT_BOOL) to pass R expressions to the Rserve server and obtain a result. Similarly, if you upload a workbook that contains R functionality to Tableau Server, then Tableau Server must have a connection to an Rserve server. See R Connection, in the Tableau Desktop help, for details. R is not supported for Tableau Online or Tableau Reader.

     

    This post describes how to install, run, and configure Rserve. It also discusses Optimizing R Scripts and R Security.

     

    To install Rserve

    1. Go to http://cran.r-project.org/bin/windows/base/, and download the current version of R for Windows.

    2. Run the downloaded executable to install R.

    3. Open an R console window, for example by choosing R > R x64 3.0.2 from the Start menu.

      Version numbers are subject to change.

    4. In the R console, type:

      install.packages("Rserve")

    5. When prompted to select a CRAN mirror, select a location near you.

    6. Note the location to which the package has been downloaded.

    7. Unzip the Rserve package ("Extract all files.")

    8. Add the directory containing R.dll to your path environment variable. For example, C:\Program Files\R\R-3.0.2\bin\x64.
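
    As a quick check after step 8, the following sketch can be run in the R console to confirm that the Rserve package can be found and to print the directories mentioned in the steps above. It uses only base R functions; the helper name check_rserve_install is hypothetical, and the exact paths it reports depend on your installation.

```r
# Hypothetical helper: report whether Rserve is installed and where things live.
check_rserve_install <- function() {
  # system.file() returns "" when the package (or the subdirectory) is not found.
  pkg_dir <- system.file(package = "Rserve")
  list(
    installed   = nzchar(pkg_dir),
    package_dir = pkg_dir,
    # Subdirectory of the package that holds its compiled binaries, if present.
    libs_dir    = system.file("libs", package = "Rserve"),
    # Directory that holds the R executables for the running build.
    r_bin_dir   = R.home("bin")
  )
}

str(check_rserve_install())
```

    If installed is FALSE, repeat step 4 before continuing.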

    To run Rserve

    Enter a series of commands like the following:

    setwd("C:/Program Files/R/R-3.0.2/bin/x64")

    library(Rserve)

    Rserve()

     

    The setwd command sets the working directory for R, so that Rserve knows where to look for the .cfg file and the authentication file.

    When you run Rserve from the R Console, it remains resident in memory, and is able to serve requests from Tableau workbooks even if you close the R Console. You can use Windows Task Manager to verify that Rserve is running.
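
    As an alternative to Task Manager, you can probe the server from a second R session. The sketch below is a minimal check in base R under two assumptions: Rserve is listening on its default port, 6311, and it greets each new connection with an identification string that begins with "Rsrv". The function name rserve_is_up is hypothetical.

```r
# Hypothetical helper: TRUE if something that looks like Rserve answers on host:port.
rserve_is_up <- function(host = "localhost", port = 6311L, timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+b",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL  # connection refused or timed out
  )
  if (is.null(con)) return(FALSE)
  on.exit(close(con))
  # Read the greeting and compare its first four bytes with "Rsrv".
  id <- readBin(con, what = "raw", n = 32L)
  length(id) >= 4L && identical(id[1:4], charToRaw("Rsrv"))
}

rserve_is_up()
```

    A result of FALSE means either that nothing is listening on that port or that the listener did not identify itself as Rserve.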

     

    Alternatively, you can run Rserve directly from Windows, without using the R Console, by executing Rserve.exe.

     

    After you do so, you see a message in a command prompt window that reads:

    Rserve: Ok, ready to answer queries.

     

    When you close the command prompt window, Rserve terminates as well.

    Running Rserve automatically

    To start Rserve automatically every time you start Windows, copy Rserve.exe to the Startup folder in Windows.

     

    To suppress the command prompt window that otherwise appears when you run Rserve.exe from Windows, create a shortcut in your Startup folder like the following.

    C:\Windows\System32\wscript.exe "C:\Program Files\R\R-3.0.1\bin\x64\invis.vbs" "C:\Program Files\R\R-3.0.1\bin\x64\Rserve.exe" %*

     

    In this shortcut:

    • wscript.exe is the Windows script host

    • invis.vbs is a file you’ll need to create containing the following VBScript code, which starts Rserve with a hidden window:

      Set WshShell = WScript.CreateObject("WScript.Shell")

      ' Window style 0 hides the window; False returns without waiting for Rserve to exit.
      WshShell.Run """" & WScript.Arguments(0) & """", 0, False

    Running Rserve in debug mode

    Rserve has a debug mode which you can start by running Rserve_d.exe instead of Rserve.exe in Windows or by typing Rserve(TRUE) in the R Console instead of Rserve().

    Debug mode prints all the traffic between Tableau and Rserve (scripts, data, error messages) to the console. This is useful because traffic between Tableau and Rserve is not captured in Tableau logs.

    Advanced Rserve configuration options

    If you create a file named Rserv.cfg you can use it to customize Rserve. For detailed information on Rserv.cfg, see http://www.rforge.net/Rserve/doc.html#conf.

     

    If you are running Rserve from the R console, the Rserv.cfg config file must be in the current working directory—that is, the same directory as R.dll. This is the working directory you specified earlier with the setwd command. Use getwd to verify the current working directory.

     

    If you are running Rserve.exe from the command line, the Rserv.cfg file must be in the same directory as Rserve.exe:

    <drive>:\<installation_path>\R\R-<version_number>\library\Rserve\libs\i386 or \x64

     

    The sections below discuss some of the options available with Rserv.cfg.

    Using a remote Rserve with password protection

     

    You can connect to an Rserve process on a remote computer if you install Rserve on the remote computer as per the preceding instructions, and then add the following lines to the Rserv.cfg file on the remote computer:

    remote enable

    plaintext enable

     

    If you are using a remote Rserve, consider configuring it to require a password. However, only plain text authentication is supported. In this case, instead of the preceding, add lines like the following to your Rserv.cfg file:

    pwdfile C:/Program Files/R/R-3.0.1/bin/x64/RserveAuth.txt

    remote enable

    auth required

    plaintext enable

     

    The pwdfile should be in the format

    myusername mypasswd
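
    If you prefer to generate these two files from R rather than edit them by hand, a minimal sketch follows. The helper name write_rserve_auth_config and its arguments are hypothetical; the configuration lines it writes are the ones shown above, and the password is stored unencrypted, as the plaintext option implies.

```r
# Hypothetical helper: write a password file and a matching Rserv.cfg into `dir`.
write_rserve_auth_config <- function(dir, user, password,
                                     pwd_name = "RserveAuth.txt") {
  pwd_path <- file.path(dir, pwd_name)
  # One "username password" pair per line.
  writeLines(paste(user, password), pwd_path)

  cfg <- c(
    # Forward slashes keep the path in the same style as the example above.
    paste("pwdfile", normalizePath(pwd_path, winslash = "/")),
    "remote enable",
    "auth required",
    "plaintext enable"
  )
  cfg_path <- file.path(dir, "Rserv.cfg")
  writeLines(cfg, cfg_path)
  invisible(cfg_path)
}

# Example: write both files to a scratch directory and inspect the result.
scratch <- tempfile("rserve_cfg_")
dir.create(scratch)
cfg_file <- write_rserve_auth_config(scratch, "myusername", "mypasswd")
readLines(cfg_file)
```

    In practice, dir would be the directory in which Rserve looks for Rserv.cfg, as described under Advanced Rserve configuration options.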

    Running R expressions from Rserv.cfg

    You can use your Rserv.cfg file to preload packages that you would otherwise have to load from your scripts. To do this, add the keyword eval to your configuration file and follow it with an expression.

     

    For example, to load the mvoutlier library, create an object x, and set its value to 10, add a line like the following.

    eval library("mvoutlier"); x = 10;

     

    Alternatively, you could put your R code in a text file and then call it from Rserv.cfg, using the keyword source:

    source C:/MyFolder/sample.R
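
    What goes into the sourced file is up to you. As an illustration only, the sketch below shows one possible sample.R that attaches the packages it can find and reports the missing ones instead of stopping. The function name preload_packages is hypothetical, and the package list is an example.

```r
# Hypothetical contents of C:/MyFolder/sample.R.
# Attach each package that is installed; report the missing ones with a message
# rather than an error, so that the rest of the file still runs.
preload_packages <- function(pkgs) {
  loaded <- vapply(
    pkgs,
    function(p) suppressWarnings(
      require(p, character.only = TRUE, quietly = TRUE)
    ),
    logical(1)
  )
  if (any(!loaded)) {
    message("Not installed: ", paste(pkgs[!loaded], collapse = ", "))
  }
  invisible(loaded)
}

preload_packages(c("stats", "mvoutlier"))
x <- 10
```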

    Rserve for multiple users

    If more than one user will be using an instance of Rserve, consider running Rserve on a Linux computer. In Windows, sessions are shared and one user can interfere with another. For example, if two users are running R scripts and both define a variable X in R, then when one user updates the variable, the new value overwrites the value for the other user. This is not an issue when Rserve runs on Linux.

    Every calculation in Linux has its own session, which means that variables in the same workbook will not be shared when Rserve is running on Linux.
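
    The difference can be illustrated without Rserve at all. The sketch below is only a simulation in plain R: an environment stands in for a session, and run_script is a hypothetical stand-in for a script that a user sends to the server.

```r
# Simulation only: an environment stands in for an Rserve session.
run_script <- function(session, value) {
  assign("X", value, envir = session)  # the script defines X in its session
  invisible(session)
}

# Shared session (the Windows behavior described above).
shared <- new.env()
run_script(shared, 1)  # user A sets X to 1
run_script(shared, 2)  # user B sets X to 2, overwriting A's value
shared_x_seen_by_a <- get("X", envir = shared)       # 2, not the 1 that A set

# Separate sessions (the Linux behavior described above).
session_a <- new.env()
session_b <- new.env()
run_script(session_a, 1)
run_script(session_b, 2)
separate_x_seen_by_a <- get("X", envir = session_a)  # still 1
```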

    For details on installing and configuring Rserve on Linux, see http://www.rforge.net/Rserve/doc.html#conf.

    Optimizing R scripts

    SCRIPT_ functions in Tableau are table calculation functions, so addressing and partitioning concepts apply. Tableau makes one call to Rserve per partition. Because connecting to Rserve involves some overhead, try to pass values as vectors rather than as individual values whenever possible. For example, if you set addressing to Cell (that is, you set Calculate the difference along in the Table Calculation dialog box to Cell), Tableau makes a separate call to Rserve for each row; depending on the size of the data, this can result in a very large number of individual Rserve calls. If you instead add a column that identifies each row to the level of detail, you can compute along that column so that Tableau passes those values in a single call.
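
    The effect on the R side can be sketched in plain R. In a SCRIPT_ function, Tableau supplies each argument to the R expression as a vector named .arg1, .arg2, and so on; in the sketch below, the parameter arg1 of the hypothetical function standardize plays the role of .arg1. Apart from the extra calls, a script that needs the whole partition (here, a mean and a standard deviation) cannot produce a useful result from one value at a time.

```r
# Stand-in for the R expression inside a SCRIPT_REAL call;
# `arg1` plays the role of Tableau's .arg1 vector.
standardize <- function(arg1) (arg1 - mean(arg1)) / sd(arg1)

# One call per partition: the whole column arrives as a single vector.
whole_partition <- standardize(c(1, 2, 3))   # -1 0 1

# Cell addressing: one call per row, each with a length-one vector.
# sd() of a single value is NA, so every result is NA.
per_cell <- vapply(c(1, 2, 3), standardize, numeric(1))
```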

    R security

    The following security issues should be kept in mind as you use R and Tableau together:

    • The data channel between Tableau and R is not encrypted.

    • Rserve does not use authentication by default. It does support plaintext authentication (see "Using a remote Rserve with password protection," above), which, if enabled, relies on a very simple mechanism: an unencrypted password list stored in a file.

    • R functions can contain code that compromises the security of the server where Rserve is running. For example, R code can:

      • Access file system (read/write)

      • Install new add-on/R packages which can contain binary code (for example, written in C)

      • Execute operating system commands

      • Open network connections and download files or open connections to other servers