2 Replies Latest reply on Sep 7, 2016 4:57 PM by Mary Solbrig

# Chi-Square P-Value Calculation with R

Hi all,

I've been trying to figure out how to use the Tableau/R integration to run Chi-Square tests. I've downloaded Bora Beran's workbook from this blog post.

I've also read a previous Tableau Community discussion in which Bora explains his code line by line.

So this is the calculated field, P_Value_ChiSquare:

SCRIPT_REAL(

'mm <- data.frame(commodity = .arg1);

d <- split(mm,rep(1:.arg3[1],each=.arg2[1]/.arg3[1]));

zt <- do.call(cbind, d); chisq.test(zt)$p.value'

,SUM([Number of Records]), SIZE(),[Priorities])

[Priorities] is calculated by WINDOW_COUNT(ATTR([Order Priority]))

I'm connected to Rserve. In the workbook example, this calculated field outputs a p-value of .4205. If I then replace P_Value_ChiSquare with itself (so that the calculation is run again through my own connection to Rserve), I get an error:

Error in chisq.test(zt) : 'x' must at least have 2 elements

Any ideas as to why this isn't working for me? I've made no changes to the calculated fields.

I discovered that it wasn't working for me in this workbook after trying to apply the process to my own data/workbooks.

• ###### 1. Re: Chi-Square P-Value Calculation with R

Mary Solbrig, any advice?

• ###### 2. Re: Chi-Square P-Value Calculation with R

My suspicion is that only one element is being passed to R at a time, so the test is being calculated on a single value instead of the whole vector of values. For instance, if you ran "chisq.test(c(1))" in R, you would get the same error message.
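A minimal sketch for reproducing that message in plain R, outside Tableau. The variable name `single_value` is just a stand-in for whatever Tableau would send if only one row reached R; it is not taken from the workbook.

```r
# Hypothetical stand-in for a partition that contains a single value.
single_value <- c(1)

# Capture the error message instead of letting it stop the script.
msg <- tryCatch(
  {
    chisq.test(single_value)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Compare this with the message Tableau reported.
print(msg)
```

If the printed text matches the error shown in Tableau, that supports the idea that the script is receiving one value per call.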

To verify whether this is the case:

1. Start Rserve using run.Rserve() (this keeps it in the current R session, so any print statements will appear in the RStudio console instead of being swallowed)

2. Replace the code with the following:

SCRIPT_REAL(

'

print("arg1 is ")

print(.arg1)

print("which has size ")

print(.arg2)

mm <- data.frame(commodity = .arg1);

d <- split(mm,rep(1:.arg3[1],each=.arg2[1]/.arg3[1]));

zt <-do.call(cbind, d);

print("Taking chisq.test of ")

print(zt)

chisq.test(zt)$p.value'

,SUM([Number of Records]), SIZE(),[Priorities])

That will give you a sense of what exactly is being passed to R.
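For comparison, here is a rough sketch of the same reshaping logic run directly in R, once with a full partition and once with a single row. All of the numbers are made up, and `reshape_counts`, `arg1`, `arg2` and `arg3` are hypothetical names standing in for the script body and for `.arg1`, `.arg2` and `.arg3`. The single-row case assumes that SIZE() and [Priorities] would both evaluate to 1 when the partition holds one row, which may not be exactly what your workbook sends.

```r
# Same steps as the calculated field, wrapped in a function for testing.
reshape_counts <- function(a1, a2, a3) {
  mm <- data.frame(commodity = a1)
  d <- split(mm, rep(1:a3[1], each = a2[1] / a3[1]))
  do.call(cbind, d)
}

# Full partition: 12 made-up counts, 3 hypothetical priorities.
arg1 <- c(12, 15, 9, 14, 11, 13, 10, 16, 12, 14, 13, 11)
arg2 <- rep(length(arg1), length(arg1))  # SIZE() is repeated on every row
arg3 <- rep(3, length(arg1))             # number of priorities (assumed)

zt <- reshape_counts(arg1, arg2, arg3)
print(dim(zt))                           # rows x columns of the table
p_full <- chisq.test(zt)$p.value
print(p_full)

# Single-row partition (assumed): one value, size 1, one priority.
zt_single <- reshape_counts(5, 1, 1)
msg_single <- tryCatch(
  {
    chisq.test(zt_single)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg_single)
```

If the print output from Rserve looks like the single-row case, the next thing to check is how the table calculation is being computed (its addressing and partitioning) in the view.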

Let me know what that says and I might be able to help further. A sample workbook, even just with some dummy data, would also be really helpful.