    Chi square reading first two lines of table - R Integration

    Rachel Factor



      I am trying to get a p-value from a chi-square test comparing the demographic makeup of two schools. My data is already aggregated: each row gives the number of students of one race in one school. It looks something like this:


      SchoolName    RaceIndicator    Number_of_Students
      SchoolA       Hispanic         100
      SchoolA       AfricanAmer      30
      SchoolA       White            200
      SchoolB       Hispanic         200
      SchoolB       AfricanAmer      25
      SchoolB       White            400



      I have used tips from other Tableau community threads on chi-square testing, but they are set up to count frequencies from one row per observation, whereas my rows would need to be weighted by Number_of_Students.
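      For reference, here is a minimal sketch of how aggregated rows like these could be weighted in plain R, outside Tableau. The data frame name `schools` is hypothetical and the values are simply the example table from this post; `xtabs` sums the count column within each school/race cell, so no row-level data is needed.

```r
# Aggregated example data from the post: one row per school/race combination.
# The data frame name `schools` is made up for this illustration.
schools <- data.frame(
  SchoolName = c("SchoolA", "SchoolA", "SchoolA", "SchoolB", "SchoolB", "SchoolB"),
  RaceIndicator = c("Hispanic", "AfricanAmer", "White", "Hispanic", "AfricanAmer", "White"),
  Number_of_Students = c(100, 30, 200, 200, 25, 400)
)

# xtabs() sums Number_of_Students within each cell, giving a 2 x 3 table
# of counts (schools in rows, race categories in columns).
counts <- xtabs(Number_of_Students ~ SchoolName + RaceIndicator, data = schools)

# Chi-square test of independence on the weighted table.
result <- chisq.test(counts)
p_weighted <- result$p.value

print(counts)
print(p_weighted)
```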


      Since I can produce the tables I want in Tableau, I thought an easy workaround would be to pass just the first two rows of the table to R.


      This is my script:


      a <- .arg1
      b <- .arg2

      data <- rbind(a,b)

      INDEX()=1, INDEX()=2)


      I also tried creating a "RowNumber" calculated field and passing that in as .arg1 and .arg2. Neither approach works: both consistently return "0.317".


      This is what my dashboard looks like currently:


      I know from manually putting this data into R that the p-value is wrong.


      Please let me know where I erred!

        • 1. Re: Chi square reading first two lines of table - R Integration
          Mary Solbrig

          I am confused about why you are using "INDEX()=1" and "INDEX()=2" as your arguments. With those arguments, the variables a and b in the code are vectors of true/false values, not student counts.
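          As a rough illustration of that point, the sketch below mimics in plain R what a and b would hold if the two comparisons were evaluated over six rows. The row count of six and the variable `idx` are assumptions made for the example; the actual partitioning in the workbook may differ.

```r
# Hypothetical stand-in for Tableau's INDEX() over six rows
# (the row count is an assumption, not taken from the workbook).
idx <- 1:6

# INDEX()=1 and INDEX()=2 are comparisons, so they yield logical vectors.
a <- idx == 1
b <- idx == 2

data <- rbind(a, b)

# The matrix holds only TRUE/FALSE flags; none of the student counts
# (100, 30, 200, ...) appear anywhere in it.
is_logical <- is.logical(data)
total <- sum(data)

print(data)
```

          Whatever test is run on such a matrix, it cannot reflect the Number_of_Students values, because they were never passed in.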


          From the question, my guess is that you would like a to be the vector of Number_of_Students for School A and b the corresponding vector for School B, so that with the example data a = c(100, 30, 200) and b = c(200, 25, 400). Is this correct?


          If so, then I would suggest creating two new fields, one returning the counts for School A and one returning the counts for School B, and passing those to R. I've attached a workbook that uses your example data and returns a p-value of 0.005762.
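          For comparison outside Tableau, the same calculation can be sketched directly in R, assuming a and b are the per-school count vectors described in this reply. This only checks the example numbers; it is not the contents of the attached workbook.

```r
# Per-school counts from the example data, in the order
# Hispanic, AfricanAmer, White.
a <- c(100, 30, 200)  # School A
b <- c(200, 25, 400)  # School B

# Stack the two vectors into a 2 x 3 contingency table, as in the
# original script's rbind(a, b).
data <- rbind(a, b)

# Chi-square test of independence (2 degrees of freedom for a 2 x 3 table).
test <- chisq.test(data)
p_value <- test$p.value

print(test$statistic)
print(p_value)
```

          With these inputs the p-value comes out at roughly 0.0058, in line with the value reported above.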


          If this isn't correct, could you provide an example of the code you are running in R for comparison?

