2 Replies Latest reply on Dec 14, 2018 4:51 PM by Adrian Gigi

    Replication of chisq.test function in Tableau with R

    Adrian Gigi

      @@@Hi there,

       

      1. Tableau version 2018.3

      2. Packaged workbook included (also some screenshots)

      3. "Finally, and you might have seen this comment elsewhere, forget Excel. Don't try to make Tableau behave like Excel. They are very different paradigms. If you want Excel behavior, use Excel!"

      - I know and you're definitely right, but I have to know if it's possible!

       

       

      I'm very new to Tableau and have limited to zero statistical knowledge. But I'm with my back against the wall and after a few days of hard work trying to figure this out, so I desperately need help.

      What I have to do is to replicate an Excel chisq.test formula in Tableau. I read in this forum the solutions proposed by @Bora Beran and @Scott Tennican and I looked over their examples (and others), but can't adapt them to my needs.

      I need a formula for the p-value field in the attached workbook. Right now in there is an R script which doesn't work (although testing it in RGui with raw numbers gives the same results as Excel, this is why I took this path), but I'm happy with any other solution. Also, I don't need the intermediate fields (obs_yes, obs_no, exp_yes, exp_no), I just tried to have as many check points as possible (and they match with the Excel values, but I have a felling that it has to be done with a different approach).

      This is the existing R script, which returns NaN for all rows:

       

      SCRIPT_REAL("

      obs_yes = .arg1;

      obs_no = .arg2;

      exp_yes = .arg3;

      exp_no = .arg4;

       

       

      if ( !is.na(obs_yes) && !is.na(obs_no) && !is.na(exp_yes) && !is.na(exp_no)) {

          ChiSq = chisq.test(c(obs_yes,obs_no), p=c(exp_yes,exp_no), rescale.p=TRUE);

          ChiSq$p.value;

      } else {

          {NaN};

      }

      "

      ,[obs_yes],[obs_no],[exp_yes],[exp_no])

       

       

      Any help / encouragement / good word is welcomed, as I said I'm really down now...

       

      Thank you for your time!

       

      Adrian

        • 1. Re: Replication of Excel chisq.test function in Tableau with R
          Adrian Gigi

          Hi,

           

          I'm back with other tests.

           

          These are the Excel values (I left only two rows):

          p-value formula in Excel is: CHISQ.TEST(F4:G4,H4:I4)

           

          and this is how Tableau looks:

           

          p-value formula in Tableau is:

          What I see in R debug for df1 and df2 looks OK to me:

          but the p-value is not correct - looks like it is calculated over all values?

           

          If I pass in R script only one row, then the p-value is correct:

           

          but this doesn't help, as it is a static calculation.

           

          Any ideas?

           

          Thank you,

          Adrian

          • 2. Re: Replication of Excel chisq.test function in Tableau with R
            Adrian Gigi

            I solved the problem, inspired this thread.

            The issue was that the R calculation should "spit" a vector of values instead of a single value.

            Here is the solution, just in case anyone needed it:

             

            SCRIPT_REAL("

            obs_yes = .arg1;

            obs_no = .arg2;

            exp_yes = .arg3;

            exp_no = .arg4;

             

            if ( !is.na(obs_yes) && !is.na(obs_no) && !is.na(exp_yes) && !is.na(exp_no)) {

                out <- rep(NA,length(obs_yes));

                for(i in 1:length(obs_yes)) {

                    out[i] <- chisq.test(c(obs_yes[i],obs_no[i]), p=c(exp_yes[i],exp_no[i]), rescale.p=TRUE)$p.value;

                }

                out

            } else {

                {NaN};

            }

            "

            ,[obs_yes],[obs_no],[exp_yes],[exp_no])