# Predicting yes/no result from multiple binary dimensions (plus some measures)

I know this is more of a statistical question, but ultimately I'd like to use Tableau (and R) to create the model, and besides, I like it here.

Let's say I have a house cleaning business. I have thousands of open contracts to fill with different details, some of which are binary:

Office space vs home

Weekend or weekday work

Day/Evening work

Pets/no pets

and so on.

Plus some variables like:

Size of house/office

Price per hour of work

Depending on these details, cleaners may or may not choose to take the contract.

Given that many of these variables are somewhat dependent on each other (offices are likely to have higher square footage), what are the best statistical methods to determine how much influence each one has on whether a job is taken or not? Please don't get hung up on my choice of example. Pretend it's pretty common to have pets in an office.

###### 1. Re: Predicting yes/no result from multiple binary dimensions (plus some measures)

Hi Alex,

I'm not certain it's valid here, but you could implement either a Student's t-test to calculate the z-score, or a Bayesian method to test the probability of a positive or negative outcome of a hypothesis. Both can be built in Tableau alone (no R or Python needed), and I have implemented both, with a little help from my data-science team.

I'll ask the data-science team on Monday for their ideas on this. The only caveat (which I know you are more than capable of handling) is that this is more "old(er)-school" Tableau: Lookups and Windowing.

I should add that, despite our huge Tableau deployment and multiple R and Python servers, the Tableau admin team has been slow to spin up either of these for use from Tableau, so I had to rebuild these simple calculations entirely in Tableau (in v8.1 at the time of development).

Steve

###### 2. Re: Predicting yes/no result from multiple binary dimensions (plus some measures)

Thanks Steve, I'm interested in what the data team have to say. Having done some research, I'm trying logistic regression via the XLSTAT plugin for Excel and looking at the chi-squared statistics and p-values. It seems to be getting somewhere.
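For anyone following along, here is a rough sketch of what the logistic regression approach does, written in plain Python purely for illustration (in R the usual route would be `glm` with `family = binomial`). Everything in it is an assumption made up for the example: the feature names `is_office`, `has_pets` and `price_z`, the "true" weights, and the simulated contracts. It is not the XLSTAT procedure, just a minimal gradient-descent fit showing how each fitted coefficient turns into an odds ratio for "job taken".

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, lr=1.0, iterations=800):
    """Fit logistic regression weights by batch gradient descent on the mean log-loss."""
    n, k = len(rows), len(rows[0])
    weights = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
            for j in range(k):
                grad[j] += err * x[j]
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# Simulated contracts (hypothetical data, NOT real results).
random.seed(0)
NAMES = ["intercept", "is_office", "has_pets", "price_z"]
TRUE_WEIGHTS = [0.2, 0.8, -1.2, 1.0]  # assumed effects used only to generate data

rows, labels = [], []
for _ in range(1000):
    is_office = random.randint(0, 1)
    has_pets = random.randint(0, 1)
    # Standardised price, deliberately correlated with is_office.
    price_z = random.gauss(0.8 * is_office, 1.0)
    x = [1.0, float(is_office), float(has_pets), price_z]
    p_taken = sigmoid(sum(w * xi for w, xi in zip(TRUE_WEIGHTS, x)))
    rows.append(x)
    labels.append(1 if random.random() < p_taken else 0)

weights = fit_logistic(rows, labels)
# exp(coefficient) is the multiplicative change in the odds of "taken".
odds_ratios = {name: math.exp(w) for name, w in zip(NAMES, weights)}

for name, w in zip(NAMES, weights):
    print(f"{name:>10}: coef={w:+.2f}  odds ratio={odds_ratios[name]:.2f}")
```

An odds ratio above 1 means the feature makes a contract more likely to be taken with the other features held fixed, and below 1 means less likely. Because all features are fitted together, moderately correlated inputs (such as office and price here) still get separate estimates, although strong correlation makes those estimates less stable.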

###### 3. Re: Predicting yes/no result from multiple binary dimensions (plus some measures)

I actually didn't think of this. I too am using chi-square with p-values to measure a different part of the same tools, although the results need to go through the same z-score plug to get the final % probability.
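As a small illustration of the chi-square check, here is a sketch of a Pearson chi-squared test of independence on a single 2x2 table, for example pets/no pets against taken/not taken. The function name `chi_square_2x2` and the counts are hypothetical, and no continuity correction is applied. With one degree of freedom the p-value can be written with the standard library's `math.erfc`.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared test of independence for the table [[a, b], [c, d]].

    No continuity correction is applied. Returns (chi2, p_value).
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows = pets / no pets, columns = taken / not taken.
chi2, p_value = chi_square_2x2(120, 180, 200, 100)
print(f"chi2={chi2:.2f}, p={p_value:.3g}")  # chi2 is about 42.86
```

A test like this looks at one variable at a time, so it cannot separate the effect of pets from anything correlated with pets. That is the gap the regression approach fills.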

Having discussed this with some of the data scientists, they too have suggested logistic regression, so it looks like you are on the right lines there.

Steve