7 Replies Latest reply on Jun 26, 2019 2:24 PM by Rapinder Jawanda

    Feature Selection in Tableau Prep

    Zachary Sanchez

      The data I am currently trying to clean for model predictions requires me to do a certain type of feature selection.

      What I am trying to do is this:

       

      Let's say I have one column in my data called HasDetections. It's a simple Boolean value: 1 is true, 0 is false.

      And let's say I have another column called ProductName. This one holds string values, either Windowsdefender or esm.

       

      So it might look like this (the real data is much bigger; this is just a small example):

      ProductName         HasDetections
      Windowsdefender     0
      Windowsdefender     1
      Windowsdefender     0
      Windowsdefender     0
      esm                 1
      Windowsdefender     1
      Windowsdefender     1
      esm                 0
      Windowsdefender     1
      Windowsdefender     1

       

       

      Now, in Python I could separate these string values (after I have converted them into ints) while maintaining the integrity of the entire HasDetections column. Whereas if I were to drop all the esm rows in Tableau Prep, it drops all the corresponding HasDetections values too.

      Python avoids this by splitting ProductName into 2 new columns, one for esm and the other for Windowsdefender.

       

      So instead of this:

      ProductName               HasDetections

       

      It would look like this (ProductName0 = esm, ProductName1 = Windowsdefender):

      ProductName0     ProductName1     HasDetections
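
      As a rough illustration of that split (usually called one-hot encoding), here is a minimal sketch in plain Python. The helper name one_hot and the four sample rows are made up for this example, and a library routine would normally do this job; the point is only that every row and every HasDetections value survives.

```python
# Hypothetical sample rows mirroring the ProductName / HasDetections example.
rows = [
    {"ProductName": "Windowsdefender", "HasDetections": 0},
    {"ProductName": "Windowsdefender", "HasDetections": 1},
    {"ProductName": "esm", "HasDetections": 1},
    {"ProductName": "esm", "HasDetections": 0},
]


def one_hot(rows, column):
    """Replace `column` with one 0/1 indicator column per distinct value."""
    # Sort case-insensitively so esm gets index 0 and Windowsdefender index 1.
    categories = sorted({row[column] for row in rows}, key=str.lower)
    encoded = []
    for row in rows:
        new_row = {
            f"{column}{i}": int(row[column] == category)
            for i, category in enumerate(categories)
        }
        # Copy every other column unchanged, so no rows or labels are dropped.
        new_row.update({k: v for k, v in row.items() if k != column})
        encoded.append(new_row)
    return encoded


encoded = one_hot(rows, "ProductName")
for row in encoded:
    print(row)
```

      Each output row carries ProductName0, ProductName1 and the original HasDetections, so the row count is the same as in the input.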

       

      The other thing Python does is replace any empty cells in these new columns with the mean or, sometimes, the most common value. This allows me to maintain the integrity of the data.
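
      The fill-in step could look roughly like the sketch below, assuming missing entries are represented as None. The function name fill_missing and the sample column are hypothetical, and only the standard library's statistics module is used.

```python
from statistics import mean, mode


def fill_missing(values, strategy="mean"):
    """Return a copy of `values` with None replaced by the mean or the mode."""
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to compute a fill value from; return the column unchanged.
        return list(values)
    fill = mean(present) if strategy == "mean" else mode(present)
    return [fill if v is None else v for v in values]


# Hypothetical numeric column with gaps (None marks a missing entry).
column = [1, None, 0, 1, None, 1]

filled_mean = fill_missing(column, "mean")  # gaps become the average
filled_mode = fill_missing(column, "mode")  # gaps become the most common value
print(filled_mean)
print(filled_mode)
```

      Which strategy fits depends on the column: the mean suits continuous values, while the most common value keeps a 0/1 column strictly binary.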

       

      My question is: can this be done in Tableau Prep, or is it not that far along yet?