2 of 2 people found this helpful
My thought would be to do the cleaning like you said above. Specifically, I would
- Remove Letters
- Remove Punctuation
- Trim Spaces
Now, like you said, you lose "Don't Know", but that's the one value that no longer has a value:
so, you can just click that and type "don't know"
That's 4 quick changes per field, so not too much time.
And hopefully you could set it up once and then use one of the methods described here (https://vizpainter.com/new-input-files-in-tableau-prep/ ) when you get a new file each month.
The above is the basic approach, but I would probably do a couple of slight variations to make it easier to work with the resulting data in Tableau:
- I would pivot all the questions that have the Likert Scale response:
That way I can clean all of them at once plus I think the data will be mostly easier to work with afterward.
- I would do the first 3 clean steps as described above, then I would have blanks (which are "don't know") NULL values which are ("didn't answer") and numbers for the rest.
- You could rename them as above, but I would be more inclined to leave them as is and change the data type of the field to Number (whole). The reason is that now I have numbers where they apply and NULLs where the answer was either "don't know" or not answered - and that can be very helpful if I want to do things like show the Average response (because NULLs won't impact the average calculation)
You can see how easy it is (just drag and drop two fields) in Tableau to start doing analysis:
The only caution after pivoting (which definitely makes the cleanup of a bunch of fields a lot easier and makes it easier to do analysis) is that you'll end up with more rows than you started with, so be careful in using any row counts in your analysis (i.e.it's not one row per respondent anymore, it's one row per question)
I've included a packaged flow (.tflx) for you to see what I did.
Hope that helps!
Thank you! That definitely helps with the formatting. Prior to Prep, I was pivoting in Tableau but was only using a few fields. This will be much easier.
Also I apologize but noticed in your response that I included the wrong data set. I saved the new file name but forgot to re-save after my edits to only display the 5 columns I wanted to share. I'd prefer this data is not available to the community at large. I edited the file I initially attached in this question. The screen shots are fine, but if you could remove the packaged flow, I would appreciate it - better than having to try and get the post deleted.
I've removed the packaged flow as requested and have deleted the file from my machine. Please double check the screenshots above to make sure there's nothing there that shouldn't be - if needed I'll adjust them.
The screen shots are fine as is - I think they will help others better understand the inquiry and they don't display any sensitive customer information.