1 Reply Latest reply on May 13, 2017 6:59 AM by Manoj Kumar Billa

    Formatting Data with categorical variables & value ranges (in text format)

    Manoj Kumar Billa

      Hi guys,


      I am analyzing the Stackoverflow dataset and it has a lot of columns which have categorical values (See these columns starting with important_* where all the values are categorical with a specific rank).


      I have 2 questions:

      1. How can I "Rank" the categorical variables? For example, I want to create a spider chart with the "Very important" values on the outer rings and "not important" rows in the inner ring. So how to rank the variables to do this?



      2. In the below screenshot, there are some columns like team_size_range which are identified as text. Is there any way by which I can tell Tableau that these are value ranges?


      Thank you

        • 1. Re: Formatting Data with categorical variables & value ranges (in text format)
          Justin Larson

          For starters, spider charts are not a basic chart type Tableau supports, as they are not on a cartesian plane. I'm sure you could hack it with a bunch of fancy math to project your values onto a Radial projection, but it would be tricky. I'm sure there are lots of blogs out there walking through the complexities. This came up right off with a google search: Use radar charts to compare dimensions over several metrics | Tableau Software


          As for how to turn into metrics that can be visualized. I would just take each of those columns and turn them into numeric representations with a formula such as


          Case [Important_control]

          when 'This is very important' then 3

          'This is somewhat important' then 2

          'I don't care about this.' then 1



          Doing so essentially creates a scorecard for each response that is easier to represent mathmatically, take averages, etc.


          As for Range Values, there is not a datatype for ranges, per se, the different combinations will be treated as discrete values. You can manually specify ordinality of the ranges in the built-in Sort dialog box, or you can use a formula just like above (Case statement) to assign each range bucket a sort order value, and drop that into the view.


          Alternatively, all of your metrics may simply be percentages of respondents in each question's set of answers, in which case, really all you need to do is count records and drop different dimensions on the view.


          You may want to review some of the resources here: Visualizing Survey Data – Data Revelations


          Lastly, I find that for survey data, the visualizations are much easier to deal with when I [un]pivot the data prior to loading into Tableau. That can be done a number of ways, depending on your data source.


          These are the two I almost always use:

          Unpivot columns (Power Query) - Excel      --good for practically any source. limited container size for big data sets, however

          Using PIVOT and UNPIVOT                --this article is specific to T-SQL (Microsoft SQL Server) but it's a common SQL operator, supported by a number of RDBMS

          2 of 2 people found this helpful