5 Replies Latest reply on Nov 6, 2010 3:22 PM by Richard Leeke

    Advanced data reshaping question

    Patrick Mcmahon

      For those that have used the data reshape macro in Excel, I have an advanced question. 

       

      First off, it is awesome...  Nuff said.

       

      Secondly, has anyone ever done any updates to deal with multi-value cells?

       

      For example, I may have a cell that has multiple values separated by a Line Feed.

       

      Today, the pivot macros correctly moves it to a new line, however it keeps both values in the reshaped cell.  I'm assuming there is no logic that enables it to separate multi-value fields.

       

      How are others handling that?  or has anyone created a super enhanced macros that incorporates that functionality?

       

      Thanks!!!!

        • 1. Re: Advanced data reshaping question
          Joe Mako

          I am not sure there is a standard logic for how to handle "a cell that has multiple values separated by a Line Feed."

           

          What does your data look like before, and how do you need it after reshaping (can you provide a sample Excel file in a similar format)?

           

          If it is something you have to do on a regular basis, I would recommend writing a macro that prompts for a range of cells, and separates the values from many in one cell to one value per cell.

          • 2. Re: Advanced data reshaping question
            Patrick Mcmahon

            Joe - thanks.  Here is a spreadsheet with a sample piece of data (generalized and without as many columns, but imagine that for columns K,L & M, there would be dozens that are similar.

             

            The Base Data tab is the output from the tool  (I wish it was easily changeable, but it is not, as I don't have control over the source export functions).

             

            The Base Data-Tableau As Is is how the Reshape Data function works its magic... It is really, really close..

             

            The Base Data-Tableau To Be is how I would like it to be.

             

            The reason is because the type of analysis I am after. For example:

             

            For estimating technique used, how many of which types were used?  This can be accomplished by a count distinct by Record ID.  (It is ok that there is more than one value for  these fields...)  Same type of question for Method Used and PM Tool used, etc.

             

            Over time, the list of values for a particular question can change, therefore it is not easy to build some sort of logic to extract for possible responses, though that is one option...

             

            Net-net:  The answers to those questions are multi-select fields, where more than one answer is acceptable...  BUT, I need to be able to count how many of each unique response was provided ... (versus, how many of what combination of responses were provided...) :-)

             

            Yes, I do this quite often.. and I wish I could change the source to at least give me separate records for each value... but I can not...

             

            Patrick

            • 3. Re: Advanced data reshaping question
              Patrick Mcmahon

              bump to see if anyone has any thoughts..! :-)

              • 4. Re: Advanced data reshaping question
                Richard Leeke

                I'm sure writing an Excel macro to do what you want would be easy for someone who knows VB.

                 

                The easiest way to do it for me would be to export the results of Tableau's unpivot macro to a CSV file and then write a little perl script to expand it out.  You could then either connect to the CSV as your data source or import it back into Excel.

                 

                That's just easier for me because I know perl much better than I know VB.

                • 5. Re: Advanced data reshaping question
                  Richard Leeke

                  Here's a little bit of perl that does what you want.  It assumes that the only fields Excel has to wrap in quotes are the mult-line ones.  If you have fields with embedded commas or possibly a few other conditions which cause Excel to generate more quoted fields it might need a bit of tweaking.

                   

                  1) You will need download and install a copy of perl (it's free from ActiveState: http://www.activestate.com/activeperl/downloads).

                   

                   

                  2) Export the results of the Tableau Macro as a csv file.

                   

                   

                  3) Run the perl script from a command window, giving it the name of the CSV file and redirecting output to a new file:

                   

                  split_repeating_fields.pl Multi.csv > Split.csv

                   

                   

                  4) Either connect directly to the new CSV file or import it back into Excel.