5 Replies Latest reply on Feb 26, 2019 1:25 PM by Jana STUPAVSKY

    Restructure multiple hurricane datasets with a header line (NOAA's HURDAT2 dataset)

    Jana STUPAVSKY

      I am looking for help with restructuring Atlantic hurricane data since 1850. I created a similar post in forums, but it may have indicated that I am looking for help with plotting hurricane paths, rather than needing help with the dataset. I think I can figure out the hurricane paths, but this data prep is a bigger struggle for me than I anticipated.

       

      Each storm since 1850 is like a separate dataset. See image below. There are two types of rows:

       

      1) a header line for each storm, which, if you follow my highlighted sections from left, includes: storm number for that year, storm name and number of storm track entries (39 in the example). That number tells you the total number of rows  below the header line.

      2) multiple data lines under each header line - it's the data entries for that particular storm that will help me plot the storm line - lat/long, times of the day for each track entry, date for each storm track entry, etc.

       

      I did some data prep and cleanup but I got stuck because I can't figure out the way to bring data from the header line into the rows under the header line. If possible, I need to somehow pivot selected sections of each header line in to each storm dataset. Ideally, I would need to create these columns from the header line:

      - Storm name - Irene in this case. I need "Irene" in each of the 39 rows below the header line.

      - Storm number for that year - 09 in this case. I need 09 in each of the 39 rows below the header line.

      - Storm row number - I need the rows to be numbered from 1 to 39 in this case.

       

      Can someone help me figure it out? I attach my packaged flow from Prep and the original data file.

       

        • 1. Re: Restructure multiple hurricane datasets with a header line (NOAA's HURDAT2 dataset)
          Jana STUPAVSKY

          Jim Dehner - sorry to bother, but do you know of other ways to seek help on this? I am getting nowhere with NOAA. I think I am pretty clear describing what I need above, I wonder if really there is none out there who could crack this.

          • 2. Re: Restructure multiple hurricane datasets with a header line (NOAA's HURDAT2 dataset)
            Joshua Milligan

            Hi Jana,

             

            This is one of those cases that can't fully be solved with Tableau Prep (at least, not yet!)  When something like this idea is implemented, Tableau Prep will be able to fully handle these types of problems (feel free to vote it up!) https://community.tableau.com/ideas/8732. The basic need is to be able to work across rows like you can with table calculations in Tableau Desktop.

             

            In the meantime, you can do just a bit more manipulation in Excel to get it to a state that makes it doable in Tableau Prep.

             

            Here was my approach:

             

             

            I added two columns, the Key and Row Type to carry information down the table.  The Row Type is calculated in Excel as:

             

            =IF(ISNUMBER(D3),"Detail","Header")

             

            Basically, it's looking at the D Column which is either a name or time (string or number) and using that to determine if the row is a header or detail.  That's going to be useful in Tableau Prep, but the Key column is the ... key (sorry).

             

            This uses the same basic logic but based on whether it's detail or header it either gets the storm ID or copies it from the previous result (I cheated by setting A2 to the value, but from that point on it works).  The calculation is:

             

            =IF(ISNUMBER(D3), A2, C3)

             

            Now you have a dataset that Tableau Prep can handle because you have an identification for each record. The basic idea is to split out the header records and detail records, clean them up a bit, then join them back together so you get all the header information for each detail record. (I've attached the flow so you can dig into it a bit more)

             

             

            Assigning a row number is another challenge in Prep.  The idea I mentioned would allow you to do this, as would this one: https://community.tableau.com/ideas/8714

             

            Since you have the Date & Time (which I calculated from the date and time fields), you do have an order.  Likely, that field alone would be enough in Tableau Desktop to allow you to do anything you wanted (e.g. the Pages shelf) or you could use Rank() or Index() in Tableau Desktop to get the specific order.

             

            If you truly do need a row order in the data set, then check out this post for some ideas on how you can get it:  https://vizpainter.com/sorting-rank-ranking-in-tableau-prep/

             

            Hope that helps!

            Joshua

            • 3. Re: Restructure multiple hurricane datasets with a header line (NOAA's HURDAT2 dataset)
              Jana STUPAVSKY

              Joshua - thanks for leading me in the right direction. I don't mind doing a bit of modification in Excel, if that's the only route for now.

               

              I really like this approach and I think we are close, but, I was unable to replicate the Key column. See attached. It seems like the formula stops working for me as soon as it hits the next set of storm data.

               

              As far as numbering rows, the only reason I need something like that is to have a unique ID for each recording of a storm position. But I think the Key column can solve that, if I can get it right in Excel. You would think that Date and time alone would be sufficient as a Unique ID, but as I found out, sometimes there are more than one storm that are active (duuh) and they check and record their position at the same time and date. So - either a Name or a storm number (ideally both) needs to be brought in as an additional component of a Unique ID. 

               

              Can you help me get that Key column right and I will try to take it from there, please? What am I doing wrong there?

              • 4. Re: Restructure multiple hurricane datasets with a header line (NOAA's HURDAT2 dataset)
                Joshua Milligan

                Hi Jana,


                Sorry about that, I had a slightly incorrect formula in the response above.

                 

                The correct formula should give the value C3 instead of D3 for the key.

                 

                Notice that A2 is just the value from C2 copied.  The calculation starts at A3.  I've fixed the post above too.

                 

                 

                Hope that helps!

                • 5. Re: Restructure multiple hurricane datasets with a header line (NOAA's HURDAT2 dataset)
                  Jana STUPAVSKY

                  Josh - brilliant! I made a few tweaks to the flow as I needed some other info in addition to what you proposed, but I really appreciate your help.