11 Replies Latest reply on Jun 28, 2011 1:43 PM by John Mogielnicki

    Showing paths when start/end are in one row

    Andy Cotgreave

      I have some data for cycle hire journeys in London. The Start and End locations are on the same row. I want to draw a path between the start and end. I've had a look at two threads that seemed to provide the answer: Lanes and Flight data

       

      However, they both appear to require a separate row for each start and end location. Is there any way to do this when the start and end locations are in the same row?

       

      I have the following fields in my data (a sample is attached):

      Journey Id

      Start Station Id

      Start Station

      Latitude (start)

      Longitude (start)

      End Station Id

      End Station

      Latitude (end)

      Longitude (end)

       

      Reshaping the data outside of Tableau is not practical.

       

      Cheers

      Andy

      Follow me on twitter.

      Read my Tableau blog at thedatastudio.co.uk

        • 1. Re: Showing paths when start/end are in one row
          Richard Leeke

          > Reshaping the data outside of Tableau is not practical.

           

          So reshape it on the way in.

           

          I can't see how you can do it without two rows per journey.  But it just needs a simple custom SQL connection, as per the attached.

           

          I tried to think of a smart way to highlight the journeys that ended back at the start station (which just show as a dot), but couldn't really think of any.  In practice I suppose you may have journeys that start but never end (i.e. in progress or stolen bikes), so they would appear as small dots with the way I have it set up, whereas start and end at the same place would show as large dots.  Journeys that end but never start would surely be a data issue, I think.

           

          I experimented with different ways of showing journey direction, the size shelf seemed the best bet (the other options I tried were start and end labels and Station Id instead of Journey Id on colour).  The only trouble with the size shelf is that the labels for journeys that start and end at the same station make you think you've gone cross-eyed.

           

          Even with just your sample data it gets very busy with all stations labelled, so I have it set to label on highlight.

           

          Cool data.

          • 2. Re: Showing paths when start/end are in one row
            Andy Cotgreave

            Richard

            Thanks for the reply. Of course, I forgot the first rule in providing sample data - making it the same format as the real data. My real data is actually 2 CSV files. One has the journey data, and the other has the Station Latitude/Longitudes. I use data blending to match them up.

             

            I reworked your connection custom SQL to fit my data. That is fine. I had to manually create Lat/Long calculated fields, in the same way Joe did in the Lanes thread.

             

            However - one more problem. In my sample data, and Joe's example, everything's in one connection. In my real data, to make valid Lat/Long calculations, I need to use aggregate functions (cos that's Tableau's rule). So, my calc field looks like:

             

            IIF(ATTR([Type])="Start", AVG([Station locations (start)].[Latitude (start)]),AVG([Station locations (end)].[Latitude (end)]))


             

            That makes Tableau happy, but not me.... I don't want the aggregates. Harumph. Maybe I do have to reshape the data?

            • 3. Re: Showing paths when start/end are in one row
              Richard Leeke

              I'm not sure I understand what your problem is.  Actually, I am sure I don't understand what your problem is.  ;-)

               

              So you have 2 CSV files, presumably structured something like this:

               

              Journey

              Journey Id

              Start Station Id

              End Station Id

               

              Station

              Station Id

              Station

              Latitude

              Longitude

               

              Or maybe the start and end station names are in the first file?

               

              But that doesn't seem to agree with your calculated field.  You reference table names of [Station locations (start)] and [Station locations (end)] - or is that just a typo in writing the explanation?

               

              Anyway, how it sounds as if your data would make sense to be structured ;-) ought to work fine either with custom SQL to split Journey into 2 rows (and join both start and end to Station) or with data blending on the two Station Id fields, as it sounds as if you've got.

               

              What is the issue with the aggregates?  Blending like that should just be picking up a single Station, so it's a single row.  You could use ATTR() - so at least it will make it obvious if for some reason you are blending multiple values.

              • 4. Re: Showing paths when start/end are in one row
                Joe Mako

                It looks like this was used for:

                http://www.thedatastudio.co.uk/blog/the-data-studio-blog/andy-cotgreave/boris-bikes-1-million

                 

                Can you link the the two CSV files that you mentions in the previous comment?

                • 5. Re: Showing paths when start/end are in one row
                  Alex Kerin

                  While I understand the need for two rows per journey, or custom SQL, could you have avoided linking the two tables by using geocoding (not on Tab Public, I know)

                  • 6. Re: Showing paths when start/end are in one row
                    John Mogielnicki

                    I am also interested in doing something similar. I have retailer zip codes and customer zip codes. I want to be able to map the movement of appliances between these two areas, ideally without having to plot any latitude and longitude info (instead using the built in latitude/longitude information that Tableau contains for zip codes)

                     

                    Each line has a field for:

                    Appliance

                    Retailer Zip Code

                    Customer Zip Code

                     

                    I want to be able to to see a line, the thickness of which represents the aggregate number of units, which travels from each retailer zip code to each customer zip code.

                     

                    Is this possible?

                    • 7. Re: Showing paths when start/end are in one row
                      Joe Mako

                      So if your source data looked like:

                       

                       

                      Appliance    Retailer Zip Code    Customer Zip Code
                      
                       A    32046    31733
                      B    32046    31811
                      C    32046    32003
                      
                      


                       

                      You would want to reshape it so it is more like:

                       

                       

                      Appliance    Type    Zip
                      
                       A    Retailer    32046
                      A    Customer    31733
                      B    Retailer    32046
                      B    Customer    31811
                      C    Retailer    32046
                      C    Customer    32003
                      


                       

                      (you may also need to do additional things to ensure that rows are paired together)

                       

                      For some more details see: http://www.tableausoftware.com/support/knowledge-base/using-path-shelf-pattern-analysis

                       

                      You can do this transformation with either an ETL process or some custom SQL.

                       

                      If you have issues or questions, please post a sample data set that represents your situation.

                      • 8. Re: Showing paths when start/end are in one row
                        John Mogielnicki

                        Joe, thank you for your reply.  I have been having trouble getting my paths to work correctly.  I went to the link you provided and tried to format the data in the same way, but for some reason it is not turning out.  I have attached the excel spreadsheet with my data and a packaged workbook.  I would really appreciate any guidance as to why this is not working.

                         

                        Thanks!

                        • 9. Re: Showing paths when start/end are in one row
                          Richard Leeke

                          I always struggle with the path shelf, too.

                           

                          It's working in the attached, by having wrapped [Zip Code (for path)] in an an aggregate function - MIN() in this case.  That actually doesn't change the level of detail because it's already partitioned at the level of the individual line ends - but in some way that prevents Tableau from seeing it as a further level of partitioning which prevents the use of the path shelf.

                           

                          Maybe Joe or one of the Tableau folk can explain that better - Joe definitely has his head around this much better than I do - he's tried explaining it to me repeatedly.  Maybe one more time and it will sink in.  ;-)

                          • 10. Re: Showing paths when start/end are in one row
                            Joe Mako

                            The easy way to think of the Level of Detail shelf is it will create more marks, a mark for each distinct dimension value combination. So with the Line mark type, placing a dimension field, blue non-aggregated pill, on the Level of Detail shelf will start a new line for each distinct value because the line is the mark. So if you place your Zip code field there non-aggregated, you will get a bunch of dots because you are telling Tableau to draw a new line for each zip code value.

                             

                            What Richard did was turn it into a discreet measure, aggregating it with the MIN function (it is very interesting to note that the ATTR() function will not allow Tableau to generate the lat/lon coordinates), so the value is available to Tableau for use in geocoding the zip codes, but will not cause Tableau to start a new line for each distinct zip code value. How Tableau knows to get the zip code is your Path shelf has distinct values, but the Path shelf does not cause the lines to restart because Tableau is using it to create distinct dot and connect those dots in order based on the sort order of the pill on the Path shelf.

                             

                            Hopefully that made sense.

                            • 11. Re: Showing paths when start/end are in one row
                              John Mogielnicki

                              Thanks Richard and Joe!  Joe, I really appreciate your detailed response.  Although it doesn't make complete sense to me, it's good to know that someone understands it.  I'm content with the fact that the trick fixed the problem.