> Reshaping the data outside of Tableau is not practical.
So reshape it on the way in.
I can't see how you can do it without two rows per journey. But it just needs a simple custom SQL connection, as per the attached.
I tried to think of a smart way to highlight the journeys that ended back at the start station (which just show as a dot), but couldn't really think of any. In practice I suppose you may have journeys that start but never end (i.e. in progress or stolen bikes), so they would appear as small dots with the way I have it set up, whereas start and end at the same place would show as large dots. Journeys that end but never start would surely be a data issue, I think.
I experimented with different ways of showing journey direction, the size shelf seemed the best bet (the other options I tried were start and end labels and Station Id instead of Journey Id on colour). The only trouble with the size shelf is that the labels for journeys that start and end at the same station make you think you've gone cross-eyed.
Even with just your sample data it gets very busy with all stations labelled, so I have it set to label on highlight.
CycleHire.twbx 49.0 KB
Thanks for the reply. Of course, I forgot the first rule in providing sample data - making it the same format as the real data. My real data is actually 2 CSV files. One has the journey data, and the other has the Station Latitude/Longitudes. I use data blending to match them up.
I reworked your connection custom SQL to fit my data. That is fine. I had to manually create Lat/Long calculated fields, in the same way Joe did in the Lanes thread.
However - one more problem. In my sample data, and Joe's example, everything's in one connection. In my real data, to make valid Lat/Long calculations, I need to use aggregate functions (cos that's Tableau's rule). So, my calc field looks like:
IIF(ATTR([Type])="Start", AVG([Station locations (start)].[Latitude (start)]),AVG([Station locations (end)].[Latitude (end)]))
That makes Tableau happy, but not me.... I don't want the aggregates. Harumph. Maybe I do have to reshape the data?
I'm not sure I understand what your problem is. Actually, I am sure I don't understand what your problem is. ;-)
So you have 2 CSV files, presumably structured something like this:
Start Station Id
End Station Id
Or maybe the start and end station names are in the first file?
But that doesn't seem to agree with your calculated field. You reference table names of [Station locations (start)] and [Station locations (end)] - or is that just a typo in writing the explanation?
Anyway, how it sounds as if your data would make sense to be structured ;-) ought to work fine either with custom SQL to split Journey into 2 rows (and join both start and end to Station) or with data blending on the two Station Id fields, as it sounds as if you've got.
What is the issue with the aggregates? Blending like that should just be picking up a single Station, so it's a single row. You could use ATTR() - so at least it will make it obvious if for some reason you are blending multiple values.
It looks like this was used for:
Can you link the the two CSV files that you mentions in the previous comment?
While I understand the need for two rows per journey, or custom SQL, could you have avoided linking the two tables by using geocoding (not on Tab Public, I know)
I am also interested in doing something similar. I have retailer zip codes and customer zip codes. I want to be able to map the movement of appliances between these two areas, ideally without having to plot any latitude and longitude info (instead using the built in latitude/longitude information that Tableau contains for zip codes)
Each line has a field for:
Retailer Zip Code
Customer Zip Code
I want to be able to to see a line, the thickness of which represents the aggregate number of units, which travels from each retailer zip code to each customer zip code.
Is this possible?
So if your source data looked like:
Appliance Retailer Zip Code Customer Zip Code A 32046 31733 B 32046 31811 C 32046 32003
You would want to reshape it so it is more like:
Appliance Type Zip A Retailer 32046 A Customer 31733 B Retailer 32046 B Customer 31811 C Retailer 32046 C Customer 32003
(you may also need to do additional things to ensure that rows are paired together)
For some more details see: http://www.tableausoftware.com/support/knowledge-base/using-path-shelf-pattern-analysis
You can do this transformation with either an ETL process or some custom SQL.
If you have issues or questions, please post a sample data set that represents your situation.
Joe, thank you for your reply. I have been having trouble getting my paths to work correctly. I went to the link you provided and tried to format the data in the same way, but for some reason it is not turning out. I have attached the excel spreadsheet with my data and a packaged workbook. I would really appreciate any guidance as to why this is not working.
I always struggle with the path shelf, too.
It's working in the attached, by having wrapped [Zip Code (for path)] in an an aggregate function - MIN() in this case. That actually doesn't change the level of detail because it's already partitioned at the level of the individual line ends - but in some way that prevents Tableau from seeing it as a further level of partitioning which prevents the use of the path shelf.
Maybe Joe or one of the Tableau folk can explain that better - Joe definitely has his head around this much better than I do - he's tried explaining it to me repeatedly. Maybe one more time and it will sink in. ;-)
path_example_rl.twbx 26.2 KB
The easy way to think of the Level of Detail shelf is it will create more marks, a mark for each distinct dimension value combination. So with the Line mark type, placing a dimension field, blue non-aggregated pill, on the Level of Detail shelf will start a new line for each distinct value because the line is the mark. So if you place your Zip code field there non-aggregated, you will get a bunch of dots because you are telling Tableau to draw a new line for each zip code value.
What Richard did was turn it into a discreet measure, aggregating it with the MIN function (it is very interesting to note that the ATTR() function will not allow Tableau to generate the lat/lon coordinates), so the value is available to Tableau for use in geocoding the zip codes, but will not cause Tableau to start a new line for each distinct zip code value. How Tableau knows to get the zip code is your Path shelf has distinct values, but the Path shelf does not cause the lines to restart because Tableau is using it to create distinct dot and connect those dots in order based on the sort order of the pill on the Path shelf.
Hopefully that made sense.
Thanks Richard and Joe! Joe, I really appreciate your detailed response. Although it doesn't make complete sense to me, it's good to know that someone understands it. I'm content with the fact that the trick fixed the problem.