Can you clarify which part of the process you want help with? Is it that you want to edit the original input file to add new data (in that case, you would download the .dat file from the OpenFlights database that was referenced in the blog post), and then use one of the Python scripts linked from the post to process the data and write either a KML file or a TDE with the great circle paths?
If it is something else, let me know and I'll see if I can help.
I would ideally like to learn how to do it completely from scratch as I am always looking to learn. I have downloaded Python and am raring to go, however I don't know which files I need to download from OpenFlights in order to get started and complete the entire tutorial. I am wanting to replicate exactly what you created so excellently, but with my own specific routes added in. If you could give me an ingredients list and a step-by-step recipe that would be awesome. I have a few ideas that I am going to try and if I get them to work I'll happily share them around.
The ideal step-by-step tutorial for me would explain:
- Which exact .dat files from https://openflights.org/data.html I need to download
- Which exact bits of Python I need to download and how to install/use them. This is probably the hardest bit for me as I have never used Python before and am frantically googling and reading up about it.
- How I should go about joining the tables to get the data to look like flightsoftheworld.csv
- How to go about creating the KML file in Python.
I would like to end up with a carbon-copy of your visualisation and then I am planning on editing the original .dat files to add new airports and new routes and do the whole process again for practice.
Sorry to pester as I fully appreciate this is a voluntary forum that takes up your time, but are you able to update/expand on this a little?
And is the private messaging feature still restricted to people that follow each other?
Sorry for my slowness in responding.
The .dat file that we used was airports.dat, which contains just the airport locations. You can also grab the flightsoftheworld.csv file that I used in my script here, since I have it linked in the repository with my Python script. If your data looks just like that file, the script should run the same.
For the Python files, you have two options. Both will give you the same results in terms of the vertices rendered in the great circle arcs, and if you look at the two scripts you'll see a lot of similarities in how the arcs are created. The big difference is the output format:
- One of the files will give you a KML file - this will be a larger file size, but will look the same as the one that writes directly to a TDE
- One of the files will give you a TDE file - if you use this script you'll need to also have the Tableau SDK installed (Tableau SDK-Installing). The advantage here is that everything goes straight into a Tableau extract and can use the compression for our spatial data type, so it's much smaller.
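To give a feel for how the vertices of a great circle arc can be generated, here is a minimal sketch. It is not the code from either script: the scripts rely on geographiclib, while this version uses only the standard library and treats the Earth as a perfect sphere, so its output will differ slightly from an ellipsoidal geodesic. The function name great_circle_points and the n_segments parameter are made up for this illustration.

```python
import math


def great_circle_points(lat1, lon1, lat2, lon2, n_segments=50):
    """Return n_segments + 1 (lat, lon) vertices along a spherical great circle.

    Illustrative only: assumes a spherical Earth and does not handle
    antipodal endpoints, where the arc is not uniquely defined.
    """
    def to_vec(lat, lon):
        # Convert degrees to a 3D unit vector.
        lat, lon = math.radians(lat), math.radians(lon)
        return (math.cos(lat) * math.cos(lon),
                math.cos(lat) * math.sin(lon),
                math.sin(lat))

    a, b = to_vec(lat1, lon1), to_vec(lat2, lon2)
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    omega = math.acos(dot)  # central angle between the two points
    sin_omega = math.sin(omega)

    if sin_omega < 1e-12:
        if dot > 0:
            # Start and end coincide: every vertex is the same point.
            return [(lat1, lon1)] * (n_segments + 1)
        raise ValueError("antipodal points: the great circle is ambiguous")

    points = []
    for i in range(n_segments + 1):
        t = i / n_segments
        # Spherical linear interpolation between the two unit vectors.
        wa = math.sin((1 - t) * omega) / sin_omega
        wb = math.sin(t * omega) / sin_omega
        x, y, z = (wa * p + wb * q for p, q in zip(a, b))
        lat = math.degrees(math.atan2(z, math.hypot(x, y)))
        lon = math.degrees(math.atan2(y, x))
        points.append((lat, lon))
    return points


# Illustrative coordinates, not real airports.
arc = great_circle_points(0.0, 0.0, 0.0, 90.0, n_segments=2)
```

More segments give a smoother arc but also more vertices per route, which is part of why the output files get large.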
As for running the scripts, you can read and edit the Python scripts with any text editor and then run them from the command line if you want. But if you've never worked with Python before, I recommend downloading an IDE that will give you an easier way to debug and edit the scripts.
Personally, I use PyCharm most of the time (PyCharm: Python IDE for Professional Developers by JetBrains). If you aren't already working with Python at all, I find one of the easiest ways to get started is to install the Anaconda distribution (https://www.anaconda.com/download/), because it comes with a number of useful libraries and applications (like the Spyder IDE, another option for writing, running, and debugging your Python scripts).
The main thing is to get Python installed first, then open up the scripts in your IDE of choice and take a look to see how they work. Then run them on the flightsoftheworld.csv file (linked above), and when you have that working, try adding in your own data using the same format.
I hope this helps you get started.
Hi Sarah Battersby, I am a complete newbie with Python, so this could be a really interesting challenge...
I have downloaded and installed where applicable:
- the airports.dat and flightsoftheworld.csv (I'm not planning on tweaking this at the moment as I'd like to make sure I am doing the right steps in their entirety first)
- Python 3.6
- Anaconda for version 3.6
I have managed to configure Python as my interpreter and I downloaded and installed geographiclib, lxml and pandas in PyCharm and then I attempted to run the script. The script has given me the following error...
Lines 41 to 54 look like this:
My csv file path is: C:\Users\tkorynek\Desktop\Tableau - Trial\ForForum\FlightRoutesOfTheWorldIssue270418\flightsoftheworld.csv
My Python script file is: C:\Users\tkorynek\Desktop\Tableau - Trial\ForForum\FlightRoutesOfTheWorldIssue270418\geodesiccalc.py
How do I go about fixing this?
For that particular error, take a look at the indent on line 42. It looks like the line starting with 'placemark' has an extra space at the beginning of the line. Python is very particular about indents in the code.
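As a small illustration of how one stray space triggers this kind of error, the sketch below compiles two tiny snippets, one with consistent indentation and one with an extra leading space on its last line. The snippet contents (including the placemark variable) are invented for the example and are not taken from the script.

```python
# Two tiny programs as strings; the second has one extra space on its last line.
good_source = (
    "for i in range(2):\n"
    "    name = i\n"
    "    placemark = name\n"
)
bad_source = (
    "for i in range(2):\n"
    "    name = i\n"
    "     placemark = name\n"  # five spaces instead of four
)


def check_indentation(source):
    """Try to compile the source and report any indentation problem."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except IndentationError as err:
        return f"IndentationError on line {err.lineno}: {err.msg}"


print(check_indentation(good_source))
print(check_indentation(bad_source))
```

Most IDEs, PyCharm included, can show whitespace characters, which makes this kind of problem easier to spot.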
You might also want to take a look through the Python.org beginner's guide for some tips on ramping up with Python. I think they have some great tutorials to help you out.
Awesome, removing the extra space made it work and produce a kml file.
I have been playing around with the flightsoftheworld.csv file and noticed there are 29633 duplicates out of 66066 flight records. Is there a reason why AER - KZN is recorded only once, but DME - KZN is recorded 4 times?
HKT - BKK appears a total of 13 times.
Am I supposed to remove these duplicates? Are they critical to the construction of the lines? Should I remove them as they are unnecessarily bloating the dataset?
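For anyone wanting to reproduce this kind of duplicate count, here is a rough sketch using only the standard library. The column names origin and destination are assumptions made for the example; the real header of flightsoftheworld.csv may differ, so adjust key_fields to match. The sample rows are made up to mirror the AER/DME case above.

```python
import csv
import io
from collections import Counter


def count_duplicate_routes(csv_file, key_fields):
    """Return {route_key: count} for every key that occurs more than once."""
    reader = csv.DictReader(csv_file)
    counts = Counter(tuple(row[f] for f in key_fields) for row in reader)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample data; the column names are hypothetical.
sample = io.StringIO(
    "origin,destination\n"
    "AER,KZN\n"
    "DME,KZN\n"
    "DME,KZN\n"
    "DME,KZN\n"
    "DME,KZN\n"
)
dupes = count_duplicate_routes(sample, ["origin", "destination"])
print(dupes)
```

Rows that repeat an origin/destination pair may still differ in other columns, so it is worth comparing the full rows before deleting anything.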
Would you also be able to share the steps you took to join the kml and .csv files together? I think I have messed something up in my kml file, but it could be down to the joining.
When I create a workbook in Tableau with just my version of the .kml file and drag Longitude (generated) and Latitude (generated) to Columns and Rows respectively, all that appears is 1 null. I looked at which values the filter would let me select for Longitude and Latitude, and there were none other than null. I then dragged COLLECT(Geometry) to Detail and it gave me the following:
My version of flightsoftheworld.csv is now much longer at about 90,000 records, so my .kml file is much bigger: 200 MB as opposed to the 66 MB that your flightsoftheworld.csv produces. That means I cannot attach it to this post.
I tried the exact same method with your flightsoftheworld.csv, using the 66 MB .kml file, and also could not get Tableau to give me anything meaningful...
Is there any information I can provide that might help you to help me troubleshoot this?
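One way to narrow this down before involving Tableau is to check whether the KML file itself contains any geometry. The sketch below counts placemarks and line vertices using only the standard library. It assumes the file uses the KML 2.2 namespace and stores routes as LineString elements, which may not match what the script actually writes; the sample document is invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the file declares the standard KML 2.2 namespace.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def summarize_kml(kml_text):
    """Return (number of placemarks, total number of LineString vertices)."""
    root = ET.fromstring(kml_text)
    placemarks = root.findall(".//kml:Placemark", KML_NS)
    vertex_count = 0
    for coords in root.findall(".//kml:LineString/kml:coordinates", KML_NS):
        # Each whitespace-separated token is one "lon,lat[,alt]" vertex.
        vertex_count += len((coords.text or "").split())
    return len(placemarks), vertex_count


# Made-up sample document with one route of three vertices.
sample_kml = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>example route</name>
      <LineString>
        <coordinates>0,0,0 45,0,0 90,0,0</coordinates>
      </LineString>
    </Placemark>
  </Document>
</kml>"""

summary = summarize_kml(sample_kml)
print(summary)
```

For a real file you could pass in the contents read from disk. If both counts come back as zero, the problem is more likely in how the KML was written (or in the assumptions above) than in Tableau.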
Can you attach your flights of the world.csv file? I assume with only 90k rows that would be small enough to attach. I can generate a KML or TDE to test out in Tableau using that.
As I have been using the Python script to produce just the .kml file as opposed to going directly to the TDE, maybe it is something I have set up incorrectly in Python?
It looks pretty awesome that you have managed to get it to work, which makes me hopeful that my data should work in the Python script that writes the .kml. Could you share all the steps you took to get from your TDE (or what would be my .kml file) to what you posted in the image above?
Thanks for your continued assistance,
Step-by-step, all I did was open the Python file (my script here), update the one line that indicates the input source (line 16 when I opened the file: "csvLocation = 'data/flightsoftheworld.csv'"), and then run the script in PyCharm.
The most important part is to make sure that you have the correct libraries installed. The top of the .py file lists the libraries that the script imports: tableausdk, geographiclib, and csv. Of these, csv is part of the Python standard library, so only tableausdk and geographiclib need to be installed separately. (Installing Packages - Python Packaging User Guide)
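A quick way to confirm that the interpreter your IDE is using can actually see those libraries is a check like the one below. It is a minimal sketch; the helper name missing_modules is made up for the example, and what it prints depends on your own environment.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the imports mentioned above.
required = ["csv", "geographiclib", "tableausdk"]
print("Missing:", missing_modules(required))
```

If a library you have installed still shows up as missing, the script is probably running under a different Python interpreter than the one you installed it into.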