Introduction
Vienna is considered to have an excellent public transport system, and one frequently cited strength is that most areas of the city have some form of public transport within easy reach. In this project, I probe this claim by creating an exhaustive dataset of (all!) addresses in Vienna, each paired with its closest public transport stop by type (U-Bahn, tram, bus, night bus) and the walking distance to that stop.
To implement this, I used the following datasets and tools:
- GTFS feeds of Wiener Linien (and Wiener Lokalbahnen) from transit.land, which contain details about the various public transport stops, their coordinates, routes and schedules.
- Adressen Standorte Wien from Cooperation OGD Österreich, which gives details of addresses in Vienna and their coordinates.
- A local Docker installation of openrouteservice to query walking distances between addresses and public transport stops.
- R Shiny to create an interactive app allowing users to change district, public transport type, and distance threshold to see which areas of the district have public transport access inside the selected distance threshold.
- An AWS (Amazon Web Services) EC2 instance to deploy the app and embed it here.
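To make the pipeline above more concrete, here is a minimal sketch in Python (not the project's actual R code) of how one address could be matched to nearby stops. It first ranks stops per transport type by straight-line (haversine) distance, then builds the request body for a walking-distance matrix query. The field names (`name`, `type`, `lat`, `lon`), the sample coordinates, and the helper names are all hypothetical. The payload follows my understanding of openrouteservice's `/v2/matrix/foot-walking` endpoint, so check it against the API documentation for your installed version before relying on it.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_candidates(address, stops, k=3):
    """Per transport type, return the k stops closest to the address as the crow flies."""
    by_type = {}
    for stop in stops:
        d = haversine_m(address["lat"], address["lon"], stop["lat"], stop["lon"])
        by_type.setdefault(stop["type"], []).append((d, stop))
    return {
        t: [s for _, s in sorted(pairs, key=lambda pair: pair[0])[:k]]
        for t, pairs in by_type.items()
    }


def matrix_payload(address, candidate_stops):
    """Body for a one-to-many matrix request; coordinates are [lon, lat] pairs."""
    locations = [[address["lon"], address["lat"]]]
    locations += [[s["lon"], s["lat"]] for s in candidate_stops]
    return {
        "locations": locations,
        "sources": [0],  # index of the address in `locations`
        "destinations": list(range(1, len(locations))),  # indices of the stops
        "metrics": ["distance"],
    }


# Illustrative coordinates only, not taken from the real datasets.
address = {"lat": 48.2100, "lon": 16.3700}
stops = [
    {"name": "stop_a", "type": "U-Bahn", "lat": 48.2085, "lon": 16.3731},
    {"name": "stop_b", "type": "U-Bahn", "lat": 48.2000, "lon": 16.3690},
    {"name": "stop_c", "type": "tram", "lat": 48.2105, "lon": 16.3705},
]

candidates = nearest_candidates(address, stops, k=1)
payload = matrix_payload(address, candidates["U-Bahn"])
```

Pre-filtering by straight-line distance keeps the number of routing queries per address small. The trade-off is that the stop closest as the crow flies is not always the closest on foot, so using a `k` larger than 1 is the safer choice.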
Interactive with R Shiny
Speeding up computations
Each district has a lot of coordinates, which makes finding the covered areas computationally heavy. So currently, I’m sampling addresses from the district such that there aren’t too many addresses too close to each other (they have a high probability of sharing the same closest public transport stop). This is one of the reasons why some areas that ought to be covered sometimes aren’t. I also use `st_simplify` with a large `dTolerance` and `st_buffer` with a large distance value, which speeds up the process of building the clouds of covered areas from a large set of coordinates. This, however, is still slow for large districts like 1020, 1200, 1230, and also for large distance thresholds. Next, I’d like to try other methods of speeding up this computation. Some options are to pre-compute the areas for different thresholds using DBSCAN, a density-based clustering algorithm, and Shapely’s `unary_union`.
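As an illustration of the sampling idea, here is one simple way to thin a dense set of addresses: snap every coordinate to a grid and keep one address per cell. This is a Python sketch under my own assumptions, not the sampling used in the app. The function name, the `lat`/`lon` field names, and the 50 m cell size are hypothetical, and the metres-per-degree conversion is only an approximation.

```python
import math

METRES_PER_DEG_LAT = 111_320.0  # rough length of one degree of latitude


def thin_addresses(addresses, cell_m=50.0):
    """Keep the first address that falls into each roughly cell_m x cell_m grid cell."""
    if not addresses:
        return []
    # One reference latitude for the whole set keeps the grid consistent.
    ref_lat = sum(a["lat"] for a in addresses) / len(addresses)
    lat_step = cell_m / METRES_PER_DEG_LAT
    lon_step = cell_m / (METRES_PER_DEG_LAT * math.cos(math.radians(ref_lat)))
    kept = {}
    for addr in addresses:
        cell = (math.floor(addr["lat"] / lat_step), math.floor(addr["lon"] / lon_step))
        kept.setdefault(cell, addr)  # later addresses in the same cell are dropped
    return list(kept.values())


# Illustrative coordinates: the first two are a couple of metres apart.
sample = [
    {"id": 1, "lat": 48.21000, "lon": 16.37000},
    {"id": 2, "lat": 48.21002, "lon": 16.37000},
    {"id": 3, "lat": 48.22000, "lon": 16.38000},
]
thinned = thin_addresses(sample, cell_m=50.0)
```

Grid thinning runs in a single pass, but it does not guarantee a minimum spacing: two kept addresses can still sit on either side of a cell boundary. A larger cell size reduces the number of routing queries further, at the cost of more gaps in the covered areas.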
Code
The code is available here on GitHub.