Frequency Calculation

2019-04-16

Introduction

This is a brief introduction to the functions in tidytransit that can be used to describe the frequency with which vehicles are scheduled to pass through routes and stops.

Assumptions of read_gtfs()

When determining what headways are, we have to decide which hours of the day and which schedules we are interested in.

Tidytransit makes a few key assumptions about headways in the high level read_gtfs function. That is that they are:

local_gtfs_path <- system.file("extdata", 
                               "google_transit_nyc_subway.zip", 
                               package = "tidytransit")
nyc <- read_gtfs(local_gtfs_path, 
                 local=TRUE,
                 geometry=TRUE,
                 frequency=TRUE)
#> Calculating route and stop headways.

You can also use the get_route_frequency function directly, which takes arguments for other windows of time, and for other schedules. See the get_route_frequency documentation in the reference for more detail.

Plotting Stop Headways

Before we plot headways at stops, we must join the frequency table to the geometries for the stops.

some_stops_freq_sf <- nyc$.$stops_sf %>%
  left_join(nyc$.$stops_frequency, by="stop_id") %>%
  select(headway)

Then we can plot them.

plot(some_stops_freq_sf)

We will see some outliers for headway calculations in this plot.

In the NYC MTA schedule, for a few stops, a train will only show up a few times a day. Since we are calculating headways, by default, for a period from 6 am to 10 pm, the average headway for these stops will be as high as hundred of minutes.

One quick solution to the outlier stops in above plot is to throw out stops with headways greater than an unreasonable amount of time. For example, we can filter out stops with headways above 60 minutes.

some_stops_freq_sf <- some_stops_freq_sf %>%
  filter(headway<60)
plot(some_stops_freq_sf)

If you’re interested in how to work with schedules and outlier stops like this, the timetable vignette, included in this package, is a great introduction.

Route Frequencies

To calculate frequency at the route level, by default, tidytransit summary statistics of the calculated headways at stops along each route. So the median headways for stops along each route do some work to throw out outliers stop headways.

One way to verify that the frequency and service calculations from GTFS are accurate is by checking against other sources, such as the train schedules themselves.

Above, we see that the median headway for the 1 train from 6 AM to 10 PM is 5 minutes according to our calculations. According to the wikipedia entry for the NYC MTA this seems about right.

Getting Frequencies for A Specific Day and Time

We might also want to check what rush hour headways are like on a specific day. The set_hms_times and set_date_service_table functions will alter the feed for us, allowing us to filter by date.

Below we pull a service ID for a specific weekday (2018-08-23).

(See the servicepatterns and timetable vignette for more advice on schedule filtering.)

Then we calculate the route frequency in the afternoon rush hour.

Again, the median headways for the 1 train seem to roughly correspond (1 min off) to those published on wikipedia entry

Mapping Route Frequency

We can also plot the custom headways above along a set of route shapes above.

We can do this by joining to the simple features routes table.