Exploring Basic Functionality with Kansas City, Missouri’s B-Cycle

Simon P. Couch

2018-05-09

In this example, we’ll load gbfs into the environment and then explore some of the basic functionality of the package using Kansas City, Missouri’s B-Cycle program as an example.

devtools::load_all()
#> Loading gbfs

Note: Throughout this vignette I use a temporary directory by calling withr::with_dir(tempdir(), ...) as this is a vignette and the codes are examples. In real life, we would skip the temporary directory stuff; just focus on the ... for the purpose of this tutorial.

We’ll start by using the get_gbfs function–this is generally a good starting place. get_gbfs can find and grab all of the feeds for a specified city and save them in a folder as tidy .rds datasets. For the city argument, supply get_gbfs with either the url to KCMO’s gbfs.json file (at https://gbfs.bcycle.com/bcycle_kc/gbfs.json) or a character string with the name of the city. Here, we’ll use the name of the city. Notice we did not supply an argument for feeds, so the function will default to grabbing all of the feeds that it finds exists.

withr::with_dir(tempdir(), get_gbfs(city = "kansas city", directory = "kc_bikeshare"))
withr::with_dir(tempdir(), dir("kc_bikeshare"))
#> [1] "station_information.rds"  "station_status.rds"      
#> [3] "system_information.rds"   "system_pricing_plans.rds"
#> [5] "system_regions.rds"

Here, it looks like Kansas City, in addition to the required .json feeds, supplies system_pricing_plans and system_regions. The only ‘dynamic’ feed, (i.e. feed that changes regularly) supplied is the required station_status. Let’s take a look at station_status.

str(station_status)
#> 'data.frame':    42 obs. of  12 variables:
#>  $ station_id         : chr  "bcycle_kc_2073" "bcycle_kc_2074" "bcycle_kc_2075" "bcycle_kc_2076" ...
#>  $ num_bikes_available: int  4 4 4 7 8 6 12 8 5 7 ...
#>  $ num_docks_available: int  7 5 10 7 7 4 2 4 6 4 ...
#>  $ is_installed       : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
#>  $ is_renting         : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
#>  $ is_returning       : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
#>  $ last_updated       : POSIXct, format: "2018-05-09 06:23:29" "2018-05-09 06:23:29" ...
#>  $ year               : num  2018 2018 2018 2018 2018 ...
#>  $ month              : num  5 5 5 5 5 5 5 5 5 5 ...
#>  $ day                : int  9 9 9 9 9 9 9 9 9 9 ...
#>  $ hour               : int  6 6 6 6 6 6 6 6 6 6 ...
#>  $ minute             : int  23 23 23 23 23 23 23 23 23 23 ...

nrow(station_status)
#> [1] 42

Here, we see that there are currently 42 rows in station status. There are also plenty of interesting columns, such as the station ID, the number of bikes available currently, etc. As I write this, a few minutes have passed since I called get_gbfs. We can now call get_gbfs again, with slightly different arguments, so that we can append rows to the station_status dataset but keep the static files unchanged.

withr::with_dir(tempdir(), get_gbfs(city = "kansas city", 
                                    directory = "kc_bikeshare", 
                                    feeds = "dynamic"))

Reading the dataset in again and calling nrow, we can see that the number of rows doubled.

station_status <- withr::with_dir(tempdir(), readRDS("./kc_bikeshare/station_status.rds"))

nrow(station_status)
#> [1] 84

In the dataset, there are now two entries for each station—running this function many times over time could allow for the tracking of many interesting variables of time. In a few lines, there is already plenty to explore.

In combination with cron or task scheduler, depending on your operating system, this set of functions can be used to quickly build up informative datasets about bikeshare programs in a user’s town of interest.