The goal of the climate R package is to automate the downloading of meteorological and hydrological data from publicly available repositories.
The climate package consists of eight main functions - three for meteorological data, one for hydrological data and four auxiliary functions and datasets:
meteo_ogimet() - Downloading hourly and daily meteorological data from the SYNOP stations available in the ogimet.com collection. Any meteorological (aka SYNOP) station working under the World Meteorological Organization framework after the year 2000 should be accessible.
sounding_wyoming() - Downloading radiosonde data (i.e., vertical profiles of the atmosphere) for any rawinsonde station in the world from the University of Wyoming repository
meteo_imgw() - A generic function for downloading hourly, daily, and monthly datasets from the IMGW-PIB repository. It is a wrapper for meteo_monthly(), meteo_daily(), and meteo_hourly() from the imgw package.
hydro_imgw() - Downloading annual, monthly, and daily hydrological data from the IMGW-PIB repository. It is a wrapper for hydro_annual(), hydro_monthly(), and hydro_daily() from the imgw package.
stations_ogimet() - Downloading information about all stations available in the selected country in the Ogimet repository
nearest_stations_ogimet() - Downloading information about the stations nearest to the selected point that are available in the selected country in the Ogimet repository
imgw_meteo_stations - Built-in metadata from the IMGW-PIB repository for meteorological stations, their geographical coordinates, and ID numbers
imgw_hydro_stations - Built-in metadata from the IMGW-PIB repository for hydrological stations, their geographical coordinates, and ID numbers
imgw_meteo_abbrev - Dictionary explaining variables available for meteorological stations (from the IMGW-PIB repository)
imgw_hydro_abbrev - Dictionary explaining variables available for hydrological stations (from the IMGW-PIB repository)
We will show how to use our package and prepare the data for spatial analysis with the additional help of the dplyr and tidyr packages. Firstly, we download ten years (2001-2010) of monthly hydrological observations for all stations available and automatically add their spatial coordinates.
library(climate)
library(dplyr)
library(tidyr)
h = hydro_imgw(interval = "monthly", year = 2001:2010, coords = TRUE)
head(h)
#> id X Y station riv_or_lake hyy idhyy idex H
#> 95158 150210180 21.8335 50.88641 ANNOPOL Wisła (2) 2001 1 1 214
#> 95159 150210180 21.8335 50.88641 ANNOPOL Wisła (2) 2001 1 2 228
#> 95160 150210180 21.8335 50.88641 ANNOPOL Wisła (2) 2001 1 3 250
#> 95161 150210180 21.8335 50.88641 ANNOPOL Wisła (2) 2001 2 1 215
#> 95162 150210180 21.8335 50.88641 ANNOPOL Wisła (2) 2001 2 2 225
#> 95163 150210180 21.8335 50.88641 ANNOPOL Wisła (2) 2001 2 3 258
#> Q T mm
#> 95158 172 NA 11
#> 95159 207 NA 11
#> 95160 272 NA 11
#> 95161 174 NA 12
#> 95162 201 NA 12
#> 95163 297 NA 12
The idex variable identifies the type of extremum, where 1 means minimum, 2 mean, and 3 maximum.[1] Hydrologists often use the maximum value, so we will filter the data and select only the station id, station name, hydrological year (hyy), longitude X, latitude Y, and flow Q. Next, we will calculate the mean of the maximum flow at each station in each year with dplyr’s summarise(), and spread the data by year using tidyr’s pivot_wider() to get the annual means of maximum flow in consecutive columns.
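h2 = h %>%
  # keep only the monthly maxima (idex == 3)
  filter(idex == 3) %>%
  select(id, station, X, Y, hyy, Q) %>%
  # one group per station and hydrological year
  group_by(hyy, id, station, X, Y) %>%
  summarise(annual_mean_Q = round(mean(Q, na.rm = TRUE), 1)) %>%
  # one column per hydrological year
  tidyr::pivot_wider(names_from = hyy, values_from = annual_mean_Q)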
id | station | X | Y | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
149180010 | KRZYŻANOWICE | 18.28780 | 49.99301 | 200.5 | 147.4 | 87.9 | 109.2 | 170.6 | 226.9 | 152.9 | 131.0 | 160.9 | 461.1 |
149180020 | CHAŁUPKI | 18.32752 | 49.92127 | 174.7 | 96.7 | 57.6 | 91.8 | 146.9 | 170.6 | 110.2 | 101.6 | 124.7 | 314.6 |
149180040 | GOŁKOWICE | 18.49640 | 49.92579 | 4.5 | 2.0 | 1.7 | 1.7 | 2.5 | 3.3 | 2.1 | 1.7 | 2.2 | 8.6 |
149180050 | ZEBRZYDOWICE | 18.61326 | 49.88025 | 13.5 | 7.9 | 3.8 | 5.0 | 10.4 | 6.5 | 5.8 | 2.8 | 4.5 | 23.6 |
149180060 | CIESZYN | 18.62972 | 49.74616 | 57.2 | 57.7 | 29.8 | 26.8 | 65.4 | 60.7 | 54.7 | 33.0 | 34.7 | 135.0 |
149180070 | CIESZYN | 18.63137 | 49.74629 | NaN | NaN | NaN | NaN | NaN | NaN | 0.6 | 0.5 | 0.6 | 0.6 |
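To make the filter, summarise, and pivot steps easier to follow, here is a minimal sketch on a toy data frame. It reuses only the six ANNOPOL rows printed by head(h) above (two months of hydrological year 2001), so the resulting value is not a real annual mean; the object names toy and toy_wide are hypothetical. It assumes dplyr 1.0.0 or newer for the .groups argument.

```r
library(dplyr)
library(tidyr)

# Toy input: the six ANNOPOL rows shown by head(h), reduced to the needed columns
toy = data.frame(
  id      = 150210180,
  station = "ANNOPOL",
  hyy     = 2001,
  idhyy   = c(1, 1, 1, 2, 2, 2),
  idex    = c(1, 2, 3, 1, 2, 3),   # 1 = minimum, 2 = mean, 3 = maximum
  Q       = c(172, 207, 272, 174, 201, 297)
)

toy_wide = toy %>%
  filter(idex == 3) %>%                      # keep the monthly maxima only
  group_by(hyy, id, station) %>%             # one group per station and year
  summarise(annual_mean_Q = round(mean(Q, na.rm = TRUE), 1),
            .groups = "drop") %>%            # return an ungrouped tibble
  pivot_wider(names_from = hyy, values_from = annual_mean_Q)

toy_wide
```

The result has one row per station and one column per hydrological year, which is the same shape as the table above.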
The result shows how the annual mean of maximum flow changed over the decade for all available stations in Poland. We can save it to:
.csv with: write.table(h2, file = "result.csv", sep = ";", dec = ".", col.names = TRUE, row.names = FALSE). This command saves the result to result.csv, where the column separator is ;, the decimal mark is ., the column headers are kept, and the row names are dropped.
.xlsx with: write.xlsx(as.data.frame(h2), file = "result.xlsx", sheetName = "Poland", append = FALSE). This command saves the result to result.xlsx in a sheet named Poland. Setting append = TRUE adds the sheet to an already existing xlsx file. To save data in .xlsx, you first have to install the xlsx package, which provides write.xlsx(), with the command install.packages("xlsx"), and load it with library(xlsx).
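The semicolon-separated export can be checked with a short round trip. The sketch below uses base R only and a hypothetical toy_result data frame holding two rows and two year columns copied from the table above. It uses write.table() because write.csv() ignores attempts to change sep, dec, or col.names and only issues a warning. The file is written to a temporary directory, which is an assumption made to keep the example free of side effects.

```r
# Toy table: two stations and two year columns copied from the table above
toy_result = data.frame(
  id     = c(149180010, 149180020),
  `2001` = c(200.5, 174.7),
  `2002` = c(147.4, 96.7),
  check.names = FALSE            # keep "2001" and "2002" as column names
)

csv_file = file.path(tempdir(), "result.csv")

# Semicolon as column separator, dot as decimal mark, headers kept, no row names
write.table(toy_result, file = csv_file, sep = ";", dec = ".",
            col.names = TRUE, row.names = FALSE)

# Read the file back with the same settings to confirm the layout
back = read.table(csv_file, header = TRUE, sep = ";", dec = ".",
                  check.names = FALSE)
back
```

Passing check.names = FALSE when reading keeps the year columns from being renamed to syntactic names such as X2001.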
The annual means of maximum flow can also be presented on a map using the tmap package:
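library(sf)
library(tmap)
library(rnaturalearth)
library(rnaturalearthdata)
# country borders used as the background layer
world = ne_countries(scale = "medium", returnclass = "sf")
# drop stations without coordinates and convert to sf points;
# X and Y are longitude and latitude, so WGS84 (EPSG:4326) is set explicitly
h3 = h2 %>%
  filter(!is.na(X)) %>%
  st_as_sf(coords = c("X", "Y"), crs = 4326)
# one facet per year, symbol size scaled by the annual mean of maximum flow
tm_shape(h3) +
  tm_symbols(size = as.character(c(2001:2010)),
             title.size = "The annual means of maximum flow") +
  tm_facets(free.scales = FALSE, ncol = 4) +
  tm_shape(world) +
  tm_borders(col = "black", lwd = 2) +
  tm_layout(legend.position = c(-1.25, 0.05),
            outer.margins = c(0, 0.05, 0, -0.25),
            panel.labels = as.character(c(2001:2010)))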
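Before plotting, it can help to confirm that the station table converts to spatial points as expected. The following is a minimal sketch using sf only, with a hypothetical toy_stations data frame built from two stations in the table above. It assumes that X and Y are longitude and latitude in WGS84, which matches the values shown but is not stated explicitly by the package output.

```r
library(sf)

# Toy subset: coordinates of two stations copied from the table above
toy_stations = data.frame(
  id = c(149180010, 149180020),
  X  = c(18.28780, 18.32752),   # longitude
  Y  = c(49.99301, 49.92127)    # latitude
)

# Convert to sf points; the CRS is an assumption (WGS84, EPSG:4326)
toy_pts = st_as_sf(toy_stations, coords = c("X", "Y"), crs = 4326)

st_is_longlat(toy_pts)   # checks that the layer uses geographic coordinates
st_bbox(toy_pts)         # spatial extent of the two stations
```

Setting the CRS at conversion time avoids relying on the plotting package to guess the projection of the points.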
[1] You can find more information about this in the imgw_hydro_abbrev dataset.