The “forecastHybrid” package provides functions to build composite models using multiple individual component models from the “forecast” package. These hybridModel objects can then be manipulated with many of the familiar functions from the “forecast” and “stats” packages, including forecast(), plot(), accuracy(), residuals(), and fitted().
The stable release of the package is hosted on CRAN and can be installed as usual.
install.packages("forecastHybrid")
The latest development version can be installed using the “devtools” package.
devtools::install_github("ellisp/forecastHybrid/pkg")
Version updates to CRAN will be published frequently after new features are implemented, so the development version is not recommended unless you plan to modify the code.
First load the package.
library(forecastHybrid)
If you don’t have time to read the whole guide and want to get started immediately with sane default settings to forecast the AirPassengers time series, run the following:
quickModel <- hybridModel(AirPassengers)
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the thetam model
## Fitting the nnetar model
## Fitting the stlm model
## Fitting the tbats model
forecast(quickModel)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 1961 447.0131 420.9284 477.3570 409.9071 486.2706
## Feb 1961 430.6853 402.5180 466.6379 392.8622 481.6784
## Mar 1961 479.2503 427.8241 544.1361 416.7423 564.7432
## Apr 1961 487.2129 445.6985 533.5071 425.6737 556.3466
## May 1961 495.6586 443.0128 544.5004 420.7317 570.1855
## Jun 1961 560.3764 498.8558 620.1011 471.1949 651.8416
## Jul 1961 633.2076 550.5136 702.4360 517.2336 741.0152
## Aug 1961 626.4402 543.3613 701.4769 507.8542 742.4204
## Sep 1961 538.9648 468.8133 619.3594 435.9204 657.5609
## Oct 1961 479.4878 405.1625 543.0011 374.8081 578.1676
## Nov 1961 417.1395 349.4447 479.7441 321.6167 512.2342
## Dec 1961 460.9900 388.8923 540.9447 356.0996 579.1202
## Jan 1962 475.9583 392.2920 560.6377 357.3798 601.6793
## Feb 1962 459.5545 381.4061 552.5367 345.6841 594.4137
## Mar 1962 509.7235 424.9010 640.5731 391.0899 690.7625
## Apr 1962 517.8682 418.8586 625.1692 375.7208 675.7075
## May 1962 526.9912 416.8132 635.6637 371.9359 688.5562
## Jun 1962 594.3481 469.7070 721.6076 416.9301 783.3187
## Jul 1962 669.5856 518.5862 815.1471 457.8729 886.6945
## Aug 1962 662.6776 511.9731 812.0699 449.6093 885.0996
## Sep 1962 572.1235 441.7638 715.4365 385.8483 781.3089
## Oct 1962 510.0369 381.7623 626.0192 331.6121 684.9419
## Nov 1962 445.4028 329.2070 552.1130 284.3727 605.1918
## Dec 1962 490.0511 366.2764 621.5349 314.6143 682.5177
plot(forecast(quickModel), main = "Forecast from auto.arima, ets, thetam, nnetar, stlm, and tbats model")
The workhorse function of the package is hybridModel(), a function that combines several component models from the “forecast” package. At a minimum, the user must supply a ts or numeric vector for y. In this case, the ensemble will include all six component models: auto.arima(), ets(), thetam(), nnetar(), stlm(), and tbats(). To instead use only a subset of these models, pass a character string to the models argument containing the one-letter code of each model to include. The code is the first letter of the function name, except for thetam(), which uses "f" because "t" already denotes tbats(). For example, to build an ensemble model on the gas dataset with auto.arima(), ets(), and tbats() components, run:
# Build a hybrid forecast on the gas dataset using auto.arima, ets, and tbats models.
# Each model is given equal weight
hm1 <- hybridModel(y = gas, models = "aet", weights = "equal")
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the tbats model
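To make the letter codes concrete, the following base R sketch expands a models string into component names. The lookup table and the expandModels() helper are hypothetical illustrations and are not the package's internal code; the "f" code for thetam() is taken from the "aefnst" example later in this guide.

```r
# Illustrative lookup only; this is not how the package is implemented.
modelLookup <- c(a = "auto.arima", e = "ets", f = "thetam",
                 n = "nnetar", s = "stlm", t = "tbats")

# Hypothetical helper: expand a string such as "aet" into component names.
expandModels <- function(models) {
  codes <- strsplit(tolower(models), "")[[1]]
  unknown <- setdiff(codes, names(modelLookup))
  if (length(unknown) > 0) {
    stop("Unknown model code(s): ", paste(unknown, collapse = ", "))
  }
  unname(modelLookup[unique(codes)])
}

expandModels("aet")
```

For "aet" this returns the three component names used for hm1 above.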
The individual component models are stored inside the hybridModel object and can be viewed in their respective slots. All the regular methods from the “forecast” package can be applied to these individual component models.
# View the individual models
hm1$auto.arima
## Series: structure(c(1709, 1646, 1794, 1878, 2173, 2321, 2468, 2416, 2184, 2121, 1962, 1825, 1751, 1688, 1920, 1941, 2311, 2279, 2638, 2448, 2279, 2163, 1941, 1878, 1773, 1688, 1783, 1984, 2290, 2511, 2712, 2522, 2342, 2195, 1931, 1910, 1730, 1688, 1899, 1994, 2342, 2553, 2712, 2627, 2363, 2311, 2026, 1910, 1762, 1815, 2005, 2089, 2617, 2828, 2965, 2891, 2532, 2363, 2216, 2026, 1804, 1773, 2015, 2089, 2627, 2712, 3007, 2880, 2490, 2237, 2205, 1984, 1868, 1815, 2047, 2142, 2743, 2775, 3028, 2965, 2501, 2501, 2131, 2015, 1910, 1868, 2121, 2268, 2690, 2933, 3218, 3028, 2659, 2406, 2258, 2057, 1889, 1984, 2110, 2311, 2785, 3039, 3229, 3070, 2659, 2543, 2237, 2142, 1962, 1910, 2216, 2437, 2817, 3123, 3345, 3112, 2659, 2469, 2332, 2110, 1910, 1941, 2216, 2342, 2923, 3229, 3513, 3355, 2849, 2680, 2395, 2205, 1994, 1952, 2290, 2395, 2965, 3239, 3608, 3524, 3018, 2648, 2363, 2247, 1994, 1941, 2258, 2332, 3323, 3608, 3957, 3672, 3155, 2933, 2585, 2384, 2057, 2100, 2458, 2638, 3292, 3724, 4652, 4379, 4231, 3756, 3429, 3461, 3345, 4220, 4874, 5064, 5951, 6774, 7997, 7523, 7438, 6879, 6489, 6288, 5919, 6183, 6594, 6489, 8040, 9715, 9714, 9756, 8595, 7861, 7753, 8154, 7778, 7402, 8903, 9742, 11372, 12741, 13733, 13691, 12239, 12502, 11241, 10829, 11569, 10397, 12493, 11962, 13974, 14945, 16805, 16587, 14225, 14157, 13016, 12253, 11704, 12275, 13695, 14082, 16555, 17339, 17777, 17592, 16194, 15336, 14208, 13116, 12354, 12682, 14141, 14989, 16159, 18276, 19157, 18737, 17109, 17094, 15418, 14312, 13260, 14990, 15975, 16770, 19819, 20983, 22001, 22337, 20750, 19969, 17293, 16498, 15117, 16058, 18137, 18471, 21398, 23854, 26025, 25479, 22804, 19619, 19627, 18488, 17243, 18284, 20226, 20903, 23768, 26323, 28038, 26776, 22886, 22813, 22404, 19795, 18839, 18892, 20823, 22212, 25076, 26884, 30611, 30228, 26762, 25885, 23328, 21930, 21433, 22369, 24503, 25905, 30605, 34984, 37060, 34502, 31793, 29275, 28305, 25248, 27730, 27424, 32684, 31366, 37459, 41060, 43558, 42398, 33827, 34962, 
33480, 32445, 30715, 30400, 31451, 31306, 40592, 44133, 47387, 41310, 37913, 34355, 34607, 28729, 26138, 30745, 35018, 34549, 40980, 42869, 45022, 40387, 38180, 38608, 35308, 30234, 28801, 33034, 35294, 33181, 40797, 42355, 46098, 42430, 41851, 39331, 37328, 34514, 32494, 33308, 36805, 34221, 41020, 44350, 46173, 44435, 40943, 39269, 35901, 32142, 31239, 32261, 34951, 38109, 43168, 45547, 49568, 45387, 41805, 41281, 36068, 34879, 32791, 34206, 39128, 40249, 43519, 46137, 56709, 52306, 49397, 45500, 39857, 37958, 35567, 37696, 42319, 39137, 47062, 50610, 54457, 54435, 48516, 43225, 42155, 39995, 37541, 37277, 41778, 41666, 49616, 57793, 61884, 62400, 50820, 51116, 45731, 42528, 40459, 40295, 44147, 42697, 52561, 56572, 56858, 58363, 45627, 45622, 41304, 36016, 35592, 35677, 39864, 41761, 50380, 49129, 55066, 55671, 49058, 44503, 42145, 38698, 38963, 38690, 39792, 42545, 50145, 58164, 59035, 59408, 55988, 47321, 42269, 39606, 37059, 37963, 31043, 41712, 50366, 56977, 56807, 54634, 51367, 48073, 46251, 43736, 39975, 40478, 46895, 46147, 55011, 57799, 62450, 63896, 57784, 53231, 50354, 38410, 41600, 41471, 46287, 49013, 56624, 61739, 66600, 60054), .Tsp = c(1956, 1995.58333333333, 12), class = "ts")
## ARIMA(2,1,1)(1,0,0)[12]
##
## Coefficients:
## ar1 ar2 ma1 sar1
## 0.5117 0.1824 -0.9638 0.8478
## s.e. 0.0502 0.0498 0.0134 0.0277
##
## sigma^2 estimated as 3201509: log likelihood=-4236.9
## AIC=8483.81 AICc=8483.94 BIC=8504.63
# See forecasts from the auto.arima model
plot(forecast(hm1$auto.arima))
The hybridModel() function produces an S3 object of class hybridModel.
class(hm1)
## [1] "hybridModel"
is.hybridModel(hm1)
## [1] TRUE
The print() and summary() methods print information about the ensemble model, including the weights assigned to each individual component model.
print(hm1)
## Hybrid forecast model comprised of the following models: auto.arima, ets, tbats
## ############
## auto.arima with weight 0.333
## ############
## ets with weight 0.333
## ############
## tbats with weight 0.333
summary(hm1)
## Hybrid forecast model comprised of the following models: auto.arima, ets, tbats
## ############
## auto.arima with weight 0.333
## ############
## ets with weight 0.333
## ############
## tbats with weight 0.333
Two types of plots can be created for the ensemble model: either a plot showing the actual values and the fitted values of each component model, or individual plots of the component models as created by their regular S3 plot() methods. Note that a plot() method does not exist in the “forecast” package for objects generated with stlm(), so this component model will be ignored when type = "models", but the other component models will be plotted regardless.
plot(hm1, type = "fit")
plot(hm1, type = "models")
By default each component model is given equal weight in the final ensemble. Empirically this has been shown to give good performance in ensembles [see @Armstrong2001], but alternative combination methods are available: the inverse root mean square error (RMSE), inverse mean absolute error (MAE), and inverse mean absolute scaled error (MASE). To apply one of these weighting schemes to the component models, pass this value to the errorMethod argument and pass either "insample.errors" or "cv.errors" to the weights argument.
hm2 <- hybridModel(wineind, weights = "insample.errors", errorMethod = "MASE", models = "aenst")
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the nnetar model
## Fitting the stlm model
## Fitting the tbats model
hm2
## Hybrid forecast model comprised of the following models: auto.arima, ets, nnetar, stlm, tbats
## ############
## auto.arima with weight 0.062
## ############
## ets with weight 0.064
## ############
## nnetar with weight 0.647
## ############
## stlm with weight 0.081
## ############
## tbats with weight 0.146
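As a rough illustration of inverse-error weighting, the base R sketch below turns a set of error measures into weights that sum to one. The error values are made up for the example, and the package's exact calculation may differ in its details.

```r
# Hypothetical error measures (e.g. MASE) for three component models.
modelErrors <- c(auto.arima = 0.54, ets = 0.42, tbats = 0.46)

# Invert so that smaller errors receive larger weights,
# then rescale so the weights sum to one.
inverseErrors <- 1 / modelErrors
weights <- inverseErrors / sum(inverseErrors)

round(weights, 3)
```

The component with the smallest error (here ets) receives the largest weight.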
After the model is fit, these weights are stored in the weights attribute of the model. The user can view and manipulate these weights after the fit is complete. Note that the hybridModel() function automatically scales weights to sum to one, so a user should similarly scale any replacement weights to ensure the forecasts remain unbiased. Furthermore, the vector that replaces weights must retain names specifying the component model each weight corresponds to, since weights are assigned by component name rather than by position. Similarly, individual components may also be replaced.
hm2$weights
## auto.arima ets nnetar stlm tbats
## 0.06178208 0.06420160 0.64672304 0.08089440 0.14639887
newWeights <- c(0.1, 0.2, 0.3, 0.1, 0.3)
names(newWeights) <- c("auto.arima", "ets", "nnetar", "stlm", "tbats")
hm2$weights <- newWeights
hm2
## Hybrid forecast model comprised of the following models: auto.arima, ets, nnetar, stlm, tbats
## ############
## auto.arima with weight 0.1
## ############
## ets with weight 0.2
## ############
## nnetar with weight 0.3
## ############
## stlm with weight 0.1
## ############
## tbats with weight 0.3
hm2$weights[1] <- 0.2
hm2$weights[2] <- 0.1
hm2
## Hybrid forecast model comprised of the following models: auto.arima, ets, nnetar, stlm, tbats
## ############
## auto.arima with weight 0.2
## ############
## ets with weight 0.1
## ############
## nnetar with weight 0.3
## ############
## stlm with weight 0.1
## ############
## tbats with weight 0.3
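If replacement weights do not already sum to one, they can be rescaled before assignment. The following is a minimal base R sketch with hypothetical values; dividing a named vector by its sum preserves the names, which matters because weights are matched by component name.

```r
# Hypothetical replacement weights that do not sum to one.
rawWeights <- c(auto.arima = 2, ets = 1, nnetar = 3, stlm = 1, tbats = 3)

# Dividing by the total keeps the names and makes the weights sum to one.
scaledWeights <- rawWeights / sum(rawWeights)

scaledWeights
sum(scaledWeights)
```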
This hybridModel S3 object can be manipulated with the same familiar interface from the “forecast” package, including S3 generic functions such as accuracy, forecast, fitted, and residuals.
# View the first six fitted values and residuals
head(fitted(hm1))
## [1] 1618.244 1690.320 1803.662 1857.580 2162.098 2325.483
head(residuals(hm1))
## [1] 1618.244 1690.320 1803.662 1857.580 2162.098 2325.483
In-sample errors and various accuracy measures can be extracted with the accuracy method. The “forecastHybrid” package creates an S3 generic from the accuracy method in the “forecast” package, so accuracy will continue to function as normal with objects from the “forecast” package, while special functionality is added for hybridModel objects. To view the in-sample accuracy for the entire ensemble, a simple call can be made.
accuracy(hm1)
## ME RMSE MAE MPE MAPE ACF1
## Test set 56.50349 1440.361 782.411 0.3720235 3.437136 -0.09356637
## Theil's U
## Test set 0.474891
In addition to retrieving the ensemble’s accuracy, the individual component models’ accuracies can be easily viewed by using the individual = TRUE argument.
accuracy(hm1, individual = TRUE)
## $auto.arima
## ME RMSE MAE MPE MAPE MASE
## Training set 151.1913 1779.854 1005.769 0.8861332 4.446548 0.5391395
## ACF1
## Training set -0.002784589
##
## $ets
## ME RMSE MAE MPE MAPE MASE
## Training set 41.67757 1451.139 788.3641 0.2856501 3.54687 0.4226001
## ACF1
## Training set -0.1370769
##
## $tbats
## ME RMSE MAE MPE MAPE MASE
## Training set -23.35843 1459.592 795.7255 -0.05571283 3.501653 0.4608233
## ACF1
## Training set -0.07452428
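For readers unfamiliar with the columns above, the base R sketch below computes four of the measures (ME, RMSE, MAE, MAPE) from their standard definitions, with the error taken as actual minus fitted. The numbers are hypothetical and are not drawn from the gas series.

```r
# Hypothetical actual and fitted values.
actual <- c(100, 110, 120, 130)
fittedValues <- c(98, 113, 119, 134)

# Error is defined as actual minus fitted.
errs <- actual - fittedValues

measures <- c(ME   = mean(errs),                      # mean error
              RMSE = sqrt(mean(errs^2)),              # root mean squared error
              MAE  = mean(abs(errs)),                 # mean absolute error
              MAPE = mean(abs(errs / actual)) * 100)  # mean absolute percentage error
measures
```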
Now let’s forecast future values. The forecast() function produces an S3 object of class forecast for the next 48 periods from the ensemble model.
hForecast <- forecast(hm1, h = 48)
Now plot the forecast for the next 48 periods. The prediction intervals are preserved from the individual component models and currently use the most extreme value from any individual model, producing conservative (wide) intervals for the ensemble.
plot(hForecast)
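The "most extreme value" rule for prediction intervals can be sketched in base R as follows. The interval limits are hypothetical, and the code only illustrates the idea described above rather than reproducing the package's implementation.

```r
# Hypothetical 95% limits from three component models over a 3-step horizon.
lower <- cbind(auto.arima = c(90, 88, 85),
               ets        = c(92, 87, 86),
               tbats      = c(91, 89, 83))
upper <- cbind(auto.arima = c(110, 113, 118),
               ets        = c(108, 115, 117),
               tbats      = c(111, 112, 120))

# Most extreme value per period: lowest lower limit, highest upper limit.
ensembleLower <- apply(lower, 1, min)
ensembleUpper <- apply(upper, 1, max)

cbind(ensembleLower, ensembleUpper)
```

Because each limit is taken from whichever model is widest at that period, the resulting interval is at least as wide as every component's interval.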
The package aims to make fitting ensembles easy and quick, but it still allows advanced tuning of all the parameters available in the “forecast” package. This is possible through the a.args, e.args, n.args, s.args, and t.args lists. These optional list arguments may be applied to one, none, all, or any combination of the included individual component models. Consult the documentation in the “forecast” package for acceptable arguments to pass to the auto.arima, ets, nnetar, stlm, and tbats functions.
hm2 <- hybridModel(y = gas, models = "aefnst",
                   a.args = list(max.p = 12, max.q = 12, approximation = FALSE),
                   n.args = list(repeats = 50),
                   s.args = list(robust = TRUE),
                   t.args = list(use.arma.errors = FALSE))
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the thetam model
## Fitting the nnetar model
## Fitting the stlm model
## Fitting the tbats model
Since the lambda argument is shared between most of the models in the “forecast” framework, it is included as a special parameter that can be used to set the Box-Cox transform in all models instead of setting this individually. For example,
hm3 <- hybridModel(y = wineind, models = "ae", lambda = 0.15)
## Fitting the auto.arima model
## Fitting the ets model
hm3$auto.arima$lambda
## [1] 0.15
hm3$ets$lambda
## [1] 0.15
Users can still apply the lambda argument through the tuning lists, but in this case the list-supplied argument overrides the default used across all models. Compare the following results.
hm4 <- hybridModel(y = wineind, models = "aens", lambda = 0.2,
                   a.args = list(lambda = 0.5),
                   n.args = list(lambda = 0.6))
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the nnetar model
## Fitting the stlm model
hm4$auto.arima$lambda
## [1] 0.5
hm4$ets$lambda
## [1] 0.2
hm4$nnetar$lambda
## [1] 0.6
hm4$stlm$lambda
## [1] 0.2
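The override behaviour seen above (a list-supplied lambda takes precedence over the shared default) resembles what utils::modifyList() does in base R. The sketch below uses hypothetical variable names and only illustrates the precedence rule; it is an assumption about the pattern, not a description of how the package merges its arguments.

```r
# Shared default, analogous to lambda = 0.2 in the call above.
sharedDefaults <- list(lambda = 0.2)

# Hypothetical per-model tuning lists, mirroring a.args and an empty e.args.
aArgs <- list(lambda = 0.5)
eArgs <- list()

# modifyList() keeps the shared value unless the tuning list supplies its own.
arimaLambda <- modifyList(sharedDefaults, aArgs)$lambda
etsLambda   <- modifyList(sharedDefaults, eArgs)$lambda

c(auto.arima = arimaLambda, ets = etsLambda)
```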
Note that lambda has no impact on thetam models, and that there is no f.args argument to provide arguments to thetam. Like forecast::thetaf, on which thetam is based, it accepts no such arguments; it always runs with the defaults.
Covariates can also be supplied to auto.arima and nnetar models as is done in the “forecast” package. To do this, use the a.args and n.args lists. Note that xreg may also be passed to a stlm model, but only when method = "arima" instead of the default method = "ets". Unlike the usage in the “forecast” package, the xreg argument should be passed as a data frame, not a matrix. The stlm models require that the input series be seasonal, so in the example below we convert the input data to a ts object. If xreg is used in training, it must also be supplied to the forecast() function in the xreg argument. Note that if the number of rows in the xreg used for the forecast does not match the supplied forecast horizon h, the function will overwrite h with the number of rows in xreg and issue a warning.
# Use the beaver1 dataset with the variable "activ" as a covariate and "temp" as the time series
# Divide this into a train and test set
trainSet <- beaver1[1:100, ]
testSet <- beaver1[-(1:100), ]
trainXreg <- data.frame(trainSet$activ)
testXreg <- data.frame(testSet$activ)
# Create the model
beaverhm <- hybridModel(ts(trainSet$temp, frequency = 6),
                        models = "aenst",
                        a.args = list(xreg = trainXreg),
                        n.args = list(xreg = trainXreg),
                        s.args = list(xreg = trainXreg, method = "arima"))
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the nnetar model
## Fitting the stlm model
## Fitting the tbats model
# Forecast future values
# forecast() won't run in the current version; the next release allows
# forecasting with stlm and xreg
# beaverfc <- forecast(beaverhm, xreg = testXreg)
# View the accuracy of the model
# accuracy(beaverfc, testSet$temp)
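The rule described earlier, that h is overwritten by the number of rows in xreg when the two disagree, can be sketched in base R as follows. resolveHorizon() is a hypothetical helper written for this illustration and is not a function in the package.

```r
# Hypothetical helper, not part of the package: reconcile h with nrow(xreg).
resolveHorizon <- function(h, xreg) {
  if (nrow(xreg) != h) {
    warning("h does not match the number of rows in xreg; using nrow(xreg)")
    h <- nrow(xreg)
  }
  h
}

# Four future rows of a hypothetical covariate, but h = 10 requested.
futureXreg <- data.frame(activ = c(0, 0, 1, 1))
resolvedH <- suppressWarnings(resolveHorizon(h = 10, xreg = futureXreg))
resolvedH
```

Here the requested horizon of 10 is replaced by 4, the number of rows available in futureXreg.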