---
title: "Case Study: Bookshop Orders"
author: "Vladimír Holý & Petra Tomanová"
date: "2024-02-01"
bibliography: library.bib
link-citations: yes
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Case Study: Bookshop Orders}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## Introduction

We loosely follow @Tomanova2021 and analyze the timing of orders from a Czech antiquarian bookshop. Besides seasonality and diurnal patterns, one would expect the times of orders to be independent of each other. However, this is not the case and we use a GAS model to capture dependence between the times of orders.

A strand of financial econometrics is devoted to analyzing the timing of transactions by the so-called autoregressive conditional duration (ACD) model introduced by @Engle1998. For a textbook treatment of such financial point processes, see e.g. @Hautsch2012.

## Data Preparation

Let us prepare the analyzed data. We use the `bookshop_orders` dataset containing times of orders from June 8, 2018 to December 20, 2018. The differences of subsequent times, i.e. durations, are already included in the dataset. Additionally, the dataset includes durations that have been adjusted for diurnal patterns using smoothing splines. This is the time series we are interested in.

```r
library("gasmodel")

data("bookshop_orders")

y <- bookshop_orders$duration_adj[-1]
```

## Model Estimation

The following distributions are available for our data type. We utilize the generalized gamma family.

```r
distr(filter_type = "duration", filter_dim = "uni")
#>                distr_title       param_title    distr      param     type dim orthog default
#> 6        Birnbaum-Saunders             Scale     bisa      scale duration uni   TRUE    TRUE
#> 7                     Burr             Scale     burr      scale duration uni  FALSE    TRUE
#> 11             Exponential              Rate      exp       rate duration uni   TRUE   FALSE
#> 12             Exponential             Scale      exp      scale duration uni   TRUE    TRUE
#> 13 Exponential-Logarithmic              Rate   explog       rate duration uni  FALSE    TRUE
#> 14                    Fisk             Scale     fisk      scale duration uni   TRUE    TRUE
#> 15                   Gamma              Rate    gamma       rate duration uni  FALSE   FALSE
#> 16                   Gamma             Scale    gamma      scale duration uni  FALSE    TRUE
#> 17       Generalized Gamma              Rate gengamma       rate duration uni  FALSE   FALSE
#> 18       Generalized Gamma             Scale gengamma      scale duration uni  FALSE    TRUE
#> 25              Log-Normal Log-Mean-Variance  lognorm logmeanvar duration uni   TRUE    TRUE
#> 26                   Lomax             Scale    lomax      scale duration uni  FALSE    TRUE
#> 34                Rayleigh             Scale rayleigh      scale duration uni   TRUE    TRUE
#> 40                 Weibull              Rate  weibull       rate duration uni  FALSE   FALSE
#> 41                 Weibull             Scale  weibull      scale duration uni  FALSE    TRUE
```

First, we estimate the model based on the exponential distribution. By default, the logarithmic link for the time-varying scale parameter is adopted. In this particular case, the Fisher information is constant and the three scalings are therefore equivalent.

```r
est_exp <- gas(y = y, distr = "exp")
est_exp
#> GAS Model: Exponential Distribution / Scale Parametrization / Unit Scaling
#>
#> Coefficients:
#>                      Estimate Std. Error   Z-Test  Pr(>|Z|)
#> log(scale)_omega  -0.00089754 0.00117598  -0.7632    0.4453
#> log(scale)_alpha1  0.04992815 0.00657547   7.5931 3.123e-14 ***
#> log(scale)_phi1    0.96278385 0.00918996 104.7647 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-Likelihood: -5571.078, AIC: 11148.16, BIC: 11168.11
```

Second, we estimate the model based on the Weibull distribution. Compared to the exponential distribution, it has an additional shape parameter.
By default, the first parameter is assumed time-varying while the remaining parameters are assumed static. In our case, the model features a time-varying scale parameter and a constant shape parameter. However, it is possible to modify this behavior using the `par_static` argument.

```r
est_weibull <- gas(y = y, distr = "weibull")
est_weibull
#> GAS Model: Weibull Distribution / Scale Parametrization / Unit Scaling
#>
#> Coefficients:
#>                     Estimate Std. Error  Z-Test  Pr(>|Z|)
#> log(scale)_omega  -0.0019173  0.0013710 -1.3985     0.162
#> log(scale)_alpha1  0.0569780  0.0081800  6.9655 3.272e-12 ***
#> log(scale)_phi1    0.9617316  0.0102214 94.0896 < 2.2e-16 ***
#> shape              0.9472091  0.0094738 99.9819 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-Likelihood: -5555.903, AIC: 11119.81, BIC: 11146.41
```

Third, we estimate the model based on the gamma distribution. This is another generalization of the exponential distribution with an additional shape parameter.

```r
est_gamma <- gas(y = y, distr = "gamma")
est_gamma
#> GAS Model: Gamma Distribution / Scale Parametrization / Unit Scaling
#>
#> Coefficients:
#>                    Estimate Std. Error   Z-Test  Pr(>|Z|)
#> log(scale)_omega  0.0010440  0.0013489   0.7740    0.4389
#> log(scale)_alpha1 0.0526020  0.0071647   7.3418 2.107e-13 ***
#> log(scale)_phi1   0.9627838  0.0094368 102.0247 < 2.2e-16 ***
#> shape             0.9491683  0.0155575  61.0102 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-Likelihood: -5565.939, AIC: 11139.88, BIC: 11166.48
```

Fourth, we estimate the model based on the generalized gamma distribution. The generalized gamma distribution encompasses all three aforementioned distributions as special cases.

```r
est_gengamma <- gas(y = y, distr = "gengamma")
est_gengamma
#> GAS Model: Generalized Gamma Distribution / Scale Parametrization / Unit Scaling
#>
#> Coefficients:
#>                    Estimate Std. Error  Z-Test  Pr(>|Z|)
#> log(scale)_omega  -0.057636   0.021624 -2.6653  0.007691 **
#> log(scale)_alpha1  0.071908   0.011810  6.0889 1.137e-09 ***
#> log(scale)_phi1    0.950375   0.015152 62.7220 < 2.2e-16 ***
#> shape1             1.886317   0.168357 11.2043 < 2.2e-16 ***
#> shape2             0.660542   0.033779 19.5546 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-Likelihood: -5521.126, AIC: 11052.25, BIC: 11085.51
```

By comparing the Akaike information criterion (AIC), we find that the most general model, i.e. the one based on the generalized gamma distribution, is the most suitable: it attains the lowest AIC of the four candidates. For this purpose, we use the generic function `AIC()`. Alternatively, the AIC of an estimated model is stored in `est_gengamma$fit$aic`.

```r
AIC(est_exp, est_weibull, est_gamma, est_gengamma)
#>              df      AIC
#> est_exp       3 11148.16
#> est_weibull   4 11119.81
#> est_gamma     4 11139.88
#> est_gengamma  5 11052.25
```

Let us take a look at the time-varying parameters of the generalized gamma model.

```r
plot(est_gengamma)
```
Time-varying parameters based on the generalized gamma model.
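
As a quick sanity check on the model comparison, the AIC values can be reproduced by hand from the printed log-likelihoods and the number of estimated coefficients, using the standard definition AIC = 2k - 2 log L. The sketch below uses base R only; the vectors `loglik`, `n_par`, `aic` and `aic_delta` are illustrative names introduced here, and the numbers are copied from the model summaries rather than extracted from the fitted objects.

```r
# Log-likelihoods copied from the printed summaries of the four models
loglik <- c(est_exp = -5571.078, est_weibull = -5555.903,
            est_gamma = -5565.939, est_gengamma = -5521.126)

# Number of estimated coefficients (the df column reported by AIC())
n_par <- c(est_exp = 3, est_weibull = 4, est_gamma = 4, est_gengamma = 5)

# Standard definition: AIC = 2 * k - 2 * log-likelihood
aic <- 2 * n_par - 2 * loglik

# Distance of each model from the best (smallest) AIC
aic_delta <- aic - min(aic)

print(round(aic, 2))
print(round(aic_delta, 2))
print(names(which.min(aic)))
```

Rounded to two decimals, the recomputed values agree with the `AIC()` table, and the generalized gamma model again comes out with the smallest value.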
Time-varying parameters based on the generalized gamma model with trend.
Boxplot of bootstrapped coefficients based on the generalized gamma model with trend.
Simulated time series based on the generalized gamma model with trend.