This short tutorial gives an example of how one can statistically assess whether a market is in an equilibrium state. The tutorial assumes some familiarity with the concepts and the functionality of the package. The basic_usage vignette can be helpful in acquiring this familiarity.
Load the required libraries.
Prepare the data. Here, we simply simulate data using a data generating process for a market in equilibrium.
nobs <- 1000
tobs <- 5
alpha_d <- -3.9
beta_d0 <- 18.9
beta_d <- c(2.1, -0.7)
eta_d <- c(3.5, 6.25)
alpha_s <- 2.8
beta_s0 <- 3.2
beta_s <- c(2.65)
eta_s <- c(1.15, 4.2)
sigma_d <- 0.8
sigma_s <- 1.1
rho_ds <- 0.0
seed <- 42
eq_data <- simulate_data(
"equilibrium_model", nobs, tobs,
alpha_d, beta_d0, beta_d, eta_d,
alpha_s, beta_s0, beta_s, eta_s,
NA, NA, c(NA),
sigma_d = sigma_d, sigma_s = sigma_s, rho_ds = rho_ds,
seed = seed
)
Prepare the basic parameters for model initialization.
verbose <- 2
correlated_shocks <- TRUE
formula <- Q | P | id | date ~ P + Xd1 + Xd2 + X1 + X2 | P + Xs1 + X1 + X2
Set the estimation parameters.
Using the above parameterization, construct and estimate the model objects. Here we estimate two equilibrium models and four disequilibrium models. All the models are constructed using the simulated data from a model of market in equilibrium.
eqmdl_reg <- equilibrium_model(
formula, eq_data[eq_data$date != 1, ],
correlated_shocks = correlated_shocks, verbose = verbose,
estimation_options = list(method = "2SLS")
)
#> Info: This is 'Equilibrium with correlated shocks' model
#> Warning: Removing unobserved '1' level(s).
eqmdl_fit <- equilibrium_model(
formula, eq_data[eq_data$date != 1, ],
correlated_shocks = correlated_shocks, verbose = verbose,
estimation_options = list(
control = optimization_options, method = optimization_method
)
)
#> Info: This is 'Equilibrium with correlated shocks' model
#> Warning: Removing unobserved '1' level(s).
bsmdl_fit <- diseq_basic(
formula, eq_data[eq_data$date != 1, ],
correlated_shocks = correlated_shocks, verbose = verbose,
estimation_options = list(
control = optimization_options, method = optimization_method
)
)
#> Info: This is 'Basic with correlated shocks' model
#> Warning: Removing unobserved '1' level(s).
damdl_fit <- diseq_deterministic_adjustment(
formula, eq_data,
correlated_shocks = correlated_shocks, verbose = verbose,
estimation_options = list(
control = optimization_options, method = optimization_method
)
)
#> Info: This is 'Deterministic Adjustment with correlated shocks' model
#> Info: Dropping 1000 rows by generating 'LAGGED_P'.
#> Info: Sample separated with 1971 rows in excess supply and 2029 in excess demand state.
All the models provide estimates for the simulated data. Even with simulated data, it is difficult to assess which model performs better by examining only the summaries in separation or collectively.
summary(eqmdl_reg@fit[[1]]$first_stage_model)
#>
#> Call:
#> lm(formula = first_stage_formula, data = object@model_tibble)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.88388 -0.14215 0.00305 0.14088 0.81967
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 2.356218 0.036603 64.37 <2e-16 ***
#> Xd1 0.314532 0.006576 47.83 <2e-16 ***
#> Xd2 -0.102046 0.006549 -15.58 <2e-16 ***
#> X1 0.347118 0.006523 53.21 <2e-16 ***
#> X2 0.306744 0.006445 47.60 <2e-16 ***
#> Xs1 -0.401540 0.006568 -61.14 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.2087 on 3994 degrees of freedom
#> Multiple R-squared: 0.7409, Adjusted R-squared: 0.7406
#> F-statistic: 2284 on 5 and 3994 DF, p-value: < 2.2e-16
summary(eqmdl_reg)
#> Equilibrium Model for Markets in Equilibrium
#> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2
#> Market Clearing : Q = D_Q = S_Q
#> Shocks : Correlated
#> Nobs : 4000
#> Sample Separation : Not Separated
#> Quantity Var : Q
#> Price Var : P
#> Key Var(s) : id, date
#> Time Var : date
#>
#> systemfit results
#> method: 2SLS
#>
#> N DF SSR detRCov OLS-R2 McElroy-R2
#> system 8000 7989 7765.49 0.838051 0.895447 0.90887
#>
#> N DF SSR MSE RMSE R2 Adj R2
#> demand 4000 3994 2579.28 0.645789 0.80361 0.930546 0.930459
#> supply 4000 3995 5186.20 1.298174 1.13937 0.860348 0.860208
#>
#> The covariance matrix of the residuals
#> demand supply
#> demand 0.6457891 -0.0171846
#> supply -0.0171846 1.2981739
#>
#> The correlations of the residuals
#> demand supply
#> demand 1.0000000 -0.0187684
#> supply -0.0187684 1.0000000
#>
#>
#> 2SLS estimates for 'demand' (equation 1)
#> Model Formula: Q ~ P + Xd1 + Xd2 + X1 + X2
#> <environment: 0x56498eb4e920>
#> Instruments: ~Xd1 + Xd2 + X1 + X2 + Xs1
#> <environment: 0x56498eb4e920>
#>
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 18.7291651 0.1529666 122.4395 < 2.22e-16 ***
#> P -3.8534666 0.0629681 -61.1971 < 2.22e-16 ***
#> Xd1 2.1102794 0.0323478 65.2372 < 2.22e-16 ***
#> Xd2 -0.7156455 0.0261205 -27.3978 < 2.22e-16 ***
#> X1 3.5151555 0.0331078 106.1731 < 2.22e-16 ***
#> X2 6.2431905 0.0314974 198.2126 < 2.22e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.80361 on 3994 degrees of freedom
#> Number of observations: 4000 Degrees of Freedom: 3994
#> SSR: 2579.281574 MSE: 0.645789 Root MSE: 0.80361
#> Multiple R-Squared: 0.930546 Adjusted R-Squared: 0.930459
#>
#>
#> 2SLS estimates for 'supply' (equation 2)
#> Model Formula: Q ~ P + Xs1 + X1 + X2
#> <environment: 0x56498eb4e920>
#> Instruments: ~Xd1 + Xd2 + X1 + X2 + Xs1
#> <environment: 0x56498eb4e920>
#>
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 2.7596055 0.3499559 7.88558 3.9968e-15 ***
#> P 2.8849781 0.1084928 26.59143 < 2.22e-16 ***
#> Xs1 2.7054083 0.0569035 47.54382 < 2.22e-16 ***
#> X1 1.1758638 0.0515231 22.82207 < 2.22e-16 ***
#> X2 4.1766642 0.0489108 85.39346 < 2.22e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 1.139374 on 3995 degrees of freedom
#> Number of observations: 4000 Degrees of Freedom: 3995
#> SSR: 5186.204766 MSE: 1.298174 Root MSE: 1.139374
#> Multiple R-Squared: 0.860348 Adjusted R-Squared: 0.860208
summary(eqmdl_fit)
#> Equilibrium Model for Markets in Equilibrium
#> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2
#> Market Clearing : Q = D_Q = S_Q
#> Shocks : Correlated
#> Nobs : 4000
#> Sample Separation : Not Separated
#> Quantity Var : Q
#> Price Var : P
#> Key Var(s) : id, date
#> Time Var : date
#>
#> Maximum likelihood estimation
#> Method : BFGS
#> Max Iterations : 10000
#> Relative Tolerance : 1e-08
#> Convergence Status : success
#> Starting Values :
#> D_P D_CONST D_Xd1 D_Xd2 D_X1 D_X2 S_P
#> -2.6871 17.1339 1.7372 -0.5889 3.1155 5.8838 0.1426
#> S_CONST S_Xs1 S_X1 S_X2 D_VARIANCE S_VARIANCE RHO
#> 10.6921 1.5882 2.1173 5.0359 1.0000 1.0000 0.0000
#>
#> Coefficients
#> Estimate Std. Error z value Pr(z)
#> D_P -3.85348 0.06293 -61.2353 0.000e+00
#> D_CONST 18.73053 0.15336 122.1322 0.000e+00
#> D_Xd1 2.11014 0.03227 65.3909 0.000e+00
#> D_Xd2 -0.71602 0.02617 -27.3612 7.944e-165
#> D_X1 3.51516 0.03309 106.2424 0.000e+00
#> D_X2 6.24318 0.03147 198.3687 0.000e+00
#> S_P 2.88624 0.10846 26.6109 5.080e-156
#> S_CONST 2.75586 0.34985 7.8774 3.344e-15
#> S_Xs1 2.70593 0.05688 47.5696 0.000e+00
#> S_X1 1.17544 0.05150 22.8224 2.746e-115
#> S_X2 4.17628 0.04889 85.4181 0.000e+00
#> D_VARIANCE 0.64482 0.01899 33.9485 1.283e-252
#> S_VARIANCE 1.29703 0.05129 25.2886 4.267e-141
#> RHO -0.01891 0.02353 -0.8034 4.217e-01
#>
#> -2 log L: 6722.67
summary(bsmdl_fit)
#> Basic Model for Markets in Disequilibrium
#> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2
#> Short Side Rule : Q = min(D_Q, S_Q)
#> Shocks : Correlated
#> Nobs : 4000
#> Sample Separation : Not Separated
#> Quantity Var : Q
#> Price Var : P
#> Key Var(s) : id, date
#> Time Var : date
#>
#> Maximum likelihood estimation
#> Method : BFGS
#> Max Iterations : 10000
#> Relative Tolerance : 1e-08
#> Convergence Status : success
#> Starting Values :
#> D_P D_CONST D_Xd1 D_Xd2 D_X1 D_X2 S_P
#> -2.6871 17.1339 1.7372 -0.5889 3.1155 5.8838 0.1426
#> S_CONST S_Xs1 S_X1 S_X2 D_VARIANCE S_VARIANCE RHO
#> 10.6921 1.5882 2.1173 5.0359 1.0000 1.0000 0.0000
#>
#> Coefficients
#> Estimate Std. Error z value Pr(z)
#> D_P -2.9119 0.07238 -40.233 0.000e+00
#> D_CONST 17.4679 0.21722 80.415 0.000e+00
#> D_Xd1 2.0264 0.04817 42.068 0.000e+00
#> D_Xd2 -0.7116 0.03279 -21.703 1.925e-104
#> D_X1 3.1680 0.04761 66.547 0.000e+00
#> D_X2 5.9840 0.04698 127.382 0.000e+00
#> S_P 0.9242 0.15942 5.797 6.755e-09
#> S_CONST 8.3532 0.56314 14.833 8.923e-50
#> S_Xs1 2.4675 0.11583 21.302 1.077e-100
#> S_X1 1.8785 0.09651 19.464 2.227e-84
#> S_X2 4.7120 0.10011 47.071 0.000e+00
#> D_VARIANCE 0.5960 0.02391 24.927 3.756e-137
#> S_VARIANCE 0.9703 0.07354 13.194 9.468e-40
#> RHO -0.3355 0.05259 -6.379 1.788e-10
#>
#> -2 log L: 8182.64
summary(damdl_fit)
#> Deterministic Adjustment Model for Markets in Disequilibrium
#> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2
#> Short Side Rule : Q = min(D_Q, S_Q)
#> Separation Rule : P_DIFF analogous to (D_Q - S_Q)
#> Shocks : Correlated
#> Nobs : 4000
#> Sample Separation : Demand Obs = 1971, Supply Obs = 2029
#> Quantity Var : Q
#> Price Var : P
#> Key Var(s) : id, date
#> Time Var : date
#>
#> Maximum likelihood estimation
#> Method : BFGS
#> Max Iterations : 10000
#> Relative Tolerance : 1e-08
#> Convergence Status : success
#> Starting Values :
#> D_P D_CONST D_Xd1 D_Xd2 D_X1 D_X2 S_P
#> -2.6871 17.1339 1.7372 -0.5889 3.1155 5.8838 0.1426
#> S_CONST S_Xs1 S_X1 S_X2 P_DIFF D_VARIANCE S_VARIANCE
#> 10.6921 1.5882 2.1173 5.0359 1.0000 1.0000 1.0000
#> RHO
#> 0.0000
#>
#> Coefficients
#> Estimate Std. Error z value Pr(z)
#> D_P -3.848967 0.06636 -58.0023 0.000e+00
#> D_CONST 18.717272 0.16531 113.2287 0.000e+00
#> D_Xd1 2.110049 0.03227 65.3876 0.000e+00
#> D_Xd2 -0.716016 0.02617 -27.3624 7.681e-165
#> D_X1 3.515018 0.03309 106.2122 0.000e+00
#> D_X2 6.243126 0.03147 198.3693 0.000e+00
#> S_P 2.882241 0.11006 26.1867 3.763e-151
#> S_CONST 2.772435 0.35836 7.7364 1.022e-14
#> S_Xs1 2.705982 0.05689 47.5691 0.000e+00
#> S_X1 1.175343 0.05151 22.8185 3.001e-115
#> S_X2 4.176110 0.04890 85.4028 0.000e+00
#> P_DIFF 0.008358 0.03902 0.2142 8.304e-01
#> D_VARIANCE 0.644759 0.01900 33.9428 1.556e-252
#> S_VARIANCE 1.297088 0.05129 25.2875 4.389e-141
#> RHO -0.018885 0.02353 -0.8025 4.223e-01
#>
#> -2 log L: 6722.63
The deterministic adjustment model has price dynamics that are analogous to excess demand and estimates one extra parameter. The directional model estimates one parameter less as the model does not have enough equations to identify prices in both demand and supply equations. The estimated parameters are summarized as follows.
sim_coef <- c(
alpha_d, beta_d0, beta_d, eta_d,
alpha_s, beta_s0, beta_s, eta_s,
NA,
sigma_d, sigma_s,
rho_ds
)
names(sim_coef) <- names(coef(damdl_fit))
dm_inc <- coef(eqmdl_reg)[
grep("demand", names(coef(eqmdl_reg)))
]
sp_inc <- coef(eqmdl_reg)[
grep("supply", names(coef(eqmdl_reg)))
]
lm_coef <- c(
dm_inc[2], dm_inc[-2], sp_inc[2], sp_inc[-2],
NA,
NA, NA,
NA
)
eqmdl_coef <- append(
coef(eqmdl_fit), c(NA),
after = which(names(coef(eqmdl_fit)) ==
prefixed_variance_variable(eqmdl_fit@system@demand)) - 1
)
bsmdl_coef <- append(
coef(bsmdl_fit), c(NA),
after = which(names(coef(bsmdl_fit)) ==
prefixed_variance_variable(bsmdl_fit@system@demand)) - 1
)
damdl_coef <- coef(damdl_fit)
comp <- tibble::tibble(
parameter = names(sim_coef),
sim = sim_coef, lm = lm_coef, fi = eqmdl_coef,
bm = bsmdl_coef, da = damdl_coef,
lmerr = abs(lm_coef - sim_coef), fierr = abs(eqmdl_coef - sim_coef),
bmerr = abs(bsmdl_coef - sim_coef), daerr = abs(damdl_coef - sim_coef)
)
comp
#> # A tibble: 15 × 10
#> parameter sim lm fi bm da lmerr fierr bmerr
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 D_P -3.9 -3.85 -3.85 -2.91 -3.85 0.0465 0.0465 0.988
#> 2 D_CONST 18.9 18.7 18.7 17.5 18.7 0.171 0.169 1.43
#> 3 D_Xd1 2.1 2.11 2.11 2.03 2.11 0.0103 0.0101 0.0736
#> 4 D_Xd2 -0.7 -0.716 -0.716 -0.712 -0.716 0.0156 0.0160 0.0116
#> 5 D_X1 3.5 3.52 3.52 3.17 3.52 0.0152 0.0152 0.332
#> 6 D_X2 6.25 6.24 6.24 5.98 6.24 0.00681 0.00682 0.266
#> 7 S_P 2.8 2.88 2.89 0.924 2.88 0.0850 0.0862 1.88
#> 8 S_CONST 3.2 2.76 2.76 8.35 2.77 0.440 0.444 5.15
#> 9 S_Xs1 2.65 2.71 2.71 2.47 2.71 0.0554 0.0559 0.182
#> 10 S_X1 1.15 1.18 1.18 1.88 1.18 0.0259 0.0254 0.729
#> 11 S_X2 4.2 4.18 4.18 4.71 4.18 0.0233 0.0237 0.512
#> 12 P_DIFF NA NA NA NA 0.00836 NA NA NA
#> 13 D_VARIANCE 0.8 NA 0.645 0.596 0.645 NA 0.155 0.204
#> 14 S_VARIANCE 1.1 NA 1.30 0.970 1.30 NA 0.197 0.130
#> 15 RHO 0 NA -0.0189 -0.335 -0.0189 NA 0.0189 0.335
#> # … with 1 more variable: daerr <dbl>
Since we have used simulated data, we can calculate the average absolute error of the parameter estimation for each of the models. In practice, the population values are unknown and this calculation is impossible.
comp_means <- colMeans(comp[, grep("err", colnames(comp))], na.rm = TRUE)
comp_means
#> lmerr fierr bmerr daerr
#> 0.08138534 0.09076426 0.87318787 0.09056802
Moreover, the average absolute error cannot provide an overall estimation assessment as the market models have different parameter spaces. To assess the overall model performance one can instead use an information criterion.
model_names <- c(
eqmdl_fit@model_type_string,
bsmdl_fit@model_type_string, damdl_fit@model_type_string
)
model_obs <- c(nobs(eqmdl_fit), nobs(bsmdl_fit), nobs(damdl_fit))
model_errors <- c(
comp_means["fierr"],
comp_means["bmerr"],
comp_means["daerr"]
)
seltbl <- AIC(eqmdl_fit, bsmdl_fit, damdl_fit) %>%
tibble::add_column(Model = model_names, .before = 1) %>%
tibble::add_column(Obs. = model_obs, `Mean Error` = model_errors) %>%
dplyr::rename(D.F. = df) %>%
dplyr::arrange(AIC)
seltbl
#> Model D.F. AIC Obs. Mean Error
#> eqmdl_fit Equilibrium 14 6750.671 4000 0.09076426
#> damdl_fit Deterministic Adjustment 15 6752.625 4000 0.09056802
#> bsmdl_fit Basic 14 8210.640 4000 0.87318787