Introduction

This vignette presents the R package RationalExp associated with @DGM (DGM hereafter). This package implements a test of the rational expectations hypothesis based on the marginal distributions of realizations and subjective beliefs. This test can be used when realizations and subjective beliefs are observed in two different datasets that cannot be matched, or when they are observed in the same dataset. The test can be implemented with covariates and survey weights. The package also computes the estimator of the minimal deviations from rational expectations that can be rationalized by the data.

How to get started

The R package RationalExp can be downloaded from GitHub at https://github.com/cgaillac/RationalExp. To install the RationalExp package from GitHub, the devtools library is required. We then use the commands:

# library("devtools")
# install_github('cgaillac/RationalExp')

The most current version of the package (development version) is maintained only on GitHub. Provided that one has a working internet connection and write permission on the appropriate system directories, the installation of the package should proceed automatically. If the direct download and installation do not work, one can install the RationalExp package by downloading the source package and then typing:

#install.packages("personal_directory/RationalExp_1.1.tar.gz", repos=NULL)

Once the RationalExp package is installed, it can be loaded to the current R session by the command library(RationalExp).

Online help is available in two ways: either help(package="RationalExp") or ?test. The first gives an overview of the available commands in the package. The second gives detailed information about a specific command. A valuable feature of R help files is that the examples used to illustrate commands are executable, so they can be pasted into an R session or run as a group with a command like example(test).

library(RationalExp)

Theory

Testing rational expectations

We first explain the test procedure proposed in DGM for the test with covariates: \begin{align} \text{H}_{0X}: & \text{ there exists a pair of random variables } (Y',\psi') \text{ and a sigma-algebra } \mathcal{I}' \\ & \text{ such that } \sigma(\psi', X) \subset \mathcal{I}', \ Y'|X\sim Y|X , \, \psi'|X \sim \psi|X \text{ and } \mathbb{E}\left[ Y'|\mathcal{I}'\right]=\psi'. \end{align}

To simplify notation, we use, as in DGM, a potential outcome framework to describe our data combination problem. Specifically, instead of observing \((Y,\psi)\), we suppose that we observe only the covariates \(X\), \(\widetilde{Y} = DY + (1-D)\psi\) and \(D\), where \(D=1\) (resp. \(D=0\)) if the unit belongs to the dataset of \(Y\) (resp. \(\psi\)). We assume that the two samples are drawn from the same population, which amounts to supposing that \(D\perp\!\!\!\perp (X, Y,\psi)\). To build their test, DGM use the following characterization of \(\text{H}_{0X}\): \[ \mathbb{E}\left[ \left(y-Y\right)^+ - \left(y-\psi\right)^+\middle| X \right] \geq 0 \quad \forall y\in \mathbb{R} \text{ and} \quad \mathbb{E}\left[ Y - \psi\middle| X\right] =0,\] where \(u^+ =\max(0,u)\). Equivalently, but written with \(\widetilde{Y}\) only, \[\mathbb{E}\left[ W \left(y-\widetilde{Y}\right)^+\middle|X\right]\geq 0 \quad \forall y\in \mathbb{R} \text{ and} \quad \mathbb{E}\left[W\widetilde{Y}\middle|X\right]= 0,\] where \(W=D/\mathbb{E}(D)-(1-D)/\mathbb{E}(1-D)\). This formulation of the null hypothesis allows one to apply the instrumental functions approach of @andrews2016inference (AS hereafter), who consider the issue of testing many conditional moment inequalities and equalities. The initial step is to transform the conditional moments into the following unconditional moment conditions: \[ \mathbb{E}\left[W\left(y-\widetilde{Y}\right)^+ h(X)\right] \geq 0, \quad \mathbb{E}\left[\left( Y - \psi\right)h(X)\right] =0 \] for all \(y\in \mathbb{R}\) and \(h\geq 0\) belonging to a suitable class of functions.
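As a rough illustration of this characterization in the case without covariates (\(h=1\)), the sketch below simulates two unmatched samples for which rational expectations hold by construction, and evaluates the sample analogues of \(\mathbb{E}[W(y-\widetilde{Y})^+]\) and \(\mathbb{E}[W\widetilde{Y}]\). The simulated design and all object names are assumptions made for this example; they are not taken from the package.

```r
set.seed(1)
n_y <- 5000; n_psi <- 5000

# Hypothetical design: beliefs are N(0,1); realizations are a belief draw
# plus independent mean-zero noise, so rational expectations hold.
psi <- rnorm(n_psi)
Y   <- rnorm(n_y) + rnorm(n_y)

# Stack the two unmatched samples as in the potential outcome notation
Y_tilde <- c(Y, psi)
D       <- c(rep(1, n_y), rep(0, n_psi))

# W = D / E(D) - (1 - D) / E(1 - D), with expectations replaced by sample means
W <- D / mean(D) - (1 - D) / mean(1 - D)

pos <- function(u) pmax(u, 0)              # u^+ = max(0, u)
y_grid <- seq(-3, 3, by = 0.5)

# Sample analogue of E[W (y - Y_tilde)^+] for each y (should be >= 0 up to noise)
ineq <- sapply(y_grid, function(y) mean(W * pos(y - Y_tilde)))
# Sample analogue of E[W Y_tilde] (should be close to 0)
eq <- mean(W * Y_tilde)

round(cbind(y = y_grid, inequality_moment = ineq), 3)
eq
```

In finite samples the inequality moments can be slightly negative at some values of \(y\) purely because of sampling noise, which is why a formal test with a bootstrap critical value is needed.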

\medskip We suppose that we observe a sample \((D_i, X_i, \widetilde{Y}_i)_{i=1,\dots,n}\) of \(n\) i.i.d. copies of \((D,X,\widetilde{Y})\). For notational convenience, we let \(\widetilde{X}_i\) denote the nontransformed vector of covariates and redefine \(X_i\) as the transformed vector in the following way: \[X_i=\Phi_0\left( \widehat{\Sigma}_{ \widetilde{X},n}^{-1/2}\left( \widetilde{X}_{i} - \overline{ \widetilde{X}}_{n}\right)\right),\]
where, for any \(x=(x_1,\dots,x_{d_X})\), we let \(\Phi_0(x)=\left( \Phi(x_1) , \dots, \Phi\left(x_{d_X} \right)\right)^{\top}\). Here \(\Phi\) denotes the standard normal cdf, \(\widehat{\Sigma}_{\widetilde{X},n}\) is the sample covariance matrix of \(\left(\widetilde{X}_i\right)_{i=1,\dots,n}\) and \(\overline{\widetilde{X}}_n\) its sample mean.
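The transformation of the covariates can be sketched in a few lines of base R. The covariates below are simulated for illustration, and the inverse square root of the covariance matrix is taken to be the symmetric one obtained from an eigendecomposition; this choice is an assumption of the sketch, and the package may compute it differently.

```r
set.seed(2)
n <- 500
# Hypothetical raw covariates (two columns with different scales)
X_raw <- cbind(rnorm(n, mean = 5, sd = 2), rexp(n))

# Sample covariance matrix and its symmetric inverse square root
Sigma_hat <- cov(X_raw)
eig <- eigen(Sigma_hat, symmetric = TRUE)
Sigma_inv_sqrt <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

# Center each column, then standardize: row i is Sigma^{-1/2} (x_i - mean)
X_centered <- sweep(X_raw, 2, colMeans(X_raw))
X_std <- X_centered %*% Sigma_inv_sqrt

# Apply the standard normal cdf componentwise: values now lie in [0, 1]
X <- pnorm(X_std)

range(X)
round(cov(X_std), 6)   # identity matrix, by construction
```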

\medskip Now that \(X_i \in [0,1]^{d_X}\), we consider \(h\) functions that are indicators of belonging to specific hypercubes within \([0,1]^{d_X}\). Namely, we consider the class of functions \(\mathcal{H}_r= \left\{h_{a,r}, \; a\in A_r\right\}\), with \(A_r= \left\{1,2,\dots, 2r\right\}^{d_X}\) (\(r\geq 1\)), \(h_{a,r}(x) = 1\left\{x \in C_{a,r}\right\}\) and, for any \(a=(a_1,\dots,a_{d_X})^{\top}\in A_r\), \[C_{a,r} = \prod_{u=1}^{d_X} \left(\frac{a_u -1 }{2r}, \frac{a_u }{2r}\right].\]
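To make the hypercubes concrete, here is a minimal sketch of the indicator functions \(h_{a,r}\). The helper names cell_index and h_ar are hypothetical and do not come from the package. The sketch relies on the fact that, for \(x_u\in(0,1]\), the unique \(a_u\) with \((a_u-1)/(2r) < x_u \leq a_u/(2r)\) is \(\lceil 2r x_u\rceil\).

```r
# Index a = (a_1, ..., a_dX) of the hypercube C_{a,r} containing each row of X
cell_index <- function(X, r) {
  ceiling(2 * r * X)
}

# Indicator h_{a,r}(x) = 1{x in C_{a,r}}, evaluated at each row of X
h_ar <- function(X, a, r) {
  idx <- cell_index(X, r)
  as.numeric(apply(idx, 1, function(row) all(row == a)))
}

# Four points in (0,1]^2 and r = 2, i.e. a 4 x 4 partition of the unit square
X <- matrix(c(0.10, 0.90,
              0.50, 0.50,
              0.51, 0.26,
              1.00, 0.01), ncol = 2, byrow = TRUE)
r <- 2

cell_index(X, r)             # one row of cell indices per observation
h_ar(X, a = c(2, 2), r = r)  # only the second point lies in C_{(2,2),2}
```

Because the cells are open on the left and closed on the right, a point with a coordinate equal to \(1/2\) belongs to the cell whose upper bound is \(1/2\), as the second point shows.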

\medskip To define the test statistic \(T\), we need to introduce additional notation. First, we define, for any given \(y\), \begin{eqnarray}\label{eq:mn} m\left(D_i,\widetilde{Y}_i, X_i, h,y\right) =\left( \begin{array}{c} m_{1}\left(D_i,\widetilde{Y}_i, X_i, h,y\right) \\ m_{2}\left(D_i,\widetilde{Y}_i, X_i, h,y\right) \end{array} \right) = \left( \begin{array}{c} w_i\left(y - \widetilde{Y}_i\right)^+h\left(X_i\right) \\ w_i\widetilde{Y}_i h\left(X_i\right) \end{array} \right), \end{eqnarray} where \(w_i=nD_i/\sum_{j=1}^n D_j - n(1-D_i)/\sum_{j=1}^n( 1-D_j)\). Let \(\overline{m}_n (h,y)= \sum_{i=1}^{n} m\left(D_i,\widetilde{Y}_i, X_i, h,y\right)/n\) and define similarly \(\overline{m}_{n,j}\) for \(j=1,2\). For any function \(h\) and any \(y\in\mathbb{R}\), let us also define, for some \(\epsilon>0\), \[ \overline{\Sigma}_n(h,y) = \widehat{\Sigma}_n(h,y) + \epsilon \mathrm{Diag}\left( \widehat{\mathbb{V}}\left(\widetilde{Y} \right) , \widehat{\mathbb{V}}\left(\widetilde{Y} \right) \right),\] where \(\widehat{\Sigma}_n(h,y)\) is the sample covariance matrix of \(\sqrt{n}\overline{m}_n\left( h,y\right)\) and \(\widehat{\mathbb{V}}\left(\widetilde{Y}\right)\) is the empirical variance of \(\widetilde{Y}\). We then denote by \(\overline{\Sigma}_{n,jj}(h,y)\) \((j=1,2)\) the \(j\)-th diagonal term of \(\overline{\Sigma}_{n}(h,y)\).

\medskip Then the (Cram\'{e}r-von-Mises) test statistic \(T\) is defined by: \[ T= \sup_{y \in \widehat{\mathcal{Y}}} \sum_{r=1}^{r_n}\frac{(2r)^{-d_X}}{\left(r^2 + 100\right) } \sum_{a\in A_r} \left( \left(1-p\right)\left( - \frac{\sqrt{n} \overline{m}_{n,1}\left(h_{a,r},y\right)}{ \overline{\Sigma}_{n,11}(h_{a,r},y)^{1/2} } \right)^{+2} + p \left( \frac{\sqrt{n} \overline{m}_{n,2}\left(h_{a,r},y\right)}{\overline{\Sigma}_{n,22}(h_{a,r},y)^{1/2}} \right)^2\right),\] where \(\widehat{\mathcal{Y}} =\left[\min_{i=1,\dots,n}\widetilde{Y}_i, \max_{i=1,\dots,n}\widetilde{Y}_i \right]\), \(p\) is a parameter that weights the moment inequalities versus the equalities and \((r_n)_{n\in\mathbb{N}}\) is a deterministic sequence tending to infinity.

\medskip To test for rational expectations in the absence of covariates, we simply restrict ourselves to the constant function \(h(X)=1\), and the test statistic is simply \[ T= \sup_{y \in \widehat{\mathcal{Y}}} \left(1-p\right)\left( - \frac{\sqrt{n} \overline{m}_{n,1}(y)}{\overline{\Sigma}_{n,11}(y)^{1/2} } \right)^{+2} + p \left( \frac{\sqrt{n} \overline{m}_{n,2}(y)}{\overline{\Sigma}_{n,22}(y)^{1/2}} \right)^2,\] where \(\overline{m}_{n,j}(y)\) and \(\overline{\Sigma}_{n,jj}(y)\) \((j=1,2)\) are defined as above but with \(h(x)=1\).
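The statistic without covariates can be sketched directly from these formulas. The function stat_no_cov below is hypothetical and is not the package's implementation: it approximates \(\widehat{\Sigma}_{n,jj}(y)\) by the plain sample variance of the moment contributions (treating the weights \(w_i\) as fixed), and it replaces the supremum over \(\widehat{\mathcal{Y}}\) by a maximum over a grid of quantiles. The two simulated designs are likewise assumptions made for the example.

```r
# Sketch of T without covariates (h = 1), under the simplifications stated above
stat_no_cov <- function(Y_tilde, D, y_grid, p = 0.05, eps = 0.05) {
  n <- length(Y_tilde)
  w <- n * D / sum(D) - n * (1 - D) / sum(1 - D)   # weights w_i
  v_Y <- var(Y_tilde)                              # empirical variance of Y_tilde

  # Equality moment: does not depend on y
  m2 <- w * Y_tilde
  eq_term <- (sqrt(n) * mean(m2) / sqrt(var(m2) + eps * v_Y))^2

  # Inequality moment: only negative values of the studentized moment count
  ineq_term <- sapply(y_grid, function(y) {
    m1 <- w * pmax(y - Y_tilde, 0)
    pmax(-sqrt(n) * mean(m1) / sqrt(var(m1) + eps * v_Y), 0)^2
  })

  max((1 - p) * ineq_term + p * eq_term)
}

set.seed(4)
n_each <- 2000
D   <- c(rep(1, n_each), rep(0, n_each))
psi <- rnorm(n_each)

Y_re   <- rnorm(n_each) + rnorm(n_each)  # more dispersed than psi: compatible with RE
Y_viol <- 0.3 * rnorm(n_each)            # less dispersed than psi: violates the inequalities

grid_of <- function(v) quantile(v, seq(0, 1, length.out = 30))
T_re   <- stat_no_cov(c(Y_re, psi),   D, grid_of(c(Y_re, psi)))
T_viol <- stat_no_cov(c(Y_viol, psi), D, grid_of(c(Y_viol, psi)))
c(T_re = T_re, T_viol = T_viol)
```

The regularization term \(\epsilon\widehat{\mathbb{V}}(\widetilde{Y})\) keeps the denominator positive even at the smallest grid point, where all inequality contributions are zero.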

\medskip Whether or not covariates are included, the resulting test is of the form \(\varphi_{n,\alpha} = 1\left\{T> c^*_{n,\alpha}\right\}\) where the estimated critical value \(c^*_{n,\alpha}\) is obtained by bootstrap using, as in AS, the Generalized Moment Selection method. Specifically, we follow these three steps:
\begin{enumerate}
\item Compute the function \(\overline{\varphi}_n\left(y,h\right) = \left(\overline{\varphi}_{n,1}\left(y,h\right), 0 \right)^{\top}\) for \((y,h)\) in \(\widehat{\mathcal{Y}}\times\cup_{r=1}^{r_n} \mathcal{H}_{r}\), with \[ \overline{\varphi}_{n,1}\left(y,h\right) = \overline{\Sigma}_{n,11}^{1/2} B_n 1\left\{ \frac{n^{1/2}}{\kappa_n} \overline{\Sigma}_{n,11}^{-1/2}\overline{m}_{n,1}(y,h) >1 \right\} , \] and where \(B_n = \left(b_0\ln(n)/\ln(\ln(n))\right)^{1/2}\), \(b_0 >0\), \(\kappa_n =(\kappa\ln(n))^{1/2}\), and \(\kappa>0\). To compute \(\overline{\Sigma}_{n,11}\), we fix \(\epsilon\) to \(0.05\), as in AS.
\item Let \(\left(D_i^*,\widetilde{Y}_i^*, X_i^*\right)_{i=1,\dots,n}\) denote a bootstrap sample, i.e., an i.i.d. sample from the empirical cdf of \(\left(D,\widetilde{Y}, X\right)\), and compute from this sample \(\overline{m}_n^*\) and \(\overline{\Sigma}_{n}^*\). Then compute \(T^*\) like \(T\), replacing \(\overline{\Sigma}_{n}\left(y,h_{a,r}\right)\) and \(\sqrt{n}\overline{m}_n\left(y,h_{a,r}\right)\) by \(\overline{\Sigma}_{n}^*\left(y,h_{a,r}\right)\) and \[\sqrt{n}\left( \overline{m}_n^* - \overline{m}_n \right)\left(y,h_{a,r}\right) + \overline{\varphi}_n\left(y,h_{a,r}\right).\]
\item The threshold \(c^*_{n,\alpha}\) is the (conditional) quantile of order \(1-\alpha + \eta\) of \(T^* +\eta\) for some \(\eta >0\). Following AS, we set \(\eta\) to \(10^{-6}\).
\end{enumerate}
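Steps 1 and 3 above can be sketched as two small helpers. Both function names are hypothetical, and the values used for \(b_0\) and \(\kappa\) are placeholders chosen for this illustration only; they are not claimed to be the package's defaults.

```r
# Step 1: GMS function phi_{n,1}. It equals Sigma_11^{1/2} * B_n when the
# studentized inequality moment is clearly positive (slack), and 0 otherwise.
gms_shift <- function(m_bar_1, Sigma_bar_11, n, b0 = 0.4, kappa = 0.3) {
  B_n     <- sqrt(b0 * log(n) / log(log(n)))
  kappa_n <- sqrt(kappa * log(n))
  slack   <- sqrt(n) * m_bar_1 / (kappa_n * sqrt(Sigma_bar_11))
  sqrt(Sigma_bar_11) * B_n * (slack > 1)
}

# Step 3: critical value as the quantile of order 1 - alpha + eta of T* + eta
crit_value <- function(T_star, alpha, eta = 1e-6) {
  quantile(T_star, probs = 1 - alpha + eta, names = FALSE) + eta
}

# A moment close to binding (0.001) is kept as is; a clearly slack one (0.5)
# is shifted upward, so it contributes little to the bootstrap statistic.
shift <- gms_shift(m_bar_1 = c(0.001, 0.5), Sigma_bar_11 = 1, n = 2400)
shift

# Critical value at the 5% level from a hypothetical vector of bootstrap statistics
T_star <- seq(0, 1, length.out = 101)
cv <- crit_value(T_star, alpha = 0.05)
cv
```

The design choice behind the shift is that inequalities far from binding should not inflate the critical value, which would make the test conservative.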

Minimal Deviations from Rational Expectations

In DGM, we also introduce the minimal deviations from rational expectations, as the unique function \(g^*\) satisfying: \[\mathbb{E}[\rho(|\psi-g^*(\psi)|)] = \inf_{(Y',\psi',\psi'')\in \mathbf{\Psi}}\mathbb{E}[\rho(|\psi'-\psi''|)].\] We refer the user to DGM for the motivation behind the computation of \(g^*\) and the formal existence and uniqueness result. Though \(g^*\) does not have a simple form in general, DGM propose a simple procedure to construct a consistent estimator of it, based on i.i.d. copies \((Y_i)_{i=1,\dots,L}\) and \((\psi_i)_{i=1,\dots,L}\) of \(Y\) and \(\psi\). For simplicity, we suppose hereafter that the two samples have equal size. If the samples do not have equal size, one can apply our analysis after taking a random subsample of the larger one, with the same size as the smaller one, and then average the estimates over a large number of such random subsamples.

\medskip To define DGM's estimator, note first that we have \begin{equation}\label{eq:pbref} g^*= \arg\min_{g\in \mathcal{G}_0} \mathbb{E}\left[\left( \psi - g(\psi)\right)^2 \right], \end{equation} where the set \(\mathcal{G}_0\) is defined by \[\mathcal{G}_0=\left\{g\ \text{non-decreasing}: \mathbb{E}\left[(y-Y)^+-(y-g(\psi))^+\right]\geq 0 \; \forall y\in\mathbb{R}, \; \mathbb{E}[g(\psi)] = \mathbb{E}[Y]\right\}.\] In other words, \(g^*\) is the (increasing) function such that (i) \(g^*(\psi)\) is closest to \(\psi\) for the \(L^2\) norm; (ii) \(g^*\) belongs to \(\mathcal{G}_0\), which means that we can rationalize \(\mathbb{E}(Y|g^*(\psi))=g^*(\psi)\).

\medskip To estimate \(g^*\), we replace expectations and cdfs by their empirical counterparts. Letting \(\psi_{(1)}<\dots< \psi_{(L)}\) denote the order statistics of the \(\psi_i\) (and defining \(Y_{(1)},\dots,Y_{(L)}\) similarly), our estimator of \(\left(g^*(\psi_{(1)}),\dots,g^*(\psi_{(L)})\right)\) is the solution of: \begin{align} \left(\widehat{g}^*(\psi_{(1)}),\dots,\widehat{g}^*(\psi_{(L)})\right) = \arg\min_{ \widetilde{\psi}_{(1)} < \dots < \widetilde{\psi}_{(L)} } \sum_{i=1}^L\left( \psi_{(i)} - \widetilde{\psi}_{(i)}\right)^2 \\ \text{s.t.} & \sum_{i=j}^L Y_{(i)} - \widetilde{\psi}_{(i)} \geq 0, \; j=2,\dots,L, \nonumber \\ & \sum_{i=1}^L Y_{(i)} - \widetilde{\psi}_{(i)} = 0. \label{eq:prgmquad} \end{align} Then, for any \(t\in \mathbb{R}\), we let \[\widehat{g}^*(t)=\widehat{g}^*\left(\min\left\{(\psi_i)_{i=1,\dots,L}:\psi_i\geq \min\{t,\psi_{(L)}\}\right\}\right).\] We solve the above convex quadratic programming problem using the algorithm proposed in @suehiro2012online. We refer to DGM for more details.
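The sketch below illustrates two pieces of this construction without solving the quadratic program itself: a check that a candidate vector \(\widetilde{\psi}\) satisfies the constraints, and the extension of fitted values at the order statistics to any \(t\in\mathbb{R}\). The helper names is_feasible and extend_g, and the small numerical inputs, are hypothetical and introduced only for illustration.

```r
# Constraint check for a candidate (psi_tilde_(1), ..., psi_tilde_(L)),
# given the sorted realizations Y_(1) <= ... <= Y_(L)
is_feasible <- function(Y_sorted, psi_tilde, tol = 1e-10) {
  d <- Y_sorted - psi_tilde
  tail_sums <- rev(cumsum(rev(d)))   # tail_sums[j] = sum_{i = j}^{L} d_i
  all(diff(psi_tilde) > 0) &&        # strictly increasing candidate
    all(tail_sums[-1] >= -tol) &&    # inequality constraints, j = 2, ..., L
    abs(tail_sums[1]) <= tol         # equality constraint (j = 1)
}

# Extension: g(t) = g(smallest psi_(k) such that psi_(k) >= min(t, psi_(L)))
extend_g <- function(psi_sorted, g_values) {
  function(t) {
    t_cap <- pmin(t, max(psi_sorted))
    # findInterval with left.open = TRUE counts the psi_(k) strictly below t_cap
    k <- findInterval(t_cap, psi_sorted, left.open = TRUE) + 1
    g_values[k]
  }
}

Y_sorted <- c(-2, 0, 2)
is_feasible(Y_sorted, psi_tilde = c(-1, 0, 1))  # less dispersed than Y: feasible
is_feasible(Y_sorted, psi_tilde = c(-3, 0, 3))  # more dispersed than Y: not feasible

g_hat <- extend_g(psi_sorted = c(-1, 0, 2), g_values = c(-0.5, 0.1, 1.5))
g_hat(c(-3, -1, -0.5, 0, 1, 2, 5))
```

The extension is a right-continuous-from-below step function that is constant beyond the largest order statistic, exactly as in the displayed definition.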

The functions in the RationalExp package

The test function

Clean memory and load the package via

rm(list=ls())
### load packages
library(snowfall)
#> Loading required package: snow
library(RationalExp)
set.seed(1829384)

This function implements the rational expectations (RE) tests proposed in DGM. The code of the test function is based on the Stata code cmi_test from @andrews2017commands. The syntax of the function test is as follows: test(Y_tilde,D,X,weights,generalized,nbCores,tuningParam).

\begin{tabular}{lp{370pt}}
Y\_tilde & a vector of size \(n_Y+n_\psi\) stacking first the \((Y_i)_{i=1,\dots,n_Y}\), then the \((\psi_i)_{i=1,\dots,n_\psi}\). \\
D & a vector stacking the \((D_i)_{i=1,\dots,n}\): \(n_Y\) ones, then \(n_\psi\) zeros. \\
X & the matrix of covariates. Equal to a vector of ones by default (in which case the test without covariates is performed). \\
weights & the vector of survey weights. Uniform by default. \\
generalized & whether a generalized test should be performed or not: “Add” for additive shocks, “Mult” for multiplicative shocks. Set by default to “No” (no generalized test). \\
nbCores & the number of cores used by the program. To reduce the computational time, this function can use several cores, in which case the library snowfall should be loaded first. By default, nbCores is set to 1. \\
tuningParam & a list (dictionary) including the parameters p, epsilon, B, c, kappa, and y\_grid. The first five correspond to the parameters \(p\), \(\epsilon\), \(B\), \(c\), and \(\kappa\) above (with default values equal to 0.05, 0.05, 500, 0.3 and 0.001 respectively). Following AS, the interval \(\widehat{\mathcal{Y}}=\left[\min_{i=1,\dots,n} \widetilde{Y}_i,\max_{i=1,\dots,n} \widetilde{Y}_i \right]\) is approximated by a grid denoted by y\_grid. By default, y\_grid is equal to the empirical quantiles of \(\widetilde{Y}\) of order \(0\), \(1/29\), \(2/29\), \dots, and \(1\).
\end{tabular}
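As a small illustration of the default grid described in the table, the following sketch computes the empirical quantiles of order \(0, 1/29, \dots, 1\) with base R. The stacked vector Y_tilde is simulated here purely for the example.

```r
set.seed(3)
# Hypothetical stacked vector: 100 realizations followed by 100 beliefs
Y_tilde <- c(rnorm(100), rnorm(100, sd = 2))

# Quantile orders 0, 1/29, 2/29, ..., 1 (30 points in total)
orders <- seq(0, 1, length.out = 30)
y_grid <- quantile(Y_tilde, orders)

length(y_grid)
range(y_grid)   # the grid spans [min(Y_tilde), max(Y_tilde)]
```

A finer or coarser grid is obtained by changing length.out; more points approximate the supremum over \(\widehat{\mathcal{Y}}\) more closely at a higher computational cost.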

The estimDev function

This function estimates the minimal deviations from RE. The estimDev() function has the following syntax: estimDev(psi, y).

\begin{tabular}{lp{370pt}}
psi & vector of subjective expectations. \\
y & vector of realizations of an individual outcome.
\end{tabular}

\medskip Both vectors should have the same length. If not, one can randomly select a subset of the longer vector with length equal to that of the shorter one. The function returns a function built with approxfun from the stats package. This function can then be evaluated directly on a desired grid. We give an example below.
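When the two vectors have different lengths, the random subsampling described above can be sketched as follows. The helper equalize_lengths is hypothetical and is not part of the package; it only prepares the inputs and does not call estimDev.

```r
# Randomly subsample the longer vector so that both have the same length
equalize_lengths <- function(psi, y) {
  L <- min(length(psi), length(y))
  list(
    psi = if (length(psi) > L) psi[sample.int(length(psi), L)] else psi,
    y   = if (length(y)   > L) y[sample.int(length(y), L)]     else y
  )
}

set.seed(5)
psi <- rnorm(800)     # hypothetical beliefs
y   <- rnorm(1200)    # hypothetical realizations (longer sample)

sub <- equalize_lengths(psi, y)
c(length(sub$psi), length(sub$y))
```

Each such pair could then be passed to estimDev(sub$psi, sub$y); repeating the draw many times and averaging the resulting functions on a fixed grid gives the averaged estimate mentioned in the theory section.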

Examples

Test without covariates

We consider the same DGP as in DGM (Section 5), namely we suppose that the outcome \(Y\) is given by \[ Y = \rho \psi + \epsilon,\] with \(\rho \in [0,1]\), \(\psi \sim \mathcal{N}(0,1)\) and \[\epsilon = \zeta \left(-1\{U \leq 0.1\} + 1\{U \geq 0.9\}\right),\] where \(\zeta\), \(U\) and \(\psi\) are mutually independent, \(\zeta \sim \mathcal{N}(2, 0.1)\) and \(U\sim \mathcal{U}[0,1]\). We consider 1,200 observations and \(\rho=0.29\).

### Data generating process
n_p=1200 # number of observations
n_y=n_p
N <- n_y + n_p
rho <-0.29 # parameter rho
sig=0.1 # parameter sigma
u=1
b=0.10
a=2
psi <-rnorm(n_p,0,u) ## vector of psi's
pp_y <- runif(n_y,0,1)
zeta <- rnorm(n_y,a,sig)
zeta1 <- rnorm(n_y,-a,sig)
pp1_y <- 1*(pp_y <b)
pp2_y <- 1*(pp_y >1-b)
pp3_y <- 1*(pp_y <=(1-b) & pp_y >=b)
psi_y <-rnorm(n_y,0,u)
y =  rho*psi_y+ pp1_y*zeta + pp2_y*zeta1 ## vector of y's

Concatenate the two datasets:

D <- rbind(matrix(1,n_y,1),matrix(0,n_p,1)) ## vector of D's
Y_tilde <- rbind(matrix(y,n_y,1),matrix(psi,n_p,1))  ## concatenation of y then psi

By default, the function test runs the test without covariates. We wrap the call in system.time() to measure the elapsed time:

system.time(res <- test(Y_tilde ,D))
#> Conditional Moment Inequalities Test   Number of obs :  2400 
#> Test Statistic :  4.697479 
#>  Critical Value (1%)  2.51336 
#> Critical Value (5%)  0.9971287 
#> Critical Value (10%)  0.7172106 
#> p-value  :  0
#>    user  system elapsed 
#>   31.27    0.04   76.70

The test prints the total number of observations, the test statistic, the different critical values, and the p-value. It returns a list containing all this information and the vector of bootstrapped test statistics (see the reference manual).

We now set the number of cores, through the argument nbCores, to 3 and run the parallelized version of the test:

system.time(res <- test(Y_tilde ,D,NULL,NULL,NULL,3,NULL))
#> Warning in searchCommandline(parallel, cpus = cpus, type = type,
#> socketHosts = socketHosts, : Unknown option on commandline:
#> tools::buildVignettes(dir
#> R Version:  R version 3.4.4 (2018-03-15)
#> snowfall 1.84-6.1 initialized (using snow 0.4-3): parallel execution on 3 CPUs.
#> 
#> Stopping cluster
#> Conditional Moment Inequalities Test   Number of obs :  2400 
#> Test Statistic :  4.697479 
#>  Critical Value (1%)  2.532613 
#> Critical Value (5%)  1.28576 
#> Critical Value (10%)  0.8261835 
#> p-value  :  0
#>    user  system elapsed 
#>    0.89    0.22   36.44

Note that the elapsed time has been divided by about 2.1 (from 76.70 to 36.44 seconds).

Finally, we give an example where we modify the tuning parameters, here using a finer grid y_grid of 50 quantiles:

tuningParam<- vector(mode="list", length=6)
tuningParam[["p"]] <- 0.05
tuningParam[["epsilon"]] <- 0.05
tuningParam[["B"]] <-500
tuningParam[["y_grid"]] <- quantile(Y_tilde,seq(0,1,length.out=50))
tuningParam[["c"]] <- 0.3
tuningParam[["kappa"]] <- 0.001
system.time(res <- test(Y_tilde ,D,NULL,NULL,NULL,3,tuningParam))
#> Warning in searchCommandline(parallel, cpus = cpus, type = type,
#> socketHosts = socketHosts, : Unknown option on commandline:
#> tools::buildVignettes(dir
#> snowfall 1.84-6.1 initialized (using snow 0.4-3): parallel execution on 3 CPUs.
#> 
#> Stopping cluster
#> Conditional Moment Inequalities Test   Number of obs :  2400 
#> Test Statistic :  4.699084 
#>  Critical Value (1%)  2.718392 
#> Critical Value (5%)  1.269613 
#> Critical Value (10%)  0.8245907 
#> p-value  :  0.002004008
#>    user  system elapsed 
#>    0.81    0.15   50.16

Test with covariates

We now present an example of the test with covariates. We consider the same DGP as in DGM (Appendix D), namely we suppose that the outcome \(Y\) is given by the following DGP: \[ Y = \rho \psi + \sqrt{X}\epsilon,\] with \(\rho \in [0,1]\), \(\psi \sim \mathcal{N}(0,1)\), \(X\sim \text{Beta}(0.1, 10)\) and \[\epsilon= \zeta \left(-1\{U \leq 0.1\} + 1\{U \geq 0.9\}\right),\] where \(\zeta \sim \mathcal{N}(2, 0.1)\) and \(U\sim \mathcal{U}[0,1]\). \((\psi, \zeta, U, X)\) are supposed to be mutually independent. Again we start with the data generating process:

n_p=1200
n_y=n_p
N <- n_y + n_p
sig=0.1
u=1
b=0.10
a=2
alp = 0.1
bet = 10

# Data Generating process
X_p = rbeta(n_p,alp ,  bet)+1 
X_y =  rbeta(n_y,alp ,  bet)+1 
transf <- function(X_y,   f0){
  res <-f0*sqrt(X_y)
  return(res)
}
psi <-rnorm(n_p,0,u)
pp_y <- runif(n_y,0,1)
zeta <- rnorm(n_y,a,sig)
zeta1 <- rnorm(n_y,-a,sig)
pp1_y <- 1*(pp_y <b)
pp2_y <- 1*(pp_y >1-b)
pp3_y <- 1*(pp_y <=(1-b) & pp_y >=b)
psi_y <-rnorm(n_y,0,u)
y =  rho*psi_y+ transf(X_y, 1)*(pp1_y*zeta + pp2_y*zeta1)    

Concatenate the two datasets as above:

D <- rbind(matrix(1,n_y,1),matrix(0,n_p,1)) ## vector of D's
Y_tilde <- rbind(matrix(y,n_y,1),matrix(psi,n_p,1))## concatenation of y then psi

Then we concatenate the covariates by rows, those associated with y coming first.

X <- rbind(matrix(X_y,n_y,1),matrix(X_p,n_p,1))

Then run the test, after supplying the X argument, setting the number of cores to 3, and passing the grid explicitly through tuningParam (30 grid points, as in the default):

tuningParam<- vector(mode="list", length=6)
tuningParam[["p"]] <- 0.05 # the parameter p in Section 3 in DGM
tuningParam[["epsilon"]] <- 0.05 # the parameter epsilon in Section 3 in DGM
tuningParam[["B"]] <- 500 # the number of bootstrap replications
tuningParam[["y_grid"]] <- quantile(Y_tilde,seq(0,1,length.out=30)) # grid approximating the range of Y_tilde
tuningParam[["c"]] <- 0.3 # the parameter c in Section 3 in DGM
tuningParam[["kappa"]] <- 0.001 # the parameter kappa in Section 3 in DGM

system.time(res <- test(Y_tilde ,D,X,NULL,NULL,3,tuningParam ))
#> Warning in searchCommandline(parallel, cpus = cpus, type = type,
#> socketHosts = socketHosts, : Unknown option on commandline:
#> tools::buildVignettes(dir
#> snowfall 1.84-6.1 initialized (using snow 0.4-3): parallel execution on 3 CPUs.
#> 
#> Stopping cluster
#> Conditional Moment Inequalities Test   Number of obs :  2400 
#> Test Statistic :  5.472244 
#>  Critical Value (1%)  1.586233 
#> Critical Value (5%)  1.006547 
#> Critical Value (10%)  0.7120133 
#> p-value  :  0
#>    user  system elapsed 
#>    0.97    0.17   74.66

T_n<-  res[[5]] 
p_value <- res[[7]]

Estimation of minimal deviations

The data generating process is the same as for the test without covariates in the previous section, except that \(\rho=0.4\):

sig=0.1
u=1
b=0.10
a=2
rho= 0.4

psi <- rnorm(n_p,0,u)
pp_y <- runif(n_y,0,1)
zeta <- rnorm(n_y,a,sig)
zeta1 <- rnorm(n_y,-a,sig)
pp1_y <- 1*(pp_y <b)
pp2_y <- 1*(pp_y >1-b)
pp3_y <- 1*(pp_y <=(1-b) & pp_y >=b)
psi_y <-rnorm(n_p,0,u) 
y =  rho*psi_y+ pp1_y*zeta + pp2_y*zeta1    

Then we estimate \(g^*\) using the estimDev function, applied to the two vectors psi and y, which have the same length:

system.time(g_star <- estimDev(psi,y))
#>    user  system elapsed 
#>    6.37    0.02    8.84

We plot the result on a grid (we refer to @DGM for a detailed analysis and an enhanced plot):

t<- seq(-2.2,2.2, length.out=300)
plot( t, t- g_star(t),type="l",col=1 , lwd=2, xlim=c(-2.2,2.2), ylim=c(min(t- g_star(t))-0.1,max(t- g_star(t))+0.1))
abline(h=0)

[Figure: t - g_star(t) plotted against t on the grid from -2.2 to 2.2, with a horizontal line at 0.]

References