--- title: "Panel Smooth Transition Regression" author: "Yukai Yang" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{PSTR Vignette} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` # PSTR version 2.0.0 (Green Panel) The PSTR package implements the Panel Smooth Transition Regression (PSTR) modelling. The modelling procedure consists of three stages: Specification, Estimation and Evaluation. The package offers tools helping the users to conduct model specification tests, to do PSTR model estimation, and to do model evaluation. The cluster-dependency and heteroskedasticity-consistent tests are implemented in the package. The wild bootstrap and cluster wild bootstrap tests are also implemented. Parallel computation (as an option) is implemented in some functions, especially the bootstrap tests. Therefore, the package suits tasks running many cores on super-computation servers. ## How to install You can either install the stable version from CRAN ```{r install1, eval=F} install.packages("PSTR") ``` or install the development version from GitHub ```{r install2, eval=F} devtools::install_github("yukai-yang/PSTR") ``` or ```{r install3, eval=F} remotes::install_github("yukai-yang/PSTR") ``` provided that the package `devtools` or `remotes` has been installed beforehand. ## Example After installing the package, you need to load (attach better say) it by running the code ```{r attach} library(PSTR) ``` You can first check the information and the current version number by running ```{r version} version() ``` Then you can take a look at all the available functions and data in the package ```{r contents} ls( grep("PSTR", search()) ) ``` ### The data In the package, a data set called `Hansen99` is offered to give prompt example. 
For details of the data set, you can run

```{r data, eval=F}
?Hansen99
```

### Initialization

You can create a new object of the class `PSTR` by doing

```{r new}
pstr = NewPSTR(Hansen99, dep='inva', indep=4:20, indep_k=c('vala','debta','cfa','sales'),
               tvars=c('vala'), im=1, iT=14)
pstr
```

It says that the data set `Hansen99` is used, the dependent variable is `inva`, the variables in columns 4 to 20 of the data are the explanatory variables in the linear part (you can also write down their names instead), the explanatory variables in the nonlinear part are the four in `indep_k`, and the potential transition variable is `vala` (Tobin's Q). As you can see, `NewPSTR` essentially defines the settings of the model.

Note that you can print an object of the class `PSTR`. By default, it gives a summary of the PSTR model: mainly which variable is the dependent variable, which ones are the explanatory variables, and so on.

### Specification

The following code does the linearity tests

```{r lintest1}
LinTest(pstr)
print(pstr, mode="tests")
```

or

```{r lintest2, eval=F}
pstr$LinTest()
print(pstr, mode="tests")
```

`LinTest(pstr)` modifies the object in place (R6 uses reference semantics): passing the object to the function changes the values inside the object. The function returns `pstr` invisibly, so you may call it either for its side effects or in an assignment.

You can create a new `PSTR` object (the same model with the same data set) based on an existing one by running

```{r clone}
pstr0 = pstr$clone()
```

By doing so, the new object `pstr0` is an independent copy of the original, and whatever you do to one of the two objects (`pstr` or `pstr0`), the other remains unchanged.

You can do, for example, the wild bootstrap and wild cluster bootstrap by running the following code.
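As an aside, the reference-vs-clone distinction above can be illustrated with a base-R environment, which has the same reference semantics as an R6 object. This is a toy sketch with a made-up field `n`, not a `PSTR` object:

```{r refsemantics}
# Environments, like R6 objects, are passed by reference
a <- new.env()
a$n <- 0

b <- a          # b is the SAME object as a, not an independent copy
a$n <- a$n + 1
b$n             # 1: the change is visible through b as well

d <- new.env()  # build an independent copy by hand, analogous to pstr$clone()
d$n <- a$n
a$n <- a$n + 1
d$n             # still 1: the copy is unaffected
a$n             # 2
```

In the same way, the bootstrap code below modifies `pstr` in place while the clone `pstr0` stays untouched.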
```{r lintest3, eval=F}
iB = 5000 # the number of repetitions in the bootstrap
library(snowfall)
WCB_LinTest(pstr,iB=iB,parallel=T,cpus=50)
```

The bootstrap takes a very long time to run. This function is developed for those who work on a super-computation server with many cores and a large amount of memory. Note that you will have to attach the `snowfall` package manually. You can try the following code on your own computer by reducing the number of repetitions and cores.

```{r lintest4, eval=F}
WCB_LinTest(pstr,iB=4,parallel=T,cpus=2)
```

### Estimation

After selecting the transition variable (in this example, `vala`), you can estimate the PSTR model.

```{r estimate, eval=F}
EstPSTR(use=pstr,im=1,iq=1,useDelta=T,par=c(-0.462,0), vLower=4, vUpper=4)
print(pstr, mode="estimates")
```

By default, the `optim` method `L-BFGS-B` is used, but you can change the estimation method by doing

```{r estimate1}
EstPSTR(use=pstr,im=1,iq=1,useDelta=T,par=c(-0.462,0), method="CG")
print(pstr, mode="estimates")
```

The argument `useDelta` determines the type of the initial value for the smoothness parameter. The default `useDelta = FALSE` means that the first initial value in `par` is `gamma` instead of `delta`. Here the settings `useDelta = TRUE` and `par = c(-0.462, 0)` mean that the first value of `par` is `delta` and its value is -0.462. Note that `delta` and `gamma` are linked by `gamma = exp(delta)`, so this corresponds to an initial `gamma` of `exp(-0.462)`, about 0.63. Thus, the following two statements are equivalent

```{r estimate2, eval=F}
EstPSTR(use=pstr,im=1,iq=1,useDelta=T,par=c(-0.462,0), method="CG")
EstPSTR(use=pstr,im=1,iq=1,par=c(exp(-0.462),0), method="CG")
```

Note that the estimation of a linear panel regression model is also implemented. You can do it by simply running

```{r estimate3}
EstPSTR(use=pstr0)
print(pstr0, mode="estimates")
```

### Evaluation

The evaluation tests can be conducted after estimating a nonlinear PSTR model.
```{r evaluation, eval=F}
## evaluation tests
EvalTest(use=pstr,vq=as.matrix(Hansen99[,'vala'])[,1])
```

Note that `EvalTest` takes one transition variable `vq` at a time for the no-remaining-nonlinearity (heterogeneity) test. This differs from `LinTest`, which can evaluate several candidate transition variables through `tvars`.

You can also run the wild bootstrap (WB) and wild cluster bootstrap (WCB) versions of the evaluation tests, provided that suitable computing resources are available.

```{r evaluation1, eval=F}
iB = 5000
cpus = 50

## wild bootstrap time-varying evaluation test
WCB_TVTest(use=pstr,iB=iB,parallel=T,cpus=cpus)

## wild bootstrap heterogeneity evaluation test
WCB_HETest(use=pstr,vq=as.matrix(Hansen99[,'vala'])[,1],iB=iB,parallel=T,cpus=cpus)

print(pstr, mode="evaluation")
```

Note that the evaluation functions do not accept the object `pstr0` returned from a linear panel regression, as the evaluation tests are designed for an estimated PSTR model, not a linear one.

### Plotting

After estimating the PSTR model, you can plot the estimated transition function by running

```{r plot_trans1}
plot_transition(pstr)
```

or produce a better plot by passing more arguments

```{r plot_trans2}
plot_transition(pstr, fill='blue', xlim=c(-2,20), color = "dodgerblue4", size = 2, alpha=.3) +
  ggplot2::geom_vline(ggplot2::aes(xintercept = pstr$c - log(1/0.95 - 1)/pstr$gamma),color='blue') +
  ggplot2::labs(x="customize the label for x axis",y="customize the label for y axis",
                title="The Title",subtitle="The subtitle",caption="Make a caption here.")
```

The vertical line marks the point where the transition function reaches 0.95: with one location parameter, solving $g = (1+\exp(-\gamma(q-c)))^{-1} = 0.95$ for $q$ gives $q = c - \log(1/0.95 - 1)/\gamma$.

You can also plot the curves of the coefficients, the standard errors and the p-values against the transition variable.
```{r plot_coef}
ret = plot_coefficients(pstr, vars=1:4, length.out=100, color="dodgerblue4", size=2)
ret[[1]]
```

The plotting function `plot_response` depicts
\begin{equation*}
[\phi_0 + \phi_1 g_{it}(q_{it} ; \gamma, c)] x_{it},
\end{equation*}
which we refer to as the response, as a function of an explanatory variable $x_{it}$ and the transition variable $q_{it}$. The response $[\phi_0 + \phi_1 g_{it}(q_{it} ; \gamma, c)] x_{it}$ is the contribution of the variable $x_{it}$ to the conditional expectation of the dependent variable $y_{it}$ through the smooth transition mechanism. If there is no nonlinearity, the response plotted against the variable is a straight line. If $x_{it}$ and $q_{it}$ are distinct, the response can be visualised as a surface over $(x_{it}, q_{it})$, where the vertical axis represents the response; it becomes a curve if the variable $x_{it}$ and the transition variable $q_{it}$ are identical.

You can generate these plots by running

```{r plot0}
ret = plot_response(obj=pstr, vars=1:4, log_scale = c(F,T), length.out=100)
```

`ret` takes the return value of the function. We make the graphs for all four variables in the nonlinear part by using `vars=1:4` (variable names can also be used). Note that we do not do it for the variables in the linear part, as they produce straight lines or planes. `log_scale` is a length-2 logical vector indicating whether to apply a log scale to (i) $x_{it}$ and (ii) $q_{it}$, respectively. `length.out` gives the number of points in the grid used to produce the surface or curve; 100 points is usually fine enough.

You might ask: what if you want different log-scaling choices for different variables?
The solution is to make the graphs separately by running something like

```{r plot2, eval=F}
ret1 = plot_response(obj=pstr, vars=1, log_scale = c(F,T), length.out=100)
ret2 = plot_response(obj=pstr, vars=2, log_scale = c(T,T), length.out=100)
```

Let us take a look at the elements in `ret`

```{r}
attributes(ret)
```

We see that `ret` is a list whose elements are named after the variables we specified when running `plot_response`. Each element is a ggplot object, so you can display it directly, for example:

```{r vala, message=F}
ret$vala
```

The numbers on the x-axis do not look so good, as it is difficult to see where the turning point is. The `ggplot2` package allows us to set the axis breaks manually (the PSTR package collaborates very well with some prevailing packages), as well as the label on the x-axis (and much more).

```{r vala2, message=F}
ret$vala + ggplot2::scale_x_log10(breaks=c(.02,.05,.1,.2,.5,1,2,5,10,20)) +
  ggplot2::labs(x="Tobin's Q")
```

Now we see very clearly that the estimated turning point, at approximately 0.5, splits the curve into two regimes that behave quite differently. This graph shows the lagged Tobin's Q's contribution to the expected investment. Low-Q firms (whose potential is evaluated to be low by the financial market) appear rather reluctant to change their future investment plans, or perhaps to have their plans changed for them.

Then let us proceed to the surfaces. Check the response from `debta` by running

```{r debta, eval=F}
ret$debta
```

In an interactive R session, the surface can be rotated and zoomed using the mouse. "vala_y" shows that the y-axis is the Q, and "debta_x" shows that the x-axis is the debt. The toolbar at the upper right helps you rotate, pan, zoom and save the graph. Note that the transition variable Q is on a log scale while debt is not. It is very clear that low-Q firms' future investment is affected by the current debt situation: the more debt there is, the less investment there will be.
However, this is not the case for high-Q firms, which appear to have better growth prospects and are less sensitive to debt.

The following two interactive graphs are for the cash flow and the sales.

```{r cfa, eval=F}
ret$cfa
```

```{r sales, eval=F}
ret$sales
```

## Citation

If you use the PSTR package in your research, please cite both the software implementation and the underlying methodology.

**Software**

Yang, Y. (2026). PSTR: Panel Smooth Transition Regression Modelling. R package version 2.0.0 (first released in 2017). Available at: https://github.com/yukai-yang/PSTR

**Methodology**

González, A., Teräsvirta, T., van Dijk, D., and Yang, Y. (2005). Panel Smooth Transition Regression Models. EFI Working Paper Series in Economics and Finance, No. 604. Stockholm School of Economics. Revised October 2017. Available at: https://swopec.hhs.se/hastef/papers/hastef0604.pdf

You can obtain the citation information directly from R by running:

```{r citation}
citation("PSTR")
```