---
title: "Using fmrs package"
author: "Farhad Shokoohi"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using fmrs package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


# fmrs package in action

## Data generation
The function `fmrs.gendata` generates a data set from an FMRs model. 
It has the form

```
fmrs.gendata(nObs, nComp, nCov, coeff, dispersion, mixProp, rho, umax, ...)
```

where `n` is the sample size, `nComp` is the order of FMRs model, `nCov` is 
the number of regression covariates, `coeff`, `dispersion` and `mixProp` are 
the parameters of regression models, i.e. regression coefficients, 
dispersion (standard deviation) of the errors (sub-distributions) and 
mixing proportions, 
respectively, and `rho` is the used in the variance-covariance matrix for 
simulating the design matrix `x`, and `umax` is the upper bound for 
Uniform distribution for generating censoring times. 

Depending on the choice of `disFamily`, the function `fmrs.gendata` generates 
a simulated data from FMRs models. For instance, if we choose 
`disFamily = "norm"`, the function ignores the censoring parameter `umax` 
and generates a data set from an FMR model with Normal sub-distributions. 
However, if we choose `disFamily = "lnorm"` or `disFamily = "weibull"`, the 
function generates data under a finite mixture of AFT regression model with 
Log-Normal or Weibull sub-distributions. 

The `fmrs.gendata` function returns a list which includes a vector of 
responses `$y`, a matrix of covariates `$x` and a vector of censoring 
indicators `$delta` as well as the name of sub-distribution of the 
mixture model.


## MLE of FMRs models

The function `fmrs.mle` in fmrs package provides maximum likelihood 
estimation for the parameters of an FMRs model. The function has the 
following form,

```
fmrs.mle(y, x, delta, nComp, ...)
```

where `y`, `x` and `delta` are observations, covariates and censoring 
indicators respectively, and `nComp` is the order of FMRs, `initCoeff`, 
`initDispersion` and `initmixProp` are initial values for EM and NR 
algorithms, and the rest of arguments of the function are controlling 
parameres. The function returns a fitted FMRs model with estimates of 
regression parameters, standard deviations and mixing proportions of the 
mixture model.
It also returns the log-likelihood and BIC under the estimated model, the 
maximum number of iterations used in EM algorithm and the type of the 
fitted model.

Note that one can do ridge regression by setting a value for tuning 
parameter of the ridge penalty  other than zero in the argument `lambRidge`. 


## Variable selection in FMRs models

To do the variable selection we provided the function `fmrs.varsel` 
with the form

```
fmrs.varsel(y, x, delta, nComp, ...)
```

where `penFamily` is the penalty including `"adplasso"`, `"lasso"`, `"mcp"`, 
`"scad"`, `"sica"` and `"hard"`, and `lambPen` is the set of tuning 
parameters for components of penalty. We can run the function `fmrslme` 
first and use the parameter estimates as initial values for the 
function `fmrs.varsel`. 

### Choice of tuning parameter

There are two approaches for specifying tuning parameters: Common and 
Component-wise tuning parameters. If we consider choosing common tuning 
parameter, we can use the BIC criteria to search through the a set of 
candidate values in the interval $(0,\lambda_M)$, where $\lambda_M$ is a 
prespecified number. The BIC is provided by the function `fmrs.varsel` for 
each candidate point and we choose the optimal $\hat\lambda$, the one that 
maximizes BIC. This approach will be good for the situations with enough 
samples sizes. It also reduces the computational burden. 

On the other hand, if we consider choosing component-wise tuning parameters 
we use the following function to search for optimal 
$(\lambda_1, \ldots, \lambda_K)$ from the set of candidate values in 
$(0, \lambda_M)$. The function is

```
fmrs.tunsel(y, x, delta, nComp, ...)
```

It is a good practice run the function `fmrs.mle` first and use the parameter 
estimates as initial values in the function `fmrs.tunsel`.
The function `fmrs.mle` then returns a set optimal tuning parameters of 
size `nComp` to be used in variable selection function. Note that this 
approach still is under theoretical study and is not proved to give optimal 
values in general. 


## Example: finite mixture of AFT regression model (Log-Normal)

We use a simulated data set to illustrate using `fmrs` package. We generate 
the covariates (design matrix) from a multivariate normal distribution of 
dimension `nCov=10` and sample size 200 with mean vector $\bf 0$ and 
variance-covariance matrix $\Sigma=(0.5^{|i-j|})$. We then generate 
time-to-event data from a finite mixture of two components AFT regression 
models with Log-Normal sub-distributions. The mixing proportions are set 
to $\pi=(0.3, 0.7)$. We choose $\boldsymbol\beta_0=(2,-1)$ as the 
intercepts, $\boldsymbol\beta_1=(-1, -2, 1, 2, 0 , 0, 0, 0, 0, 0)$ and 
$\boldsymbol\beta_2=(1, 2, 0, 0, 0 , 0, -1, 2, -2, 3)$ as the regression 
coefficients in first and second component, respectively. 

We start by loading necessary libraries and assigning the parameters of 
model as follows.

```{r}
library(fmrs)
set.seed(1980)
nComp = 2
nCov = 10
nObs = 500
dispersion = c(1, 1)
mixProp = c(0.4, 0.6)
rho = 0.5
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
umax = 40
```

Using the function `fmrs.gendata`, we generate a data set from a finite 
mixture of accelerated failure time regression models with above settings as 
follow. 

```{r}
dat <- fmrs.gendata(nObs = nObs, nComp = nComp, nCov = nCov,
                     coeff = c(coeff1, coeff2), dispersion = dispersion,
                     mixProp = mixProp, rho = rho, umax = umax,
                     disFamily = "lnorm")
```

Now we assume that the generated data are actually real data. We find MLE of 
the parameters of the assumed model using following code. Note that almost 
all mixture of regression models depends on initial values. Here we 
generate the initial values form a Normal distribution with 

```{r}
res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta,
                   nComp = nComp, disFamily = "lnorm",
                   initCoeff = rnorm(nComp*nCov+nComp),
                   initDispersion = rep(1, nComp),
                   initmixProp = rep(1/nComp, nComp))
coefficients(res.mle)
dispersion(res.mle)
mixProp(res.mle)
```

As we see the ML estimates of regression coefficients are not zero. 
Therefore MLE cannot provide a sparse solution. In order to provide a sparse 
solution, we use the variable selection methods developed by 
Shokoohi et. ale. (2016). First we need to find a good set of tuning 
parameters.
It can be done by using component-wise tuning parameter selection function 
implemented in `fmrs` as follows. In some settings, however, it is better to 
investigate if common tuning parameter performs better. 

```{r}
res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta,
                      nComp = nComp, disFamily = "lnorm",
                      initCoeff = c(coefficients(res.mle)),
                      initDispersion = dispersion(res.mle),
                      initmixProp = mixProp(res.mle),
                      penFamily = "adplasso")
show(res.lam)
```

We have used MLE estimates as initial values for estimating the tuning 
parameters. Now we used the same set of values to do variable selection with 
adaptive lasso penalty as follows. 

```{r}
res.var <- fmrs.varsel(y = dat$y, x = dat$x, delta = dat$delta,
                      nComp = ncomp(res.mle), disFamily = "lnorm",
                      initCoeff=c(coefficients(res.mle)),
                      initDispersion = dispersion(res.mle),
                      initmixProp = mixProp(res.mle),
                      penFamily = "adplasso",
                      lambPen = slot(res.lam, "lambPen"))
coefficients(res.var)
dispersion(res.var)
mixProp(res.var)
```

The final variables that are selected using this procedure are those with 
non-zero coefficients. Note that a switching between components of mixture 
has happened here.   

```{r}
round(coefficients(res.var)[-1,],3)
```

Therefore the variable selection and parameter estimation is done 
simultaneously using the fmrs package.