This package is designed to help draw the Hotelling ellipse on a PCA or PLS score scatterplot. HotellingEllipse computes the Hotelling’s T\(^2\) value, the semi-minor axis, and the semi-major axis, along with the x-y coordinate points for drawing a confidence ellipse based on Hotelling’s T\(^2\). Specifically, there are two functions available: ellipseParam(), which returns the T\(^2\) values and the lengths of the ellipse semi-axes, and ellipseCoord(), which returns the x-y coordinates of the ellipse.

Data

library(HotellingEllipse)
library(FactoMineR)  # PCA()
library(tidyverse)   # %>%, select(), where(), pluck(), as_tibble()
data("specData")

Principal component analysis

In this example, we use FactoMineR::PCA() to perform a principal component analysis (PCA) of the LIBS spectral dataset specData, and then extract the PCA scores as a tibble with tibble::as_tibble().

set.seed(123)
pca_mod <- specData %>%
  select(where(is.numeric)) %>%
  PCA(scale.unit = FALSE, graph = FALSE)
pca_scores <- pca_mod %>%
  pluck("ind", "coord") %>%
  as_tibble()
pca_scores
#> # A tibble: 100 × 5
#>      Dim.1    Dim.2  Dim.3   Dim.4 Dim.5
#>      <dbl>    <dbl>  <dbl>   <dbl> <dbl>
#>  1 25306.  -10831.  -1851.   -83.4 -560.
#>  2   -67.3   1137.  -2946.  2495.  -568.
#>  3 -1822.     -22.0 -2305.  1640.  -409.
#>  4 -1238.    3734.   4039. -2428.   379.
#>  5  3299.    4727.   -888. -1089.   262.
#>  6  5006.     -49.5  2534.  1917.  -970.
#>  7 -8325.   -5607.    960. -3361.   103.
#>  8 -4955.   -1056.   2510.  -397.  -354.
#>  9 -1610.    1271.  -2556.  2268.  -760.
#> 10 19582.    2289.    886.  -843.  1483.
#> # … with 90 more rows

Hotelling ellipse: semi-axes

To add a confidence ellipse, we use the function ellipseParam(). We want to compute the length of the ellipse semi-axes for bivariate data within the PC1-PC2 subspace. To do this, we set the number of components, k, to 2, while the pcx and pcy inputs are respectively set to 1 and 2.

res <- ellipseParam(data = pca_scores, k = 2, pcx = 1, pcy = 2)
str(res)
#> List of 4
#>  $ Tsquare     : tibble [100 × 1] (S3: tbl_df/tbl/data.frame)
#>   ..$ value: num [1:100] 13.8 2.08 1.06 2.82 1.4 ...
#>  $ Ellipse     : tibble [1 × 4] (S3: tbl_df/tbl/data.frame)
#>   ..$ a.99pct: num 19369
#>   ..$ b.99pct: num 10800
#>   ..$ a.95pct: num 15492
#>   ..$ b.95pct: num 8639
#>  $ cutoff.99pct: num 9.76
#>  $ cutoff.95pct: num 6.24
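
As a rough illustration of where these numbers come from, the sketch below computes a Hotelling's T\(^2\) cutoff and a semi-axis length in base R. It assumes the common formulation in which the cutoff is k(n - 1)/(n - k) times the F quantile with k and n - k degrees of freedom, and the semi-axis is the square root of the cutoff times the score variance. The helper names hotelling_cutoff() and semi_axis() are hypothetical and are not part of HotellingEllipse; the package's internal computation may differ.

```r
# Hypothetical helpers, not part of HotellingEllipse.
# Assumption: cutoff = k (n - 1) / (n - k) * F quantile(conf; k, n - k).
hotelling_cutoff <- function(n, k, conf = 0.95) {
  k * (n - 1) / (n - k) * qf(conf, df1 = k, df2 = n - k)
}

# Assumption: semi-axis = sqrt(cutoff * sample variance of the scores).
semi_axis <- function(scores, cutoff) {
  sqrt(cutoff * var(scores))
}

# n = 100 observations and k = 2 components, as in the example above.
cut95 <- hotelling_cutoff(n = 100, k = 2, conf = 0.95)
cut99 <- hotelling_cutoff(n = 100, k = 2, conf = 0.99)
round(c(cut95, cut99), 2)

# Simulated scores, for illustration only.
set.seed(1)
toy_scores <- rnorm(100, mean = 0, sd = 3)
a95 <- semi_axis(toy_scores, cut95)
a99 <- semi_axis(toy_scores, cut99)
```

With n = 100 and k = 2, the two cutoffs round to 6.24 and 9.76, which agrees with cutoff.95pct and cutoff.99pct in the output above. Under this formulation the ratio of the 99% to the 95% semi-axis depends only on the two cutoffs, not on the data.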

We can extract parameters for further use:

a1 <- pluck(res, "Ellipse", "a.99pct")
b1 <- pluck(res, "Ellipse", "b.99pct")
a2 <- pluck(res, "Ellipse", "a.95pct")
b2 <- pluck(res, "Ellipse", "b.95pct")
Tsq <- pluck(res, "Tsquare", "value")
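
One possible use of the extracted values is to flag observations whose T\(^2\) exceeds a cutoff. The sketch below is self-contained and uses only the first five T\(^2\) values and the two cutoffs printed by str(res) above, copied by hand. The variable names and the three status labels are illustrative choices, not package output.

```r
# First five T-squared values and both cutoffs, copied from the str(res) output.
tsq <- c(13.8, 2.08, 1.06, 2.82, 1.4)
cutoff_95 <- 6.24
cutoff_99 <- 9.76

# Classify each observation by where its T-squared falls relative to the cutoffs.
status <- cut(
  tsq,
  breaks = c(-Inf, cutoff_95, cutoff_99, Inf),
  labels = c("inside 95%", "between 95% and 99%", "outside 99%")
)

flags <- data.frame(obs = seq_along(tsq), tsq = tsq, status = status)
flags
```

In this subset, only the first observation (T\(^2\) = 13.8) lies beyond the 99% cutoff. In a full analysis the same classification could be applied to Tsq and the cutoff.95pct and cutoff.99pct elements of res.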

Hotelling ellipse: x and y coordinates

Another way to add a Hotelling ellipse is to use the function ellipseCoord(). This function provides the x and y coordinates of the confidence ellipse at a user-defined confidence level. The confidence level, conf.limit, is set to 95% by default. Below, the x-y coordinates are estimated based on data projected into the PC1-PC3 subspace.

xy_coord <- ellipseCoord(data = pca_scores, pcx = 1, pcy = 3, conf.limit = 0.95, pts = 500)
str(xy_coord)
#> tibble [500 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ x: num [1:500] 15492 15491 15488 15481 15473 ...
#>  $ y: num [1:500] -5.05e-13 8.48e+01 1.70e+02 2.54e+02 3.39e+02 ...
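
To make the idea of ellipse coordinates concrete, here is a minimal base R sketch of the standard parametric form x = a cos(θ), y = b sin(θ) for an axis-aligned ellipse. The function ellipse_xy() and the semi-axis values are hypothetical and chosen for illustration; this is not necessarily how ellipseCoord() is implemented.

```r
# Hypothetical helper: points on an axis-aligned ellipse with semi-axes a and b.
ellipse_xy <- function(a, b, pts = 200, center = c(0, 0)) {
  theta <- seq(0, 2 * pi, length.out = pts)  # angles covering one full turn
  data.frame(
    x = center[1] + a * cos(theta),
    y = center[2] + b * sin(theta)
  )
}

# Illustrative semi-axes, not taken from the dataset.
a <- 2
b <- 1
xy <- ellipse_xy(a, b, pts = 500)
str(xy)
```

The resulting data frame can be drawn as a closed path over a score scatterplot, for example with lines(xy) in base graphics. The first point is (a, 0), so the curve starts on the positive x axis.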