We implement a double-RKHS variant of PLS, in which both the input and the output spaces are endowed with reproducing kernels.
We use centered Grams \(\tilde K_X = H K_X H\) and \(\tilde K_Y = H K_Y H\), where \(H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top\).
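As a concrete illustration, the double-centering can be written in a few lines of R; the helper name `center_gram` below is illustrative and not part of the bigPLSR API:

```r
# Double-center an n x n Gram matrix: H K H with H = I - (1/n) 1 1^T.
# (Illustrative helper, not exported by bigPLSR.)
center_gram <- function(K) {
  n <- nrow(K)
  H <- diag(n) - matrix(1 / n, n, n)
  H %*% K %*% H
}
```

In practice the same result is obtained more cheaply by subtracting row and column means and adding back the grand mean, which avoids forming \(H\) explicitly.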
Following the spirit of Kernel PLS Regression II (IEEE TNNLS, 2019), we avoid explicit square roots and form the SPD surrogate operator \[ \mathcal{M} \, v = (K_X+\lambda_x I)^{-1} \; K_X \; K_Y \; K_X \; (K_X+\lambda_x I)^{-1} \, v, \] with small ridge \(\lambda_x > 0\) for stability. We compute the first \(A\) orthonormal latent directions \(T = [t_1,\dots,t_A]\) via power iteration with Gram–Schmidt orthogonalization on \(\mathcal{M}\).
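A minimal sketch of this step is given below, assuming dense Grams `Kx` and `Ky`; the function name `latent_directions` and its arguments are illustrative and do not reflect the package's internal API:

```r
# Sketch: first A orthonormal directions of the surrogate operator
#   M v = (Kx + lx I)^{-1} Kx Ky Kx (Kx + lx I)^{-1} v
# via power iteration with Gram-Schmidt deflation (illustrative only).
latent_directions <- function(Kx, Ky, A, lambda_x = 1e-6,
                              n_iter = 200, tol = 1e-10) {
  n  <- nrow(Kx)
  Rx <- solve(Kx + lambda_x * diag(n))                  # (Kx + lx I)^{-1}
  Mv <- function(v) Rx %*% (Kx %*% (Ky %*% (Kx %*% (Rx %*% v))))
  T_mat <- matrix(0, n, A)
  for (a in seq_len(A)) {
    t_a <- rnorm(n)                                     # random start
    for (it in seq_len(n_iter)) {
      t_new <- Mv(t_a)
      if (a > 1) {                                      # orthogonalize against earlier directions
        T_prev <- T_mat[, seq_len(a - 1), drop = FALSE]
        t_new  <- t_new - T_prev %*% crossprod(T_prev, t_new)
      }
      t_new <- t_new / sqrt(sum(t_new^2))               # normalize
      if (sum((t_new - t_a)^2) < tol) { t_a <- t_new; break }
      t_a <- t_new
    }
    T_mat[, a] <- t_a
  }
  T_mat
}
```

Because \(\mathcal{M}\) is symmetric positive semi-definite, the power iterates converge to the leading eigenvector without sign oscillation; re-orthogonalizing at every iteration keeps the columns of \(T\) approximately orthonormal.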
We then solve a small regression in the latent space: \[ C = (T^\top T)^{-1} (T^\top \tilde Y), \qquad \tilde Y = Y - \mathbf{1} \bar y^\top, \] and form dual coefficients \[ \alpha \;=\; U \, C, \qquad U \;=\; (K_X+\lambda_x I)^{-1} T, \] so that training predictions satisfy \[ \hat Y \;=\; \tilde K_X \, \alpha + \mathbf{1}\,\bar y^\top . \]
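Continuing the sketch, the latent regression and the dual coefficients can be formed as follows (again with illustrative names: `Kx` is the uncentered training Gram, `Kx_c` its centered version, and `T_mat` the scores from the previous step):

```r
# Sketch: latent-space regression and dual coefficients (illustrative, not the package API).
# T_mat: n x A scores, Kx: uncentered Gram, Kx_c: centered Gram, Y: n x m responses.
fit_dual <- function(T_mat, Kx, Kx_c, Y, lambda_x = 1e-6) {
  n     <- nrow(Y)
  y_bar <- colMeans(Y)
  Y_c   <- sweep(Y, 2, y_bar)                               # Y - 1 ybar^T
  C     <- solve(crossprod(T_mat), crossprod(T_mat, Y_c))   # (T'T)^{-1} T' Ytilde
  U     <- solve(Kx + lambda_x * diag(n), T_mat)            # (Kx + lambda_x I)^{-1} T
  alpha <- U %*% C
  fitted <- Kx_c %*% alpha + matrix(y_bar, n, ncol(Y), byrow = TRUE)
  list(alpha = alpha, y_bar = y_bar, fitted = fitted)
}
```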
Given new inputs \(X_\*\), define the cross-Gram \[ K_\* = K(X_\*, X) . \] To apply the training centering to \(K_\*\), use \[ \tilde K_\* \;=\; K_\* \;-\; \mathbf{1}_\*\, \bar k_X^\top \;-\; \bar k_\* \mathbf{1}^\top \;+\; \mu_X\, \mathbf{1}_\* \mathbf{1}^\top, \] where:

- \(\bar k_X = \frac{1}{n} K_X \mathbf{1}\) is the column-mean vector of the (uncentered) training Gram,
- \(\mu_X = \frac{1}{n^2} \mathbf{1}^\top K_X \mathbf{1}\) is its grand mean,
- \(\bar k_\* = \frac{1}{n} K_\* \mathbf{1}\) is the vector of row means of \(K_\*\) (computed at prediction time).
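The same centering, applied to a cross-Gram at prediction time, looks like this in R (a sketch; `center_cross_gram` is an illustrative name, not a bigPLSR function):

```r
# Sketch: apply training centering to a cross-Gram K_* = K(X_*, X)
# using statistics of the uncentered training Gram Kx.
center_cross_gram <- function(K_star, Kx) {
  k_bar_X    <- colMeans(Kx)       # column means of the training Gram (length n)
  k_bar_star <- rowMeans(K_star)   # row means of K_*, computed at prediction time
  mu_X       <- mean(Kx)           # grand mean of the training Gram
  sweep(sweep(K_star, 2, k_bar_X), 1, k_bar_star) + mu_X
}
```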
Predictions then follow the familiar dual form: \[ \hat Y_\* \;=\; \tilde K_\* \, \alpha + \mathbf{1}_\* \, \bar y^\top . \]
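Putting the pieces together, prediction is a single matrix product plus the intercept; a minimal sketch, assuming the quantities produced by the sketches above:

```r
# Sketch: dual-form prediction for new rows X_* (illustrative helper).
# K_star_c: centered cross-Gram, alpha: dual coefficients, y_bar: training response means.
predict_dual <- function(K_star_c, alpha, y_bar) {
  K_star_c %*% alpha + matrix(y_bar, nrow(K_star_c), length(y_bar), byrow = TRUE)
}
```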
With `algorithm = "rkhs_xy"`, the package returns:

- `dual_coef` \(= \alpha\),
- `scores` \(= T\) (approximately orthonormal),
- `intercept` \(= \bar y\),
- and a `predict()` method.

```r
library(bigPLSR)
set.seed(42)
n <- 60; p <- 6; m <- 2
X <- matrix(rnorm(n * p), n, p)
Y <- cbind(sin(X[,1]) + 0.4 * X[,2]^2,
           cos(X[,3]) - 0.3 * X[,4]^2) +
  matrix(rnorm(n * m, sd = 0.05), n, m)
op <- options(
  bigPLSR.rkhs_xy.kernel_x = "rbf",
  bigPLSR.rkhs_xy.gamma_x  = 0.5,
  bigPLSR.rkhs_xy.kernel_y = "linear",
  bigPLSR.rkhs_xy.lambda_x = 1e-6,
  bigPLSR.rkhs_xy.lambda_y = 1e-6
)
on.exit(options(op), add = TRUE)
fit <- pls_fit(X, Y, ncomp = 3, algorithm = "rkhs_xy", backend = "arma")
Yhat <- predict(fit, X)
mean((Y - Yhat)^2)
#> [1] 2.619847e-12
```

References

• Rosipal & Trejo (2001). Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research, 2, 97–123. doi:10.5555/944733.944741.
• Kernel PLS Regression II: Kernel Partial Least Squares Regression by Projecting Both Independent and Dependent Variables into Reproducing Kernel Hilbert Space. IEEE Transactions on Neural Networks and Learning Systems (2019). doi:10.1109/TNNLS.2019.2932014.