In nonparametric estimation of the autocovariance matrices or the spectral density matrix of a second-order stationary multivariate time series, it is important to preserve positive definiteness of the estimator. This ensures that the estimator can be interpreted as an estimated covariance or spectral matrix, and it avoids computational issues in, e.g., simulation or bootstrapping procedures.
For this purpose, in (Chau and von Sachs 2017) we considered multivariate spectral estimation in the Riemannian manifold of Hermitian and positive definite matrices based on a geometric wavelet approach. Nonlinear wavelet curve denoising in the Riemannian manifold allows one to capture local smoothness behavior of the spectral matrix across frequency, but also varying degrees of smoothness across components of the spectral matrix. Moreover, and in contrast to existing approaches, due to its geometric nature the wavelet-based spectral estimator enjoys the important property that it is invariant to permutations of the components of the time series.
In addition to spectral estimation, we proposed computationally fast clustering of spectral matrices based on their representations in the wavelet domain, exploiting the fact that smooth curves in the Riemannian manifold are summarized by few high-energy wavelet coefficients.
In this vignette we demonstrate the wavelet-based multivariate spectral estimation and clustering procedures of (Chau and von Sachs 2017) by means of simulated time series data using the functions in the pdSpecEst package.
rARMA()
With rARMA() we simulate multivariate time series observations from a vector ARMA(2,2) process, i.e. a vector autoregressive moving average process, driven by Gaussian white noise. By including a vector of frequencies freq, the function rARMA() also returns the generating spectrum of the time series.
library(pdSpecEst)
## Fix parameters
freq <- seq(from = pi / 2^9, to = pi, length.out = 2^9)
d <- 2
Phi <- array(c(0.5, 0, 0, 0.2, 0, 0, 0, -0.9), dim = c(d, d, 2))
Theta <- array(c(0, 0.1, 0.1, 0, 0, 0, 0, 0.5), dim = c(d, d, 2))
Sigma <- matrix(c(2, 0, 0, 0.25), nrow = d)
## Generate time series
set.seed(0)
ts.sim <- rARMA(2^11, d, Phi, Theta, Sigma, freq = freq)
str(ts.sim)
#> List of 2
#> $ X: num [1:2048, 1:2] -1.79 0.856 1.554 0.338 0.124 ...
#> $ f: cplx [1:2, 1:2, 1:512] 1.2747+0i 0.0445+0.0003i 0.0445-0.0003i ...
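As a cross-check on the returned spectrum, the base-R sketch below evaluates the textbook VARMA spectral density f(w) = (1 / (2 pi)) Phi(z)^{-1} Theta(z) Sigma Theta(z)^* Phi(z)^{-*}, with z = exp(-i w), at a single frequency. The helper name varma_spec and the sign conventions (AR terms subtracted, MA terms added, negative exponent) are assumptions for illustration and are not taken from the package source. With the parameters above, the (1,1) entry at the first frequency comes out close to the first value printed for ts.sim$f.

```r
## Sketch only: textbook VARMA spectrum at one frequency (conventions are assumed)
varma_spec <- function(omega, Phi, Theta, Sigma) {
  d <- nrow(Sigma)
  z <- exp(-1i * omega)
  A <- diag(d) + 0i  # AR polynomial: I - Phi_1 z - Phi_2 z^2 - ...
  B <- diag(d) + 0i  # MA polynomial: I + Theta_1 z + Theta_2 z^2 + ...
  for (k in seq_len(dim(Phi)[3])) A <- A - Phi[, , k] * z^k
  for (k in seq_len(dim(Theta)[3])) B <- B + Theta[, , k] * z^k
  H <- solve(A) %*% B  # transfer function at frequency omega
  (H %*% Sigma %*% Conj(t(H))) / (2 * pi)
}

## Same parameters as in the chunk above, repeated so the sketch is self-contained
d <- 2
Phi <- array(c(0.5, 0, 0, 0.2, 0, 0, 0, -0.9), dim = c(d, d, 2))
Theta <- array(c(0, 0.1, 0.1, 0, 0, 0, 0, 0.5), dim = c(d, d, 2))
Sigma <- matrix(c(2, 0, 0, 0.25), nrow = d)

## Spectrum at the first frequency pi / 2^9; compare with the str() output of ts.sim$f
f1 <- varma_spec(pi / 2^9, Phi, Theta, Sigma)
round(f1, 4)
```

The result is Hermitian by construction, which is the structure that the estimators discussed below are designed to preserve.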
The autoregressive (AR) and moving average (MA) parameters are chosen in such a way that the spectral matrix displays different degrees of (local) smoothness across different components of the spectral matrix. The code below can be used to plot the simulated time series data and the underlying spectral matrix in the frequency domain.
## Plot time series observations
par(mfrow = c(2,1), mar = c(4.5, 3, 2, 2))
invisible(sapply(1:d, function(i) plot(ts.sim$X[, i], main = paste0("Component ", i), type = "l", xlab = "Time", ylab = "")))
## Plot spectral matrix
layout(mat = matrix(c(1,1,2,3,4,5,6,6), nrow = 4))
plotspec <- function(i){
  if(i[1] == i[2]){
    plot(freq, Re(ts.sim$f[i[1], i[1], ]), main = paste0("Auto-spectrum (", i[1], ", ", i[1], ")"), type = "l", xlab = "Frequency", ylab = "")
  } else{
    plot(freq, Re(ts.sim$f[i[1], i[2], ]), main = paste0("Real cross-spectrum (", i[1], ", ", i[2], ")"), type = "l", xlab = "Frequency", ylab = "")
    plot(freq, Im(ts.sim$f[i[1], i[2], ]), main = paste0("Imag. cross-spectrum (", i[1], ", ", i[2], ")"), type = "l", xlab = "Frequency", ylab = "")
  }
}
invisible(apply(expand.grid(1:d, 1:d), 1, plotspec))
pdPgram()
pdPgram() computes an initial noisy spectral estimator based on an averaged periodogram matrix, rescaled by the manifold bias-correction described in (Chau and von Sachs 2017). If B >= d, i.e. the number of segments over which the averaged periodogram is computed is larger than or equal to the dimension of the time series, then the initial spectral estimator is guaranteed to be Hermitian positive definite (HPD).
pgram <- pdPgram(ts.sim$X)
str(pgram)
#> List of 2
#> $ freq: num [1:512] 0.00614 0.01227 0.01841 0.02454 0.03068 ...
#> $ P : cplx [1:2, 1:2, 1:512] 0.683+0i 0.023+0.0259i 0.023-0.0259i ...
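The role of the condition B >= d can be illustrated with a small base-R sketch that does not use the package. It averages B rank-one matrices z z^*, where the random complex vectors z are only a stand-in for segment-wise Fourier coefficients at one frequency. The helper names avg_outer and min_eig are hypothetical. An average of B such matrices has rank at most B, so it is singular when B < d and positive definite (almost surely, for continuous data) once B >= d.

```r
## Sketch only: why at least d segments are needed for a positive definite average
set.seed(1)

avg_outer <- function(B, d) {
  ## d x B matrix of random complex vectors (stand-ins for segment-wise DFT values)
  Z <- matrix(complex(real = rnorm(B * d), imaginary = rnorm(B * d)), nrow = d)
  (Z %*% Conj(t(Z))) / B  # average of B rank-one Hermitian matrices
}

## Smallest eigenvalue of a Hermitian matrix
min_eig <- function(M) min(eigen(M, symmetric = TRUE, only.values = TRUE)$values)

eig_few    <- min_eig(avg_outer(B = 2, d = 3))  # B < d: numerically zero
eig_enough <- min_eig(avg_outer(B = 3, d = 3))  # B >= d: strictly positive
c(B2 = eig_few, B3 = eig_enough)
```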
pdSpecEst()
pdSpecEst() computes the HPD wavelet-denoised spectral estimator by thresholding the wavelet coefficients of an initial noisy HPD spectral estimator in the manifold wavelet domain. The components of the wavelet coefficients are thresholded based on the hard (keep-or-kill) threshold lam. If lam is unspecified, the threshold is determined data-adaptively by a twofold cross-validation procedure.
f.hat <- pdSpecEst(pgram$P)
str(f.hat)
#> List of 4
#> $ f : cplx [1:2, 1:2, 1:512] 1.1341+0i 0.0392-0.019i 0.0392+0.019i ...
#> $ D :List of 9
#> ..$ M.scale1: cplx [1:2, 1:2, 1:2] 0.524+0i 0.039-0.0001i 0.039+0.0001i ...
#> ..$ D.scale1: cplx [1:2, 1:2, 1:2] -0.1409+0i 0.0925+0.0294i 0.0925-0.0294i ...
#> ..$ D.scale2: cplx [1:2, 1:2, 1:4] 0.0624+0i 0.0845+0.0586i 0.0845-0.0586i ...
#> ..$ D.scale3: cplx [1:2, 1:2, 1:8] 0+0i 0+0i 0+0i ...
#> ..$ D.scale4: cplx [1:2, 1:2, 1:16] 0+0i 0+0i 0+0i ...
#> ..$ D.scale5: cplx [1:2, 1:2, 1:32] 0+0i 0+0i 0+0i ...
#> ..$ D.scale6: cplx [1:2, 1:2, 1:64] 0+0i 0+0i 0+0i ...
#> ..$ D.scale7: cplx [1:2, 1:2, 1:128] 0+0i 0+0i 0+0i ...
#> ..$ D.scale8: cplx [1:2, 1:2, 1:256] 0+0i 0+0i 0+0i ...
#> $ lam : num 2.82
#> $ components:List of 2
#> ..$ thresholded :List of 8
#> .. ..$ : num [1:4, 1:2] -2.869 -0.847 2.664 19.139 3.592 ...
#> .. ..$ : num [1:4, 1:4] 0.863 -1.146 1.652 -3.502 -2.195 ...
#> .. ..$ : num [1:4, 1:8] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ : num [1:4, 1:16] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ : num [1:4, 1:32] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ : num [1:4, 1:64] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ : num [1:4, 1:128] 0 0 0 0 0 0 0 0 0 0 ...
#> .. ..$ : num [1:4, 1:256] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ not_thresholded:List of 8
#> .. ..$ : num [1:4, 1:2] -2.869 -0.847 2.664 19.139 3.592 ...
#> .. ..$ : num [1:4, 1:4] 0.863 -1.146 1.652 -3.502 -2.195 ...
#> .. ..$ : num [1:4, 1:8] 0.733 2.107 1.367 -0.206 0.672 ...
#> .. ..$ : num [1:4, 1:16] 0.1839 -0.0729 -1.9405 1.5717 -0.4328 ...
#> .. ..$ : num [1:4, 1:32] -1.452 0.104 0.989 0.331 -1.599 ...
#> .. ..$ : num [1:4, 1:64] 1.224 0.753 -0.929 0.766 -0.398 ...
#> .. ..$ : num [1:4, 1:128] -0.3187 -0.8156 0.2151 1.1085 -0.0757 ...
#> .. ..$ : num [1:4, 1:256] 0.73 0.379 0.242 -1.559 1.088 ...
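The hard (keep-or-kill) rule itself is simple to state in code. The following base-R sketch applies it to a made-up vector of coefficient components. The function name hard_threshold and the input values are hypothetical, and the sketch does not reproduce how the package normalizes the components or which scales it thresholds.

```r
## Sketch only: generic hard (keep-or-kill) thresholding of a numeric vector
hard_threshold <- function(x, lam) {
  ## Keep components whose absolute value exceeds lam, set all others to zero
  ifelse(abs(x) > lam, x, 0)
}

comps <- c(-3, 0.5, 2.9, -1)  # made-up coefficient components
kept <- hard_threshold(comps, lam = 2.82)
kept
```

Components below the threshold in absolute value are set exactly to zero, which is why most entries at the finer scales in the thresholded list above are zero.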
The figure below shows the true underlying spectral matrix given by ts.sim$f and the estimated spectral matrix obtained from f.hat$f at the frequencies freq. The spectral estimator is able to capture both the very smooth curve behavior in the first auto-spectral component of the matrix and the localized peak in the second auto-spectral component of the matrix, while guaranteeing positive definiteness of the estimator.
Figure: True (left) and estimated (right) spectral matrices in the frequency domain.
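Positive definiteness of an estimated spectrum can be verified numerically, frequency by frequency. The sketch below is a minimal base-R check, assuming the spectrum is stored as a complex d x d x n array like the f components shown above. The helper name is_hpd and the toy array are hypothetical.

```r
## Sketch only: check each d x d slice of a complex array for Hermitian positive definiteness
is_hpd <- function(f, tol = 1e-10) {
  apply(f, 3, function(M) {
    hermitian <- isTRUE(all.equal(M, Conj(t(M))))
    ## eigen(..., symmetric = TRUE) treats a complex matrix as Hermitian
    hermitian && min(eigen(M, symmetric = TRUE, only.values = TRUE)$values) > tol
  })
}

## Toy array: the first slice is HPD, the second is Hermitian but indefinite
f_toy <- array(c(2 + 0i, 0.5 - 0.1i, 0.5 + 0.1i, 1 + 0i,
                 1 + 0i, 2 + 0i, 2 + 0i, 1 + 0i), dim = c(2, 2, 2))
hpd_flags <- is_hpd(f_toy)
hpd_flags
```

With the objects created in this vignette, the analogous call would be all(is_hpd(f.hat$f)).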
pdSpecClust()
pdSpecClust() performs clustering of multivariate spectral matrices via a two-step fuzzy clustering algorithm in the manifold wavelet domain. Below we simulate time series data for ten different subjects from two slightly different vARMA(2,2) processes. Here, the first group of five subjects shares the same spectrum and the second group of five subjects shares a slightly different spectrum. We use pdSpecClust() to assign the different subjects to K=2 clusters in a probabilistic fashion. Note that the true clusters are formed by the first group of five subjects and the last group of five subjects.
Phi1 <- array(c(0.5, 0, 0, 0.1, 0, 0, 0, -0.9), dim = c(d, d, 2))
Phi2 <- array(c(0.5, 0, 0, 0.3, 0, 0, 0, -0.9), dim = c(d, d, 2))
pgram <- function(Phi) pdPgram(rARMA(2^10, d, Phi, Theta, Sigma)$X)$P
P <- array(c(replicate(5, pgram(Phi1)), replicate(5, pgram(Phi2))), dim=c(d, d, 2^8, 10))
pdSpecClust(P, K = 2, lam = 3)
#> Cluster1 Cluster2
#> Subject1 0.09306581 0.90693419
#> Subject2 0.01732452 0.98267548
#> Subject3 0.01527921 0.98472079
#> Subject4 0.02611720 0.97388280
#> Subject5 0.02951176 0.97048824
#> Subject6 0.98101913 0.01898087
#> Subject7 0.98230827 0.01769173
#> Subject8 0.97573420 0.02426580
#> Subject9 0.96717490 0.03282510
#> Subject10 0.97144733 0.02855267
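The fuzzy memberships can be turned into hard cluster labels by assigning each subject to its most probable cluster. The base-R sketch below does this for the membership values printed above, typed in by hand. The object names memb, hard and truth are hypothetical. Cluster labels are arbitrary, so the first group of subjects landing in Cluster2 is still a correct recovery of the two groups.

```r
## Sketch only: hard assignment from the fuzzy memberships printed above
memb <- matrix(c(0.09306581, 0.90693419,
                 0.01732452, 0.98267548,
                 0.01527921, 0.98472079,
                 0.02611720, 0.97388280,
                 0.02951176, 0.97048824,
                 0.98101913, 0.01898087,
                 0.98230827, 0.01769173,
                 0.97573420, 0.02426580,
                 0.96717490, 0.03282510,
                 0.97144733, 0.02855267),
               ncol = 2, byrow = TRUE,
               dimnames = list(paste0("Subject", 1:10), c("Cluster1", "Cluster2")))

## Most probable cluster per subject
hard <- apply(memb, 1, which.max)

## Cross-tabulate against the known generating groups
truth <- rep(c("Group1", "Group2"), each = 5)
table(truth, cluster = hard)
```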
A demo Shiny app for wavelet-based spectral estimation and clustering is available here. The app allows the user to test the wavelet-based spectral matrix estimation or wavelet-based spectral matrix clustering procedures on simulated multivariate time series data. The estimated spectral matrices and cluster assignments can be compared to the known generating spectral matrices and clusters or, in the case of the wavelet-based spectral estimate, with a benchmark multitaper spectral estimate (obtained via pdPgram()).
The app also includes a real brain signal data example consisting of local field potential (LFP) time series trials recorded over the course of an associative learning experiment with a male macaque. The goal of the analysis is to study evolving spectral characteristics of the time series trials over the course of the experiment, and to this end we perform a fuzzy cluster analysis of the trial-specific spectra via pdSpecClust().
Chau, J., and R. von Sachs. 2017. “Positive Definite Multivariate Spectral Estimation: A Geometric Wavelet Approach.” http://arxiv.org/abs/1701.03314.