Suppose we wish to study the relationship between two time series $(X_{t})$ and $(Y_{t})$. Common approaches include:
- Treat one series as an input and the other as the output -- this mimics the regression approach in regular ML.
- Otherwise, treat the two series symmetrically and study their correlation, a generalization of the univariate study of [[The Frequency Domain]]. This note focuses on this approach.

> [!warning] Assumptions
> We assume the series are **time-invariant** (jointly stationary), so in particular $\mathbb{E}[X_{t}],\mathbb{E}[Y_{t}], \mathrm{Var}(X_{t}),\mathrm{Var}(Y_{t}),$ and the cross-covariances at all lags (defined below) are constant in $t$.

> [!definition|*] Cross-covariance
> The **cross-covariance** (at lag $k$) of the two time series is defined to be
> $$\gamma_{XY}(k):= \mathbb{E}[(X_{t}-\mu_{X})(Y_{t+k}-\mu_{Y})],$$
> which is independent of $t$ because the series are assumed to be stationary.
>
> The **cross-correlation** is just $\rho_{XY}(k):= \gamma_{XY}(k) / (\sigma_{X}\sigma_{Y})$.

^6a4f93

- Note that the function is not even, unlike the autocovariance function: in general $\gamma_{XY}(-k)=\gamma_{YX}(k)\ne \gamma_{XY}(k)$.

The naive estimator of the cross-covariance/correlation (for lags $k \ge 0$) is:
$$\begin{align*}
\hat{\gamma}_{XY}(k)&= \frac{1}{N-k}\sum_{t=1}^{N-k}(X_{t}-\bar{X})(Y_{t+k}-\bar{Y}),\\
\hat{\rho}_{XY}(k)&= \hat{\gamma}_{XY}(k) / (s_{X}s_{Y}),
\end{align*}$$
where $\bar{X},\bar{Y}$ are the sample means and $s_{X},s_{Y}$ the sample standard deviations of the de-trended time series.

However, its issues include:
- For neighboring choices of $k$, the estimates are correlated.
- It is easily inflated by the autocovariance within each series, so it often gives spurious correlations.

To remedy these issues, we use the following procedure:
- [[Prewhitening of Time Series|Prewhiten]] the time series to make at least one of them (say $X$) close to white noise (so there is minimal autocovariance), giving a series $\tilde{X}$.
- To keep the cross-covariance functions and linear structures the same, apply the same transformation to $Y$ to get $\tilde{Y}$.
- Study the cross-covariance between the transformed series $\tilde{X}, \tilde{Y}$. The same conclusions should hold for the original series too.

From that, we can test for correlation between the two: if $X,Y$ are uncorrelated and one of them (WLOG $X$) is white noise, then
$$\begin{align*}
\mathbb{E}[\hat{\gamma}_{XY}(k)]&\approx\frac{1}{N-k}\sum_{t=1}^{N-k}\underbrace{\mathbb{E}[X_{t}-\bar{X}]}_{=0}\cdot \mathbb{E}[Y_{t+k}-\bar{Y}]= 0,\\
\mathrm{Var}(\hat{\rho}_{XY}(k))&\approx \frac{1}{N},
\end{align*}$$
so $\hat{\rho}_{XY}(k)$ is also centered near zero. Therefore, a rule of thumb is that a sample cross-correlation larger than $2 / \sqrt{ N }$ in absolute value is significant (at roughly the 5% level).

## The Cross-Spectrum

Similar to the univariate [[Spectral Distribution Function|spectral density function]], the cross-covariance defines the following:

> [!definition|*] The Cross-Spectrum
> The **cross-spectrum** $f_{XY}$ is the [[Discrete-Time Fourier Transforms|discrete-time FT]] of the cross-covariance (up to a scalar multiple):
> $$f_{XY}(\omega):= \frac{1}{2\pi}\sum_{k=-\infty}^{\infty}\gamma_{XY}(k)\exp(-i\omega k),$$
> defined for $\omega \in (-\pi, \pi)$. As with any Fourier transform, the cross-covariance can be recovered with
> $$\gamma_{XY}(k) = \int _{-\pi}^{\pi}e^{i\omega k} f_{XY}(\omega) ~d\omega.$$

^5c77e3

### Decomposition of the Cross Spectrum

As a complex-valued function ($\mathbb{R} \to \mathbb{C}$), the cross-spectrum can be decomposed into real and imaginary parts:

> [!definition|*] Co- and Quadrature Spectrum
> The **co-spectrum** and **quadrature spectrum** of $f_{XY}$ are its real and (the negative of its) imaginary parts:
> $$\begin{align*}
> c(\omega)&:= \mathrm{Re}(f_{XY}(\omega))=\frac{1}{2\pi}\sum_{k=-\infty}^{\infty}\gamma_{XY}(k) \cos(\omega k),\\
> q(\omega)& :=-\mathrm{Im}(f_{XY}(\omega )) =\frac{1}{2\pi}\sum_{k=-\infty}^{\infty}\gamma_{XY}(k)\sin(\omega k).
> \end{align*}$$
> Therefore, $f_{XY}=c-iq$.

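
The naive estimator and the co-/quadrature spectra above can be sketched numerically. This is a minimal illustration, not a recommended spectral estimator: it plugs sample cross-covariances into a raw truncated sum with no smoothing window. The helper names `cross_cov` and `co_quad_spectrum`, the truncation parameter `max_lag`, and the toy data ($Y$ is a copy of white noise $X$ delayed by 3 steps) are all assumptions made for the example.

```python
import numpy as np

def cross_cov(x, y, k):
    """Naive estimate of gamma_XY(k) = E[(X_t - mu_X)(Y_{t+k} - mu_Y)]."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    n = len(x)
    if k >= 0:
        # Pair X_t with Y_{t+k}.
        return float(np.dot(x[: n - k], y[k:]) / (n - k))
    # Negative lag: pair X_t with Y_{t-|k|}.
    return float(np.dot(x[-k:], y[: n + k]) / (n + k))

def co_quad_spectrum(x, y, omega, max_lag):
    """Truncated-sum estimates of c(omega) and q(omega), where f_XY = c - i q."""
    lags = np.arange(-max_lag, max_lag + 1)
    gam = np.array([cross_cov(x, y, int(k)) for k in lags])
    c = np.sum(gam * np.cos(omega * lags)) / (2 * np.pi)
    q = np.sum(gam * np.sin(omega * lags)) / (2 * np.pi)
    return float(c), float(q)

# Toy data (assumption): X is white noise and Y_t = X_{t - delay}.
rng = np.random.default_rng(0)
n, delay = 2000, 3
x = rng.standard_normal(n)
y = np.roll(x, delay)                    # y[t] = x[t - delay] for t >= delay
y[:delay] = rng.standard_normal(delay)   # replace the wrapped-around values

lags = list(range(-10, 11))
rho = [cross_cov(x, y, k) / (np.std(x) * np.std(y)) for k in lags]
peak_lag = lags[int(np.argmax(np.abs(rho)))]
threshold = 2 / np.sqrt(n)               # rule-of-thumb significance bound

c, q = co_quad_spectrum(x, y, omega=0.5, max_lag=10)
print(peak_lag, threshold, c, q)
```

Since $X$ is already white noise here, no prewhitening step is needed; the sample cross-correlation should stand out only at the lag equal to the delay, and the other lags should mostly stay within $\pm 2/\sqrt{N}$.
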

Another decomposition, into the magnitude and argument, gives:

> [!definition|*] Cross-Amplitude and Phase Spectrum
> The **cross-amplitude and phase spectra** $\alpha_{XY}(\omega), \phi(\omega)$ are the magnitude and argument of $f_{XY}$ respectively:
> $$\begin{align*}
> \alpha_{XY}(\omega) &:=| f_{XY}(\omega) |=\sqrt{ c(\omega)^{2}+q(\omega)^{2} },\\
> \phi(\omega)&:=\mathrm{arg}~f_{XY}(\omega)=-\arctan\left( \frac{q(\omega)}{c(\omega)} \right).
> \end{align*}$$

- $\alpha_{XY}(\omega)$ measures the total amount of covariance between the two series at frequency $\omega$.

Lastly, from a linear modeling perspective, we have the following quantities:

> [!definition|*] Coherency and Gain Spectrum
> The **coherency** $C(\omega)$ is given by
> $$C(\omega):= \frac{\alpha_{XY}^{2}}{f_{X}f_{Y}}=\frac{c^{2}+q^{2}}{f_{X}f_{Y}},$$
> where $f_{X},f_{Y}$ are the [[Spectral Distribution Function|spectral density functions]]. It is essentially the squared linear correlation coefficient between the two series at frequency $\omega$ (compare with $\rho^{2}=\mathrm{Cov}(X,Y)^{2} / (\sigma_{X}^{2}\sigma_{Y}^{2})$ for regular RVs), so it lies in $[0,1]$.
>
> The **gain spectrum** $G_{XY}(\omega)$ is given by
> $$G_{XY}(\omega):= \alpha_{XY}(\omega) / f_{X}(\omega),$$
> which corresponds to the univariate linear [](Spectral%20Distribution%20Function.md)

Penn State Lecture Notes: provide examples of cross-covariance plots.

<iframe width = 100% height = 120 src="https://online.stat.psu.edu/stat510/lesson/9/9.1"></iframe>
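
As a rough check of the cross-amplitude, phase, coherency and gain definitions, the sketch below evaluates them in closed form for a simple delayed-signal model. The model is an assumption chosen for illustration and is not from the note: $X_t$ is white noise with variance `sigma2`, and $Y_t = aX_{t-d} + Z_t$ with independent white noise $Z_t$ of variance `tau2`. Only lag $d$ then has non-zero cross-covariance, $\gamma_{XY}(d) = a\sigma^{2}$. The phase is computed with `np.arctan2` instead of $-\arctan(q/c)$ so that the correct branch is picked when $c < 0$.

```python
import numpy as np

# Illustrative model (assumption): Y_t = a * X_{t-d} + Z_t, with X and Z
# independent white noise of variances sigma2 and tau2.
a, d, sigma2, tau2 = 2.0, 3, 1.0, 4.0
omega = np.linspace(0.1, 3.0, 30)

# gamma_XY(k) = a * sigma2 at k = d and 0 elsewhere, so the sums collapse.
c = a * sigma2 * np.cos(omega * d) / (2 * np.pi)   # co-spectrum
q = a * sigma2 * np.sin(omega * d) / (2 * np.pi)   # quadrature spectrum

# Spectral densities of the two series (both flat in this model).
f_x = np.full_like(omega, sigma2 / (2 * np.pi))
f_y = np.full_like(omega, (a**2 * sigma2 + tau2) / (2 * np.pi))

alpha = np.hypot(c, q)               # cross-amplitude |f_XY|
phi = np.arctan2(-q, c)              # argument of f_XY = c - i q
coherency = alpha**2 / (f_x * f_y)   # equals a^2 sigma2 / (a^2 sigma2 + tau2)
gain = alpha / f_x                   # equals |a|

print(coherency[0], gain[0], phi[0])
```

In this model the coherency is the fraction of the variance of $Y$ explained by the delayed $X$ at every frequency, the gain recovers $|a|$, and the phase is $-\omega d$ wrapped into $(-\pi, \pi]$, so its slope reflects the delay.
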