Suppose we wish to study the relationship between two time series $(X_{t})$ and $(Y_{t})$. Common approaches include:
- Treat one series as an input and the other as the output -- this mimics the regression approach in regular ML.
- Otherwise, the two series can be treated equally, and we study their correlation, a generalization of the univariate study of [[The Frequency Domain]]. This note focuses on this approach.
> [!warning] Assumptions
> We assume the series are **time-invariant** (jointly stationary), so in particular $\mathbb{E}[X_{t}],\mathbb{E}[Y_{t}], \mathrm{Var}(X_{t}),\mathrm{Var}(Y_{t}),$ and the cross-covariances at every lag (defined below) are all constant in $t$.
> [!definition|*] Cross-covariance
> The **cross-covariance** (at lag $k$) of the two time series is defined to be:
$\gamma_{XY}(k):= \mathbb{E}[(X_{t}-\mu_{X})(Y_{t+k}-\mu_{Y})],$ where the function is independent of $t$ because the series are assumed to be stationary.
>
> The **cross-correlation** is just $\rho_{XY}(k):= \gamma_{XY}(k) / (\sigma_{X}\sigma_{Y})$.
^6a4f93
- Note that, unlike the autocovariance function, $\gamma_{XY}$ is in general not even in $k$; instead, $\gamma_{XY}(-k)=\gamma_{YX}(k)$.
The naive estimator of the cross-covariance/correlation (for lags $k \ge 0$) is:
$\begin{align*}
\hat{\gamma}_{XY}(k)&= \frac{1}{N-k}\sum_{t=1}^{N-k}(X_{t}-\bar{X})(Y_{t+k}-\bar{Y}),\\
\hat{\rho}_{XY}(k)&= \hat{\gamma}_{XY}(k) / (s_{X}s_{Y}),
\end{align*}$
where $\bar{X},\bar{Y}$ are the sample means and $s_{X},s_{Y}$ the sample standard deviations of the de-trended time series. However, its issues include:
- For neighboring choices of $k$, the estimates are correlated.
- It is easily inflated by the autocovariance within each series, so it often gives spurious correlations.
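
As a minimal sketch, the naive estimator above can be written in Python with NumPy. The function names `cross_covariance` and `cross_correlation`, the handling of negative lags, and the simulated data are assumptions made for illustration, not part of the original note.

```python
import numpy as np

def cross_covariance(x, y, k):
    """Naive estimate of gamma_XY(k) = Cov(X_t, Y_{t+k}); k may be negative."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if len(y) != n or abs(k) >= n:
        raise ValueError("series must have equal length and |k| < N")
    xc, yc = x - x.mean(), y - y.mean()
    if k >= 0:
        prods = xc[: n - k] * yc[k:]      # pairs (X_t, Y_{t+k})
    else:
        prods = xc[-k:] * yc[: n + k]     # same pairing for negative k
    return prods.sum() / (n - abs(k))

def cross_correlation(x, y, k):
    """Naive estimate of rho_XY(k), normalised by the sample standard deviations."""
    return cross_covariance(x, y, k) / (np.std(x) * np.std(y))

# Hypothetical data: Y is X delayed by 3 steps plus independent noise.
rng = np.random.default_rng(0)
n = 500
e = rng.normal(size=n + 3)
x = e[3:]
y = e[:-3] + 0.5 * rng.normal(size=n)   # y[t] = x[t - 3] + noise

lags = range(-10, 11)
best_lag = max(lags, key=lambda k: abs(cross_correlation(x, y, k)))
print("lag with the largest |rho_XY|:", best_lag)
```

Because $Y_{t+3}$ depends on $X_{t}$ in this toy setup, the estimate should peak at $k=3$ rather than at $k=-3$, which also illustrates that the cross-covariance is not an even function of the lag.
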
To remedy these issues, we use the following procedure:
- [[Prewhitening of Time Series|Prewhiten]] the time series to make at least one of them (say $X$) close to white noise (so there is minimal autocovariance), giving a series $\tilde{X}$.
- To preserve the linear relationship between the two series, apply the same transformation to $Y$ to get $\tilde{Y}$.
- Study the cross covariance between the transformed series $\tilde{X}, \tilde{Y}$. Conclusions about which lags are related should carry over to the original series.
From that, we can test for correlation between the two: if two series $X,Y$ are uncorrelated and one of them (WLOG $X$) is white noise, then approximately $\begin{align*}
\mathbb{E}[\hat{\gamma}_{XY}(k)]&\approx\frac{1}{N-k}\sum_{t=1}^{N-k}\underbrace{\mathbb{E}[X_{t}-\mu_{X}]}_{=0}\cdot \mathbb{E}[Y_{t+k}-\mu_{Y}]= 0 \quad\Rightarrow\quad \mathbb{E}[\hat{\rho}_{XY}(k)]\approx 0,\\
\mathrm{Var}(\hat{\rho}_{XY}(k))&\approx \frac{1}{N}.
\end{align*}$ Therefore, a rule of thumb is that a sample cross-correlation with $|\hat{\rho}_{XY}(k)| > 2 / \sqrt{ N }$ (about two standard deviations) is significant.
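
The procedure and the $2/\sqrt{N}$ rule can be sketched as follows. This is only an illustration under stated assumptions: the two series are simulated independent AR(1) processes, the prewhitening filter is a fitted AR(1) model whose coefficient is estimated by the lag-1 sample autocorrelation, and the helper names (`ar1`, `lag1_autocorr`, `ccf`) are hypothetical. The linked prewhitening note may use a different model.

```python
import numpy as np

def ar1(phi, n, rng, burn=200):
    """Simulate a zero-mean AR(1) series x_t = phi * x_{t-1} + e_t (burn-in dropped)."""
    e = rng.normal(size=n + burn)
    x = np.zeros(n + burn)
    for t in range(1, n + burn):
        x[t] = phi * x[t - 1] + e[t]
    return x[burn:]

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation."""
    xc = x - x.mean()
    return (xc[:-1] * xc[1:]).sum() / (xc * xc).sum()

def ccf(x, y, max_lag):
    """Naive sample cross-correlations rho_XY(k) for k = -max_lag..max_lag."""
    n = len(x)
    xc, yc = x - x.mean(), y - y.mean()
    denom = x.std() * y.std()
    out = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            s = (xc[: n - k] * yc[k:]).sum()
        else:
            s = (xc[-k:] * yc[: n + k]).sum()
        out[k] = s / ((n - abs(k)) * denom)
    return out

rng = np.random.default_rng(0)
N = 2000
x = ar1(0.9, N, rng)
y = ar1(0.9, N, rng)                 # generated independently of x

# Step 1: prewhiten X with a fitted AR(1) filter (assumed model).
phi_hat = lag1_autocorr(x)
x_tilde = x[1:] - phi_hat * x[:-1]
# Step 2: apply the *same* filter to Y.
y_tilde = y[1:] - phi_hat * y[:-1]

# Step 3: compare cross-correlations against the 2 / sqrt(N) rule of thumb.
threshold = 2 / np.sqrt(len(x_tilde))
raw_max = max(abs(v) for v in ccf(x, y, 20).values())
white_max = max(abs(v) for v in ccf(x_tilde, y_tilde, 20).values())
print(f"threshold={threshold:.3f}  raw max|rho|={raw_max:.3f}  prewhitened max|rho|={white_max:.3f}")
```

With strongly autocorrelated inputs, the raw cross-correlations typically stray well beyond the threshold even though the series are independent, while the prewhitened ones mostly stay within it.
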
## The Cross-Spectrum
Similar to the univariate [[Spectral Distribution Function|spectral density function]], the cross-covariance defines the following:
> [!definition|*] The Cross-Spectrum
> The **cross-spectrum** $f_{XY}$ is the [[Discrete-Time Fourier Transforms|discrete-time FT]] of the cross-covariance function (up to a scalar multiple): $f_{XY}(\omega):= \frac{1}{2\pi}\sum_{k=-\infty}^{\infty}\gamma_{XY}(k)\exp(-i\omega k),$ defined for $\omega \in (-\pi, \pi)$. As with Fourier transforms in general, the cross-covariance can be recovered with $\gamma_{XY}(k) = \int _{-\pi}^{\pi}e^{i\omega k} f_{XY}(\omega) ~d\omega$.
^5c77e3
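
The transform pair above can be checked numerically. The sketch below assumes a hypothetical cross-covariance sequence that vanishes outside lags $-3,\dots,3$ (the values are made up for illustration) and approximates the inversion integral by a Riemann sum on an equally spaced frequency grid, which is exact for such a finite sum of complex exponentials.

```python
import numpy as np

def cross_spectrum(gamma, lags, omega):
    """f_XY(omega) = (1 / 2pi) * sum_k gamma_XY(k) * exp(-i * omega * k)."""
    omega = np.atleast_1d(omega)
    return (gamma * np.exp(-1j * np.outer(omega, lags))).sum(axis=1) / (2 * np.pi)

# Hypothetical cross-covariances gamma_XY(k) for k = -3..3 (zero elsewhere).
lags = np.arange(-3, 4)
gamma = np.array([0.0, 0.1, -0.2, 0.5, 0.8, 0.3, 0.0])

# Equally spaced grid on [-pi, pi).
M = 64
omega = -np.pi + 2 * np.pi * np.arange(M) / M
f_xy = cross_spectrum(gamma, lags, omega)

# Inversion: gamma_XY(k) = integral of exp(i * omega * k) * f_XY(omega) over [-pi, pi].
recovered = np.array(
    [(np.exp(1j * omega * k) * f_xy).sum() * (2 * np.pi / M) for k in lags]
)
print(np.round(recovered.real, 6))
```
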
### Decomposition of the Cross Spectrum
As a complex-valued function ($\mathbb{R} \to \mathbb{C}$), the cross-spectrum can be decomposed into real and imaginary parts:
> [!definition|*] Co- and Quadrature Spectrum
> The **co-spectrum** and **quadrature spectrum** of $f_{XY}$ are its real part and (the negative of) its imaginary part: $\begin{align*}
c(\omega)&:= \mathrm{Re}(f_{XY}(\omega))=\frac{1}{2\pi}\sum_{k=-\infty}^{\infty}\gamma_{XY}(k) \cos(\omega k),\\
q(\omega)& :=-\mathrm{Im}(f_{XY}(\omega )) =\frac{1}{2\pi}\sum_{k=-\infty}^{\infty}\gamma_{XY}(k)\sin(\omega k).
\end{align*}$ Therefore, $f_{XY}=c-iq$.
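
A quick numerical check of $f_{XY}=c-iq$, using the cosine and sine sums directly. The cross-covariance values are again hypothetical. The last lines illustrate that the quadrature spectrum vanishes when $\gamma_{XY}$ happens to be even in $k$, so $q$ reflects the asymmetry of the cross-covariance.

```python
import numpy as np

# Hypothetical cross-covariances gamma_XY(k) for k = -3..3 (zero elsewhere).
lags = np.arange(-3, 4)
gamma = np.array([0.0, 0.1, -0.2, 0.5, 0.8, 0.3, 0.0])

omega = np.linspace(-np.pi, np.pi, 201)
wk = np.outer(omega, lags)                      # omega * k for every (omega, k) pair

c = (gamma * np.cos(wk)).sum(axis=1) / (2 * np.pi)        # co-spectrum
q = (gamma * np.sin(wk)).sum(axis=1) / (2 * np.pi)        # quadrature spectrum
f_xy = (gamma * np.exp(-1j * wk)).sum(axis=1) / (2 * np.pi)

print("max |f - (c - iq)|:", np.abs(f_xy - (c - 1j * q)).max())

# An even cross-covariance sequence gives a zero quadrature spectrum.
gamma_even = (gamma + gamma[::-1]) / 2
q_even = (gamma_even * np.sin(wk)).sum(axis=1) / (2 * np.pi)
```
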
Another decomposition into the magnitude and argument gives
> [!definition|*] Cross-Amplitude and Phase Spectrum
> The **cross-amplitude and phase spectra** $\alpha_{XY}(\omega), \phi(\omega)$ are the magnitude and argument of $f_{XY}$ respectively: $\begin{align*}
\alpha_{XY}(\omega) &:=| f_{XY}(\omega) |=\sqrt{ c(\omega)^{2}+q(\omega)^{2} },\\
\phi(\omega)&:=\mathrm{arg}~f_{XY}(\omega)=-\arctan\left( q(\omega) / c(\omega) \right)
\end{align*}$
- $\alpha_{XY}(\omega)$ measures the total amount of covariance between the two series at frequency $\omega$.
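
As an illustration, consider a pure delay $Y_{t}=X_{t-d}$ with $X$ white noise of variance $\sigma_{X}^{2}$ (an assumed toy model, not taken from the note). Then $\gamma_{XY}(k)=\sigma_{X}^{2}$ for $k=d$ and zero otherwise, so $f_{XY}(\omega)=\frac{\sigma_{X}^{2}}{2\pi}e^{-i\omega d}$: the amplitude is flat and the phase is $-\omega d$. The sketch also shows that $-\arctan(q/c)$ only agrees with the argument where $c(\omega)>0$; `np.arctan2` handles the other quadrants.

```python
import numpy as np

d, var_x = 1, 1.0                        # assumed delay and white-noise variance
omega = np.linspace(-3.0, 3.0, 13)       # frequencies inside (-pi, pi)

f_xy = var_x / (2 * np.pi) * np.exp(-1j * omega * d)
c, q = f_xy.real, -f_xy.imag             # f_XY = c - i q

alpha = np.hypot(c, q)                   # cross-amplitude spectrum |f_XY|
phi = np.arctan2(-q, c)                  # phase spectrum, quadrant-aware
phi_naive = -np.arctan(q / c)            # matches phi only where c > 0

for w, a, p in zip(omega, alpha, phi):
    print(f"omega={w:+.2f}  alpha={a:.4f}  phi={p:+.2f}")
```
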
Lastly, from a linear modeling perspective, we have the following decomposition:
> [!definition|*] Coherency and Gain Spectrum
> The **coherency** $C(\omega)$ is given by $C(\omega):= \alpha_{XY}^{2} / (f_{X}f_{Y})=\frac{c^{2}+q^{2}}{f_{X}f_{Y}},$ where $f_{X},f_{Y}$ are the [[Spectral Distribution Function|spectral density functions]] of $X$ and $Y$. It is essentially the squared linear correlation coefficient between the two series at frequency $\omega$ (compare with $\rho^{2}=\mathrm{Cov}(X,Y)^{2} / (\sigma_{X}^{2}\sigma_{Y}^{2})$ for regular RVs), so $0 \le C(\omega) \le 1$.
>
> The **gain spectrum** $G_{XY}(\omega)$ is given by $G_{XY}(\omega):= \alpha_{XY}(\omega) / f_{X}(\omega),$ which corresponds to the univariate linear regression coefficient (of $Y$ on $X$, at frequency $\omega$).

Penn State Lecture Notes: provides examples of cross-correlation plots.
<iframe width = 100% height = 120 src="https://online.stat.psu.edu/stat510/lesson/9/9.1"></iframe>
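
To make the coherency and gain spectrum defined above concrete, the sketch below evaluates them for an assumed toy model $Y_{t}=aX_{t-d}+Z_{t}$, where $X$ and $Z$ are independent white-noise series. All parameter values and the helper name `dtft` are hypothetical. For this model the theoretical values are $C(\omega)=a^{2}\sigma_{X}^{2}/(a^{2}\sigma_{X}^{2}+\sigma_{Z}^{2})$ and $G_{XY}(\omega)=|a|$ at every frequency.

```python
import numpy as np

def dtft(gamma, lags, omega):
    """(1 / 2pi) * sum_k gamma(k) * exp(-i * omega * k) on a grid of frequencies."""
    return (gamma * np.exp(-1j * np.outer(omega, lags))).sum(axis=1) / (2 * np.pi)

# Assumed model: Y_t = a * X_{t-d} + Z_t with independent white noises X and Z.
a, d, var_x, var_z = 2.0, 3, 1.0, 1.0
lags = np.arange(-5, 6)

gamma_x = np.where(lags == 0, var_x, 0.0)                    # autocovariance of X
gamma_y = np.where(lags == 0, a**2 * var_x + var_z, 0.0)     # autocovariance of Y
gamma_xy = np.where(lags == d, a * var_x, 0.0)               # Cov(X_t, Y_{t+k})

omega = np.linspace(-np.pi, np.pi, 101)
f_x = dtft(gamma_x, lags, omega).real
f_y = dtft(gamma_y, lags, omega).real
f_xy = dtft(gamma_xy, lags, omega)

alpha = np.abs(f_xy)
coherency = alpha**2 / (f_x * f_y)       # C(omega)
gain = alpha / f_x                       # G_XY(omega)

print("coherency:", coherency[0], " gain:", gain[0])
```

The coherency falls below 1 only because of the independent noise $Z$; setting `var_z = 0.0` in this sketch makes $Y$ an exact linear filter of $X$ and gives $C(\omega)=1$.
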