Synthetic Control

> [!tldr] Synthetic Control
> **Synthetic control** is a way of estimating the counterfactual of a treated subject with a "synthetic subject" that is a convex weighted average of control subjects.
>
> The control subjects form the **donor pool**.

![[SyntheticControlGermany.png#invert|w60|center]]

Notation: suppose we have panel data, where the $i$th entity is measured at time $t$ to be $Y_{it}$, and depending on the treatment $W$ it takes the values $Y_{it}^{(0)}$ (untreated) and $Y_{it}^{(1)}$ (treated).

Now WLOG let $Y_{1}$ be the treated subject, where treatment is applied over $[T_{0}, T]$ ($T$ being the end of measurement). We wish to estimate $Y^{(0)}_{1t}$ for $t \ge T_{0}$, the counterfactual after the treatment was applied.

A standard idea is to approximate $Y_{1}$ with a linear combination of the other subjects, with weights $w$ fitted on the pre-treatment data:

$$Y_{1t} \approx \sum_{i \ne 1}w_{i}Y_{it}~~~~ (t < T_{0}).$$

With those weights, we then estimate the counterfactual as

$$\hat{Y}^{(0)}_{1t}=\sum_{i \ne 1}w_{i}Y_{it}~~~~(t \ge T_{0}).$$

One might try a linear regression of $Y_{1} \sim Y_{2},\dots,Y_{n}$ using the data before the treatment (potentially without an intercept), and use the fitted coefficients as $w$. This method suffers from a few drawbacks:
- The values of $w$ can be nonsensical: they can be negative or not add up to $1$, hurting interpretability.
- When $n$ is large compared to $T_{0}$, this can lead to overfitting or even non-unique solutions, as we are fitting the pre-treatment observations of one time series with $n-1$ coefficients.

Synthetic control uses an alternative: it projects $\mathbf{y}_{1}$ onto the **convex hull** of $\{ \mathbf{y}_{2},\dots,\mathbf{y}_{n} \}$ ($\mathbf{y}_{i}$ containing the data up to $T_{0}-1$), i.e. it finds the point of the hull closest to $\mathbf{y}_{1}$ in Euclidean distance.
- Staying inside the convex hull requires the weights to be non-negative and add up to $1$.
- The solution also tends to be sparse, i.e. most controls have a weight of $0$, since the projection typically lands on a face of the hull spanned by only a few controls.

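
The projection onto the convex hull can be written as a constrained least-squares problem: minimise $\lVert \mathbf{y}_{1} - \sum_{i \ne 1} w_{i}\mathbf{y}_{i} \rVert^{2}$ subject to $w_{i} \ge 0$ and $\sum_{i \ne 1} w_{i} = 1$. Below is a minimal sketch of that fit, assuming NumPy and SciPy are available. The solver (SciPy's general-purpose SLSQP), the function name `fit_synthetic_weights`, and the toy data are all choices made here for illustration, not something the method prescribes.

```python
import numpy as np
from scipy.optimize import minimize


def fit_synthetic_weights(y_treated_pre, Y_donors_pre):
    """Least squares over the simplex (w >= 0, sum(w) = 1).

    y_treated_pre: shape (T0,), treated unit before treatment.
    Y_donors_pre:  shape (T0, J), one column per donor.
    """
    n_donors = Y_donors_pre.shape[1]

    def loss(w):
        resid = y_treated_pre - Y_donors_pre @ w
        return resid @ resid

    def grad(w):
        return -2.0 * Y_donors_pre.T @ (y_treated_pre - Y_donors_pre @ w)

    result = minimize(
        loss,
        x0=np.full(n_donors, 1.0 / n_donors),  # start from uniform weights
        jac=grad,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        options={"ftol": 1e-12, "maxiter": 1000},
    )
    w = np.clip(result.x, 0.0, None)  # remove tiny negative round-off
    return w / w.sum()


# Toy data (assumption): the treated unit is an exact convex combination
# of two donors before treatment, then shifts by +2 afterwards.
rng = np.random.default_rng(0)
T0, T, J = 30, 40, 5
Y_donors = rng.normal(size=(T, J))
true_w = np.array([0.6, 0.4, 0.0, 0.0, 0.0])
y_treated = Y_donors @ true_w
y_treated[T0:] += 2.0

w = fit_synthetic_weights(y_treated[:T0], Y_donors[:T0])
counterfactual = Y_donors[T0:] @ w          # \hat{Y}^{(0)}_{1t} for t >= T0
effect = y_treated[T0:] - counterfactual    # estimated treatment effect
```

Only the rows before `T0` enter the fit; the post-treatment rows are used solely to form the counterfactual. In this noiseless toy setup the fitted weights should land on the two donors that actually generated the treated series, which is the sparsity described above.
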
### Inference with Permutation Tests

One way of making inference with synthetic controls is to iterate over the subjects, pretending in turn that each one is the treated subject and creating its synthetic control from the rest. *If the treatment has a significant effect, the real treated subject will stand out* when we plot the gap $(Y_{it}-\hat{Y}_{it})$ against $t$ for each $i$.
- Of course, there will be subjects whose fit is bad both before and after $T_{0}$, and we can remove them from the comparison.
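
The placebo loop can be sketched as follows, again assuming NumPy and SciPy. The helper names (`simplex_weights`, `placebo_gaps`, `rmse`), the one-factor toy panel, and the post/pre RMSE ratio used to summarise each gap curve are illustrative assumptions. The ratio is one common way to discount subjects that fit badly throughout, standing in here for the visual inspection described above.

```python
import numpy as np
from scipy.optimize import minimize


def simplex_weights(y_pre, donors_pre):
    """Least-squares weights constrained to be >= 0 and sum to 1."""
    J = donors_pre.shape[1]
    res = minimize(
        lambda w: np.sum((y_pre - donors_pre @ w) ** 2),
        x0=np.full(J, 1.0 / J),
        jac=lambda w: -2.0 * donors_pre.T @ (y_pre - donors_pre @ w),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * J,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        options={"ftol": 1e-12, "maxiter": 1000},
    )
    w = np.clip(res.x, 0.0, None)
    return w / w.sum()


def placebo_gaps(Y, T0):
    """Y has shape (T, n). Column i of the result is Y_it - Yhat_it,
    where unit i is treated as if it were the treated subject."""
    T, n = Y.shape
    gaps = np.empty((T, n))
    for i in range(n):
        donors = np.delete(Y, i, axis=1)               # everyone except unit i
        w = simplex_weights(Y[:T0, i], donors[:T0])    # fit on pre-period only
        gaps[:, i] = Y[:, i] - donors @ w
    return gaps


def rmse(x):
    """Column-wise root mean squared value."""
    return np.sqrt(np.mean(x ** 2, axis=0))


# Toy panel (assumption): common trend + one factor + small noise.
# Unit 0 is the truly treated subject, with a +3 shift from T0 onwards.
rng = np.random.default_rng(1)
T, T0 = 40, 30
loadings = np.array([0.5, 0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
factor = np.sin(np.arange(T) / 3.0)
Y = (0.1 * np.arange(T)[:, None]
     + np.outer(factor, loadings)
     + rng.normal(scale=0.05, size=(T, loadings.size)))
Y[T0:, 0] += 3.0

gaps = placebo_gaps(Y, T0)
# Post/pre RMSE ratio: large only if the gap opens up after T0.
ratio = rmse(gaps[T0:]) / rmse(gaps[:T0])
```

Plotting each column of `gaps` against time gives the picture described above. Note that in this sketch the truly treated unit stays in the donor pool of every placebo run, so placebo units that lean on it inherit part of its post-treatment shift.
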