19 Same Data, Different Estimators: A Synthetic Control Comparison

Published

May 15, 2026

19.1 One panel, four questions

The cleanest way to understand what a panel estimator actually does is to hold the data fixed and change only the estimator. If the estimates agree, the choice of method is a minor implementation detail. If they diverge, each gap is an empirical lesson about what the estimator assumed.

California’s Proposition 99 (1988) is the canonical synthetic control dataset: 39 US states, annual cigarette sales per capita from 1970 to 2000, treatment beginning in 1989. We apply four estimators to exactly the same outcome matrix — DiD, Synthetic DiD, Synthetic Control, and TASC — and ask what drives the differences.

19.2 Four estimators, one panel

using CSV, DataFrames, Statistics
using SynthDiD
using TASC

df    = CSV.read("california_prop99.csv", DataFrame)
# ... pivot to N×T matrix Y, N0 = 38 donors, T0 = 19 pre-treatment years
# (full setup code below)

tau_did  = did_estimate(Y, N0, T0)
tau_sdid = synthdid_estimate(Y, N0, T0)
tau_sc   = sc_estimate(Y, N0, T0)

# TASC: treated unit must be row 1
Y_tasc = vcat(Y[(N0+1):end, :], Y[1:N0, :])
model  = fit_tasc(Y_tasc; d=2, T0=T0, n_em=200, tol=1e-3)
pred   = predict_counterfactual(model, Y_tasc)

Estimator	ATT (packs per capita)
Difference-in-differences	−27.4
Synthetic control	−19.6
Synthetic DiD	−16.1
TASC	−17.1

About eleven packs per capita separate the smallest and largest estimates (−16.1 and −27.4). That is not rounding error. It is a direct consequence of what each estimator treats as a valid control comparison.

19.3 What changes across estimators

DiD (−27.4) assigns equal weight to every control state and every pre-treatment year, so California is compared with the simple average of all 38 donor states. As the figure shows, that equal-weighted composite tracks California’s pre-treatment trajectory poorly (it sits far below it), and the parallel-trends counterfactual built on it ends up too high after 1988: it predicts California would have fallen much less without Prop 99 than a closer match would suggest. The large ATT reflects the poor pre-treatment fit of equal weights, not a larger causal effect.
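To make the equal-weighting concrete, here is a minimal sketch of the double difference computed by hand on a toy panel. The function name `did_by_hand` and the toy numbers are hypothetical; this is not the implementation behind `did_estimate`, only the textbook formula under the same donors-first row layout.

```julia
using Statistics

# Equal-weight difference-in-differences on an N×T outcome matrix.
# Rows 1:N0 are donors, rows N0+1:end are treated; columns 1:T0 are pre-treatment.
function did_by_hand(Y::AbstractMatrix, N0::Integer, T0::Integer)
    ctrl_pre  = mean(Y[1:N0, 1:T0])
    ctrl_post = mean(Y[1:N0, T0+1:end])
    trt_pre   = mean(Y[N0+1:end, 1:T0])
    trt_post  = mean(Y[N0+1:end, T0+1:end])
    # Change in the treated unit minus change in the simple donor average.
    return (trt_post - trt_pre) - (ctrl_post - ctrl_pre)
end

# Toy panel (made-up numbers): two donors, one treated unit, two pre and two post periods.
Y_toy = [10.0 10.0  9.0  9.0;   # donor 1 falls by 1
         20.0 20.0 19.0 19.0;   # donor 2 falls by 1
         15.0 15.0 11.0 11.0]   # treated unit falls by 4
tau_toy = did_by_hand(Y_toy, 2, 2)   # (11 - 15) - (14 - 15) = -3
```

Every donor and every pre-treatment year enters with the same weight, which is exactly the restriction the next two estimators relax.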

Synthetic control (−19.6) replaces equal weights with optimised donor weights: a convex combination of states that best matches California’s 1970–1988 cigarette sales. The pre-period fit is tight. Once the control composite closely tracks California before treatment, the post-treatment gap shrinks to a more credible estimate. The roughly 8-pack reduction in magnitude from DiD to SC (−27.4 to −19.6) is attributable to the better pre-treatment match: same data, better control group.
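The convex-combination idea can be sketched as a least-squares problem constrained to the probability simplex. The version below uses projected gradient descent, chosen here only because it needs nothing beyond the standard library; it is an assumption for illustration, not necessarily how `sc_estimate` solves the problem, and `project_simplex` and `sc_weights_sketch` are hypothetical names.

```julia
using LinearAlgebra

# Euclidean projection of a vector onto the probability simplex
# {w : w .>= 0, sum(w) == 1}, using the standard sort-based algorithm.
function project_simplex(v::AbstractVector)
    u = sort(v; rev=true)
    css = cumsum(u)
    k = findlast(i -> u[i] + (1 - css[i]) / i > 0, eachindex(u))
    theta = (css[k] - 1) / k
    return max.(v .- theta, 0.0)
end

# Convex donor weights minimising the pre-treatment squared error
# ||X_pre' * w - y_pre||^2 by projected gradient descent.
# X_pre is N0×T0 (donors × pre-periods); y_pre has length T0.
function sc_weights_sketch(X_pre::AbstractMatrix, y_pre::AbstractVector; iters=5_000)
    N0 = size(X_pre, 1)
    w = fill(1 / N0, N0)                  # start from equal weights
    step = 1 / (2 * opnorm(X_pre)^2)      # 1/L, with L the gradient's Lipschitz constant
    for _ in 1:iters
        grad = 2 .* (X_pre * (X_pre' * w .- y_pre))
        w = project_simplex(w .- step .* grad)
    end
    return w
end

# Toy example (made-up numbers): the treated unit is exactly 70% donor 1 + 30% donor 2.
X_pre = [1.0 2.0 3.0 4.0;
         4.0 3.0 2.0 1.0;
         1.0 3.0 1.0 3.0]
y_pre = 0.7 .* X_pre[1, :] .+ 0.3 .* X_pre[2, :]
w_toy = sc_weights_sketch(X_pre, y_pre)   # approximately [0.7, 0.3, 0.0]
```

The third donor receives essentially zero weight: sparsity falls out of the simplex constraint rather than being imposed separately.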

Synthetic DiD (−16.1) adds time weights on top of unit weights. The SDiD objective up-weights the pre-treatment years whose donor outcomes best predict the donors’ post-treatment averages, typically the most recent ones. That reduces the influence of the distant 1970s, where the data can only loosely inform the post-1988 counterfactual. The further reduction in magnitude from SC to SDiD (−19.6 to −16.1, about 3.5 packs) reflects those time weights discounting the early pre-period.
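Given a set of unit weights and time weights, the resulting estimate is a weighted double difference. The sketch below takes both weight vectors as inputs rather than estimating them, so it illustrates only how the weights enter; `weighted_did` and the toy numbers are hypothetical and not part of `SynthDiD.jl`.

```julia
using LinearAlgebra, Statistics

# Weighted double difference with treated units in the last rows.
# omega: donor weights (length N0); lambda: pre-period weights (length T0).
# Both are assumed non-negative and to sum to one.
function weighted_did(Y::AbstractMatrix, N0::Integer, T0::Integer,
                      omega::AbstractVector, lambda::AbstractVector)
    donors  = Y[1:N0, :]
    treated = vec(mean(Y[N0+1:end, :]; dims=1))
    gap  = treated .- donors' * omega      # treated minus weighted donors, per year
    post = mean(gap[T0+1:end])             # average post-treatment gap
    pre  = dot(lambda, gap[1:T0])          # lambda-weighted pre-treatment gap
    return post - pre
end

# Toy panel (made-up numbers) in which the treated unit drifts toward the donors before treatment.
Y_toy = [10.0 10.0  9.0  9.0;
         20.0 20.0 19.0 19.0;
         16.0 15.0 11.0 11.0]
omega_eq = [0.5, 0.5]

tau_uniform = weighted_did(Y_toy, 2, 2, omega_eq, [0.5, 0.5])  # plain DiD: -3.5
tau_recent  = weighted_did(Y_toy, 2, 2, omega_eq, [0.0, 1.0])  # last pre-year only: -3.0
```

With uniform weights the function reproduces ordinary DiD; putting all time weight on the last pre-treatment year changes the baseline gap and therefore the estimate.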

TASC (−17.1) takes a different route entirely. Instead of choosing weights, it fits a state-space model to the pre-treatment panel:

\[ x_t = A\,x_{t-1} + q_t,\quad q_t \sim \mathcal{N}(0,Q) \]

\[ y_t = H\,x_t + r_t,\quad r_t \sim \mathcal{N}(0,R) \]

The matrix $A$ captures how the latent factors evolve over time, the piece that DiD, SC, and SDiD all ignore. After treatment, the Kalman filter continues to run on the donor panel to infer the latent state; the treated unit’s row of $H$ then maps that state to a counterfactual. The TASC estimate (−17.1) lands between SC (−19.6) and SDiD (−16.1), consistent with its low-rank factor model being a different but comparably principled shrinkage of the donor pool.
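The filtering step described above can be sketched in a few lines. This is a minimal illustration that assumes $A$, $H$, $Q$, and $R$ are already known (it does not estimate them) and follows the treated-unit-in-row-1 layout; `kalman_counterfactual` is a hypothetical name and the code is not taken from `TASC.jl`.

```julia
using LinearAlgebra, Statistics

# Kalman filter that updates on every unit before treatment and on donors only
# (rows 2:N) afterwards, then reads off the treated unit's implied outcome.
function kalman_counterfactual(Y, A, H, Q, R, T0;
                               x0=zeros(size(A, 1)),
                               P0=Matrix(1.0I, size(A)...))
    N, T = size(Y)
    x, P = copy(x0), copy(P0)
    cf_mean, cf_var = zeros(T), zeros(T)
    for t in 1:T
        # Predict: x_t = A x_{t-1} + q_t
        x = A * x
        P = A * P * A' + Q
        # Update with the rows that are informative about the untreated state.
        rows = t <= T0 ? (1:N) : (2:N)
        Hs, Rs = H[rows, :], R[rows, rows]
        S = Hs * P * Hs' + Rs
        K = P * Hs' / S
        x = x + K * (Y[rows, t] - Hs * x)
        P = (I - K * Hs) * P
        # Map the latent state to the treated unit through its row of H.
        h1 = H[1, :]
        cf_mean[t] = dot(h1, x)
        cf_var[t]  = dot(h1, P * h1) + R[1, 1]
    end
    return cf_mean, cf_var
end

# Toy panel (made-up numbers): one latent factor shared by all units,
# with the treated unit shifted down by 3 after T0.
T0_toy = 3
truth  = [10.0 - 0.5t for t in 1:6]
Y_toy  = repeat(truth', 3, 1)
Y_toy[1, T0_toy+1:end] .-= 3.0

A_toy, Q_toy = fill(1.0, 1, 1), fill(0.5, 1, 1)
H_toy, R_toy = ones(3, 1), 0.01 * Matrix(1.0I, 3, 3)

cf_mean, cf_var = kalman_counterfactual(Y_toy, A_toy, H_toy, Q_toy, R_toy, T0_toy; x0=[10.0])
att_toy = mean(Y_toy[1, T0_toy+1:end] .- cf_mean[T0_toy+1:end])   # close to -3
```

Because the treated unit's post-treatment observations never enter the update, the counterfactual is driven entirely by what the donors reveal about the latent state.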

19.4 The figure

# Counterfactual path for each method
cfact_did  = omega_did'  * Y[1:N0, :] .+ intercept_did   # parallel trends
cfact_sdid = omega_sdid' * Y[1:N0, :] .+ intercept_sdid
cfact_sc   = omega_sc'   * Y[1:N0, :]                     # no intercept
cfact_tasc = vec(pred.target)
tasc_se    = sqrt.(max.(vec(pred.variance), 0.0))

Figure: Four counterfactual paths for California. Dashed lines (DiD, SDiD) use pre-period intercepts to enforce parallel trends; solid lines (SC, TASC) fit levels directly. The shaded band is the TASC 95% posterior interval $\hat Y_{1t}(0) \pm 1.96\,\hat\sigma_t$.

Two things are visible in the figure that the table cannot show.

First, the pre-treatment paths diverge. DiD’s equal-weighted control composite (dashed red) sits well below California in the pre-period. SC and TASC track California closely: that is what fitting California’s pre-period directly buys, through weight optimisation for SC and through the state-space fit for TASC. The DiD counterfactual’s poor pre-treatment fit is the whole explanation for its larger post-treatment gap.
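The visual comparison can be put into numbers with two summaries: the pre-period root-mean-square gap and the average post-period gap. The helper names and the series below are invented for illustration and are not the Prop 99 data; treating the ATT as the mean post-treatment gap is an assumption about how the table's figures are defined.

```julia
using Statistics

# Root-mean-square gap between the treated series and a counterfactual
# over the pre-treatment window 1:T0.
pre_rmse(y, cf, T0) = sqrt(mean(abs2, y[1:T0] .- cf[1:T0]))

# Average post-treatment gap (observed minus counterfactual).
post_gap(y, cf, T0) = mean(y[T0+1:end] .- cf[T0+1:end])

# Made-up series: three pre-treatment and two post-treatment periods.
y_obs    = [100.0, 98.0, 96.0, 80.0, 78.0]
cf_loose = [ 92.0, 94.0, 96.0, 98.0, 100.0]   # poor pre-period match
cf_tight = [100.5, 97.5, 96.0, 94.0, 92.0]    # close pre-period match
T0_toy   = 3

rmse_loose, rmse_tight = pre_rmse(y_obs, cf_loose, T0_toy), pre_rmse(y_obs, cf_tight, T0_toy)
gap_loose,  gap_tight  = post_gap(y_obs, cf_loose, T0_toy), post_gap(y_obs, cf_tight, T0_toy)
```

In this toy case the loosely fitting counterfactual implies a gap of −20 against −14 for the tight one, which mirrors the pattern in the table without reproducing its numbers.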

Second, the TASC posterior band is narrow in the pre-period and widens after 1989. This is the right shape. Before treatment, the Rauch–Tung–Striebel (RTS) smoother borrows information from both directions in time and from all donor units, so latent state uncertainty is low. After treatment, the filter runs forward without California’s observations; each additional post-treatment year compounds one more step of state uncertainty via $P_{t+1} = AP_tA^\top + Q$. The other three estimators, as used here, report only point estimates along the post-treatment path, not because the future is more predictable than the past, but because they have no time-series model to tell them otherwise.
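The widening can be reproduced by iterating the covariance recursion on its own. The sketch below applies only the prediction step, with made-up values for $A$, $Q$, the starting covariance, and a loading vector `h`; in the full filter the donor updates shrink the covariance at each step, so this pure-prediction version overstates the growth.

```julia
using LinearAlgebra

# Iterate P ← A P A' + Q with no measurement update and record the
# standard deviation of h' x at each forecast horizon.
function forecast_sd(A, Q, P0, h, steps)
    P = copy(P0)
    sds = Float64[]
    for _ in 1:steps
        P = A * P * A' + Q
        push!(sds, sqrt(dot(h, P * h)))
    end
    return sds
end

# Hypothetical two-factor example (d = 2), starting from a tight smoothed covariance.
A_toy  = [0.9 0.0; 0.0 0.8]
Q_toy  = [0.10 0.0; 0.0 0.05]
P0_toy = [0.02 0.0; 0.0 0.02]
h_toy  = [1.0, 0.5]

sds = forecast_sd(A_toy, Q_toy, P0_toy, h_toy, 5)
half_widths = 1.96 .* sds    # half-width of a 95% band at each horizon
```

With a stable $A$ the band grows quickly at first and then levels off towards its steady-state width rather than expanding without bound.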

19.5 What the comparison teaches

Comparison	What it isolates
DiD vs SC	Unit weights — equal vs optimised donor match
SC vs SDiD	Time weights — equal vs recency-weighted pre-period
SC vs TASC	Model — weight fitting vs latent state-space
SC/SDiD vs TASC	Uncertainty — point estimate vs posterior interval

The right estimator depends on the setting. DiD is transparent and requires only parallel trends. SC earns tighter pre-treatment fit at the cost of assuming a convex donor combination. SDiD adds time-period discipline. TASC is appropriate when the outcome series has persistent dynamics and genuine post-treatment uncertainty quantification matters.

For this particular panel (19 pre-treatment years, stable trends, 12 post-treatment years), the estimates cluster between −16 and −20 once pre-treatment match is enforced (SC, SDiD, TASC). The outlier is DiD, whose equal weights are rejected by the data. The lesson is not that Prop 99 had a different effect depending on who you ask, but that equal weights are a poor description of California’s counterfactual.

19.6 Setup code

using CSV, DataFrames

df     = CSV.read("california_prop99.csv", DataFrame)
states = sort(unique(df.State))
years  = sort(unique(df.Year))
N, T   = length(states), length(years)

Y_wide      = zeros(N, T)
treated_vec = zeros(Bool, N)
for row in eachrow(df)
    i = findfirst(==(row.State), states)
    t = findfirst(==(row.Year), years)
    Y_wide[i, t] = row.PacksPerCapita
    treated_vec[i] |= (row.treated == 1)
end

ctrl_idx = findall(.!treated_vec)
trt_idx  = findall(treated_vec)
Y  = Y_wide[vcat(ctrl_idx, trt_idx), :]   # donors first
N0 = length(ctrl_idx)                      # 38
T0 = findfirst(==(1989), years) - 1        # 19

The SynthDiD.jl package provides did_estimate, synthdid_estimate, and sc_estimate. The TASC.jl package provides fit_tasc and predict_counterfactual. Both packages are available on GitHub.
