
Implements the doubly robust estimator of Xu (2023) for the direct average treatment effect on the treated (DATT) at a given exposure level g, in the two-period, common-adoption-timing setting.

Usage

did_int_2x2(
  data,
  yname,
  yname_pre,
  treat,
  exposure,
  g,
  covariates,
  coords = NULL,
  cutoff = NULL,
  dist_fn = c("spherical", "euclidean"),
  trim = NULL,
  alpha = 0.05
)

Arguments

data

A data frame.

yname

Character. Column name for the post-period outcome.

yname_pre

Character. Column name for the pre-period outcome.

treat

Character. Column name of the binary treatment indicator W_i (post period; 0/1).

exposure

Character. Column name of the exposure variable G_i (an integer or factor). Effects are computed at exposure level g.

g

The exposure level at which to estimate the DATT.

covariates

Character vector of column names for the attributes z_i used in all five working models.

coords

Optional 2-column matrix or data frame of unit coordinates (e.g. longitude, latitude). When supplied together with cutoff, the standard error is computed from the Conley spatial-HAC variance of the influence function.

cutoff

Distance cutoff for the Conley kernel, in the same units as coords (km if coords are lon/lat with dist_fn = "spherical").

dist_fn

Either "spherical" (great-circle, expects lon/lat) or "euclidean". Default "spherical".

trim

Optional propensity-score trimming threshold. If supplied, units with p_hat <= trim or p_hat >= 1 - trim are dropped before computing the DR estimate. Setting trim = 0.01 matches the threshold used by Xu (2026) in the Brazil application. Default NULL (no trimming).

alpha

Significance level for the CI; default 0.05.
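The trimming rule described for trim can be sketched in base R. This is a minimal illustration only: trim_mask() is a hypothetical helper, not a function exported by the package, and the propensity scores are made-up values rather than fitted ones.

```r
# Hypothetical helper (not part of the package): TRUE for units kept after trimming.
trim_mask <- function(p_hat, trim = NULL) {
  # trim = NULL mirrors the documented default: no unit is dropped.
  if (is.null(trim)) return(rep(TRUE, length(p_hat)))
  # Units with p_hat <= trim or p_hat >= 1 - trim are dropped.
  p_hat > trim & p_hat < 1 - trim
}

# Made-up propensity scores, for illustration only.
p_hat <- c(0.004, 0.02, 0.25, 0.60, 0.97, 0.995)
keep  <- trim_mask(p_hat, trim = 0.01)
p_hat[keep]
```

With trim = 0.01, only the first and last units fall outside the open interval (0.01, 0.99) and are removed.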

Value

A list of class "didint_2x2" with:

estimate

The DR estimate of DATT at exposure g.

se

Standard error (iid by default; Conley spatial-HAC if both coords and cutoff are supplied).

ci

Two-element numeric vector with lower and upper CI bounds.

n_treated

Number of treated units used.

n_control

Number of control units used.

n_total

Total units satisfying the inclusion mask.

call

The matched call.

Details

Under a correctly specified exposure mapping, conditional parallel trends at the chosen exposure level, overlap, and no anticipation, did_int_2x2() returns a consistent estimate of $$\tau_g = E[ y_{i,1}(1, g) - y_{i,1}(0, g) | z_i, W_i = 1, G_i = g ].$$

Three propensity models and two outcome-change models are fit:

  • p(z) = P(W=1 | z) — cohort propensity

  • pi_1g(z) = P(G=g | z, W=1) — exposure propensity among treated

  • pi_0g(z) = P(G=g | z, W=0) — exposure propensity among controls

  • m_1g(z) = E[dY | z, W=1, G=g] — outcome change for treated

  • m_0g(z) = E[dY | z, W=0, G=g] — outcome change for controls

The estimator is doubly robust: it is consistent if either all three propensity models or both outcome-change models are correctly specified.
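The five working models listed above can be sketched with base R. The model classes here are assumptions made for illustration (logistic regression for the propensities, linear regression for the outcome changes); the page does not state which learners did_int_2x2() uses internally. The data are a toy simulation, and the final contrast is a plain outcome-regression quantity, not the doubly robust estimator.

```r
set.seed(1)
n  <- 500
z  <- rnorm(n)
W  <- rbinom(n, 1, plogis(-0.3 + 0.5 * z))        # treatment indicator
G  <- rbinom(n, 1, plogis(0.2 * z))               # toy binary exposure
dY <- 0.2 * z + 1.5 * W + 0.5 * G * W + rnorm(n)  # outcome change (post minus pre)
d  <- data.frame(z, W, G, dY)

g    <- 1
d$Gg <- as.integer(d$G == g)                      # indicator for exposure level g

# Three propensity models (assumed logistic).
p_fit   <- glm(W  ~ z, family = binomial(), data = d)
pi1_fit <- glm(Gg ~ z, family = binomial(), data = d, subset = W == 1)
pi0_fit <- glm(Gg ~ z, family = binomial(), data = d, subset = W == 0)

# Two outcome-change models (assumed linear), each fit at exposure level g.
m1_fit <- lm(dY ~ z, data = d, subset = W == 1 & Gg == 1)
m0_fit <- lm(dY ~ z, data = d, subset = W == 0 & Gg == 1)

# Fitted values for every unit.
fits <- data.frame(
  p_hat   = predict(p_fit,   newdata = d, type = "response"),
  pi1_hat = predict(pi1_fit, newdata = d, type = "response"),
  pi0_hat = predict(pi0_fit, newdata = d, type = "response"),
  m1_hat  = predict(m1_fit,  newdata = d),
  m0_hat  = predict(m0_fit,  newdata = d)
)

# Outcome-regression contrast among treated units at exposure g
# (illustrative only; the true direct effect in this toy design is 1.5 + 0.5 = 2).
or_contrast <- with(fits[d$W == 1 & d$Gg == 1, ], mean(m1_hat - m0_hat))
or_contrast
```

A doubly robust estimator combines both sets of fitted values, so that misspecifying one set can be offset by the other.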

Standard errors come from the empirical influence function. For spatial inference, pass coords and cutoff; SEs are then computed via the Conley spatial-HAC variance on the influence-function vector (requires the conleyreg package).
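The two standard-error options can be illustrated on a generic influence-function vector. This is a sketch under stated assumptions, not the package's code: psi is a placeholder vector of simulated values, the estimator's variance is assumed to be that of the sample mean of psi, and the kernel is assumed to be a Bartlett kernel in Euclidean distance. The conleyreg package, which the function relies on, may use a different kernel or scaling.

```r
set.seed(2)
n   <- 200
xy  <- cbind(x = runif(n, 0, 10), y = runif(n, 0, 10))  # planar coordinates
psi <- rnorm(n)           # placeholder influence-function values
psi <- psi - mean(psi)    # centre at zero

# iid standard error of the mean of psi.
se_iid <- sqrt(mean(psi^2) / n)

# Conley-style standard error: cross-products psi_i * psi_j are weighted by
# a kernel that decays linearly to zero at the distance cutoff.
cutoff <- 1
D <- as.matrix(dist(xy))         # pairwise Euclidean distances
K <- pmax(1 - D / cutoff, 0)     # assumed Bartlett weights; diagonal equals 1
se_conley <- sqrt(drop(crossprod(psi, K %*% psi))) / n

# Wald confidence interval at level 1 - alpha around a placeholder estimate.
alpha <- 0.05
est   <- 2
ci    <- est + c(-1, 1) * qnorm(1 - alpha / 2) * se_conley
c(se_iid = se_iid, se_conley = se_conley)
```

If K is replaced by the identity matrix, se_conley reduces to se_iid, which shows how the spatial correction enters only through the off-diagonal weights.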

References

Xu, Ruonan (2023). "Difference-in-Differences with Interference." arXiv:2306.12003.

Xu, Ruonan (2026). "Dynamic Difference-in-Differences with Interference." AEA Papers and Proceedings 116: 58–63.

Examples

# Simulate a small 2-period panel with binary direct + spillover effects.
set.seed(1)
N <- 600
lon <- runif(N, 0, 10); lat <- runif(N, 0, 10)
z   <- 0.3 * lon + 0.2 * lat + rnorm(N)
W <- rbinom(N, 1, plogis(-0.5 + 0.6 * z))
dij <- as.matrix(dist(cbind(lon, lat)))
A <- (dij < 1.5) & (dij > 0)
share <- (A %*% W) / pmax(rowSums(A), 1)
G <- as.integer(share > median(share))
Y_pre  <- 0.8 * z + rnorm(N)
Y_post <- Y_pre + 0.2 * z + 1.5 * W + 0.5 * G * W + rnorm(N)
df <- data.frame(W = W, G = G, z = z, Y_pre = Y_pre, Y_post = Y_post,
                 lon = lon, lat = lat)

# Doubly robust direct ATT at high exposure (g = 1); truth is 2.0
res <- did_int_2x2(df, yname = "Y_post", yname_pre = "Y_pre",
                   treat = "W", exposure = "G", g = 1,
                   covariates = "z", trim = 0.01)
print(res)
#> Doubly robust DATT (Xu 2023, 2x2 case)
#>   Exposure level g = 1
#>   N total = 600 (treated 407, control 193), of which 300 at exposure g
#>   DATT     = 1.8917
#>   SE       = 0.1653
#>   95% CI  = [1.5678, 2.2156]