10  Continuous and Multivalued Treatments

library(tidyverse)
library(WeightIt)
library(causaldrf)
library(lmtp)
library(SuperLearner)

The previous chapters treated \(D\) as binary — a single yes/no decision. Real treatments are often continuous (dose of a medication, dollars spent, hours of exposure) or multivalued (one of several treatment arms). For these, we cannot speak of “the effect of treatment” as a single scalar; instead we estimate the entire dose-response curve

\[ \mu(d) = \mathbb{E}[Y(d)], \quad d \in \mathcal{D}. \]

This chapter covers three approaches:

  1. Generalized propensity score (GPS; Hirano-Imbens 2004) — extends propensity score methods to continuous treatments via the density-of-treatment-given-covariates.
  2. Doubly-robust dose-response estimation (Kennedy 2017) — combines GPS with an outcome model for efficiency and robustness.
  3. Longitudinal modified treatment policies (LMTP; Diaz-Hejazi 2020) — a unified framework for continuous, multivalued, and time-varying treatments, with native handling of positivity violations.

Related reading: The LMTP material below is condensed from the longer treatment in Longitudinal modified treatment policy (LMTP) in Topics on Econometrics and Causal Inference, which follows Nicholas Williams’s tutorials at beyondtheate.com.

10.1 The continuous-treatment setup

A simulation makes the conceptual issue clear. Suppose treatment is a dose \(D \in [0, 10]\) assigned via a generalized propensity score that depends on covariates:

set.seed(42)
n  <- 2000
X1 <- rnorm(n)
X2 <- rnorm(n)
# Continuous treatment, depends on X1, X2
D  <- 2 + 0.5 * X1 + 0.5 * X2 + rnorm(n, sd = 1.0)
# Dose-response: nonlinear in D
true_mu <- function(d) 1.0 + 2.0 * d - 0.15 * d^2
Y  <- true_mu(D) + 0.3 * X1 - 0.3 * X2 + rnorm(n)

df <- tibble(Y = Y, D = D, X1 = X1, X2 = X2)
cat(sprintf("Treatment range: [%.2f, %.2f]\n", min(D), max(D)))
Treatment range: [-1.82, 6.39]
cat(sprintf("True μ(d=2) = %.2f, μ(d=4) = %.2f, μ(d=6) = %.2f\n",
            true_mu(2), true_mu(4), true_mu(6)))
True μ(d=2) = 4.40, μ(d=4) = 6.60, μ(d=6) = 7.60

The dose-response curve \(\mu(d) = 1 + 2d - 0.15 d^2\) is hump-shaped: it increases at low doses and decreases at high doses. The challenge is to recover this shape from confounded observational data.

10.2 Naive regression vs adjusted regression

A linear regression of \(Y\) on \(D\) alone is doubly biased: confounding through \(X\) plus misspecification of the nonlinearity:

# Linear fit, ignoring covariates
linear_naive <- lm(Y ~ D, data = df)
# Linear fit with covariate adjustment
linear_adj   <- lm(Y ~ D + X1 + X2, data = df)
# Nonlinear fit with covariate adjustment
poly_adj     <- lm(Y ~ poly(D, 2) + X1 + X2, data = df)

ds <- seq(0, 10, by = 0.5)

predict_mu <- function(fit, ds) {
  data.frame(d = ds, X1 = 0, X2 = 0) |>
    rename(D = d) -> nd
  predict(fit, newdata = nd)
}

# True dose-response
df_true <- tibble(d = ds, mu = true_mu(ds))

df_fits <- bind_rows(
  tibble(d = ds, mu = predict_mu(linear_naive, ds), method = "Linear (naive)"),
  tibble(d = ds, mu = predict_mu(linear_adj,   ds), method = "Linear (adj X)"),
  tibble(d = ds, mu = predict_mu(poly_adj,     ds), method = "Quadratic (adj X)")
)
ggplot() +
  geom_line(data = df_fits, aes(d, mu, colour = method), linewidth = 1.0) +
  geom_line(data = df_true, aes(d, mu), colour = "black",
            linetype = "dashed", linewidth = 1.0) +
  labs(x = "Dose D", y = "Expected outcome E[Y(d)]",
       colour = "Method",
       caption = "Black dashed line = true μ(d) = 1 + 2d - 0.15 d²") +
  theme_minimal()

Fitted vs true dose-response. Linear fits miss the curvature; the quadratic-adjusted fit comes closer.

Even the quadratic adjustment leaves bias if the true dose-response is not quadratic or if the covariate adjustment is misspecified.

10.3 Generalized propensity score (GPS)

The generalized propensity score is the conditional density of the treatment given covariates:

\[ r(d, x) = f_{D \mid X}(d \mid x). \]

Under unconfoundedness, conditioning on \(r(D, X)\) alone is sufficient to identify the dose-response curve (Hirano-Imbens 2004). In practice the GPS is estimated assuming, say, \(D \mid X \sim \mathcal{N}(\beta_0 + \beta'X, \sigma^2)\) or by quantile binning.

# Fit GPS model: D ~ X with Normal residuals
gps_model <- lm(D ~ X1 + X2, data = df)
d_pred    <- predict(gps_model)
sigma_e   <- sigma(gps_model)

# GPS for each unit's observed treatment
gps <- dnorm(df$D, mean = d_pred, sd = sigma_e)
df$gps <- gps

# IPW-style weights for continuous treatment: 1 / GPS
# Stabilise by multiplying by the marginal density f_D(D)
marginal_D <- density(df$D)
fD <- approx(marginal_D$x, marginal_D$y, xout = df$D)$y
df$sw_cont <- pmin(fD / gps, quantile(fD / gps, 0.99))

cat(sprintf("Weight summary: min=%.3f, mean=%.3f, max=%.3f\n",
            min(df$sw_cont), mean(df$sw_cont), max(df$sw_cont)))
Weight summary: min=0.039, mean=0.947, max=4.390

The Hirano-Imbens estimator regresses \(Y\) on a flexible function of \((D, \hat r(D, X))\) and then averages across the GPS distribution at each fixed \(D\):

# Stage 2: regress Y on flexible function of (D, GPS)
hi_fit <- lm(Y ~ poly(D, 2) + poly(gps, 2) + I(D * gps), data = df)

# Dose-response at each d: average over the empirical distribution of GPS
# at that d.
estimate_dose <- function(d) {
  # Predict GPS at this d for each unit
  gps_at_d <- dnorm(d, mean = d_pred, sd = sigma_e)
  nd <- data.frame(D = d, gps = gps_at_d)
  mean(predict(hi_fit, newdata = nd))
}

ds <- seq(0, 10, by = 0.5)
hi_dose <- sapply(ds, estimate_dose)

df_hi <- tibble(d = ds, mu = hi_dose, method = "Hirano-Imbens GPS")

ggplot() +
  geom_line(data = df_hi, aes(d, mu, colour = method), linewidth = 1.2) +
  geom_line(data = df_true, aes(d, mu), colour = "black",
            linetype = "dashed", linewidth = 1.0) +
  labs(x = "Dose D", y = "E[Y(d)]",
       caption = "Dashed black = truth; coloured = Hirano-Imbens GPS estimator") +
  theme_minimal()

The Hirano-Imbens estimator recovers the curvature missing from naive regression because it does not impose a parametric form on the dose-response curve.

10.4 Doubly-robust dose-response (Kennedy 2017)

Kennedy (2017) extended AIPW to continuous treatments. The estimator combines:

  • An outcome model \(\hat\mu(d, x) = \mathbb{E}[Y \mid D = d, X = x]\)
  • A generalised propensity score \(\hat r(d \mid x)\)

Then the dose-response at \(d\) is estimated by

\[ \hat\mu_{DR}(d) = \frac{1}{n} \sum_i \left[ \hat\mu(d, X_i) + K_h(D_i - d) \cdot \frac{Y_i - \hat\mu(D_i, X_i)}{\hat r(D_i \mid X_i)} \right], \]

where \(K_h\) is a kernel that smooths over a bandwidth \(h\) near \(d\). The result is doubly robust and asymptotically normal.

The implementation is a few lines using a kernel for smoothing near the target dose \(d\) and the GPS already estimated above:

# Outcome model μ̂(D, X): flexible polynomial
mu_fit <- lm(Y ~ D + I(D^2) + X1 + X2 + I(D * X1) + I(D * X2), data = df)
df$mu_at_obs <- predict(mu_fit)

# DR estimator at dose d with bandwidth h
dr_estimate <- function(d, h = 0.5) {
  # Counterfactual prediction at D = d for each unit
  mu_at_d <- predict(mu_fit, newdata = transform(df, D = d))
  # Kernel weight on observed treatment near d
  kernel <- dnorm(df$D - d, sd = h)
  weight <- kernel / df$gps
  weight <- weight / mean(weight)  # normalise
  # DR correction
  correction <- weight * (df$Y - df$mu_at_obs)
  mean(mu_at_d) + mean(correction)
}

ds <- seq(0, 10, by = 0.5)
dr_dose <- sapply(ds, dr_estimate)

df_drf <- tibble(d = ds, mu = dr_dose, method = "DR (Kennedy 2017)")

ggplot() +
  geom_line(data = df_hi,  aes(d, mu, colour = method), linewidth = 1.0) +
  geom_line(data = df_drf, aes(d, mu, colour = method), linewidth = 1.0) +
  geom_line(data = df_true, aes(d, mu), colour = "black",
            linetype = "dashed", linewidth = 1.0) +
  labs(x = "Dose D", y = "E[Y(d)]",
       caption = "Black dashed = truth") +
  theme_minimal()

The DR estimator is consistent if either the outcome model \(\hat\mu(d, x)\) or the GPS \(\hat r(d \mid x)\) is correctly specified, and efficient when both are. The kernel bandwidth \(h\) controls the bias-variance trade-off near each evaluation dose.

The causaldrf package provides several related estimators (hi_est, iptw_est, aipwee_est, gam_est, nw_est, bart_est) with pre-packaged inference and standard errors — useful for production work once the manual version above clarifies the underlying logic.

10.5 Longitudinal Modified Treatment Policies (LMTP)

For continuous, multivalued, and time-varying treatments — and especially when positivity is a concern (e.g. the dose-response at \(d = 10\) might have zero observations to learn from) — the LMTP framework (Diaz-Hejazi 2020) is the modern workhorse.

LMTP replaces hypothetical interventions like “everyone gets \(D = 4\)” with modified treatment policies like “shift everyone’s \(D\) down by 1 unit.” The shifted-treatment distribution stays close to the observed data, preserving positivity.

10.5.1 A two-period example

set.seed(1)
N    <- 1500
L_1  <- runif(N, 0, 1)
A_1  <- rbinom(N, 1, 0.5)
L_2  <- rnorm(N, 0.25 * L_1, 0.5)
A_2  <- rbinom(N, 1, plogis(0.5 - 0.2 * A_1 + 0.1 * L_2))
Y    <- rnorm(N, A_2 + L_2, 0.3)

lmtp_data <- data.frame(L_1, A_1, L_2, A_2, Y)
head(lmtp_data)
        L_1 A_1        L_2 A_2           Y
1 0.2655087   0  0.4913989   0 -0.42997401
2 0.3721239   1 -0.3696255   0 -0.76396535
3 0.5728534   0  0.5900039   0  0.79312079
4 0.9082078   1 -0.2434529   1  1.23628221
5 0.2016819   0  0.3198965   0  0.02620015
6 0.8983897   0  0.1336102   1  1.00441306

A modified treatment policy might shift treatments downward. Define the shift function:

shift_zero <- function(data, trt) {
  # Set treatment to 0 everywhere — for comparison to "never treat"
  a <- data[[trt]]
  rep(0, length(a))
}

Then fit the LMTP estimator with lmtp::lmtp_tmle:

# TMLE-based LMTP estimator with super-learners for nuisance functions
# (using SL.glm to keep it fast; in practice use richer SL libraries)
lmtp_fit <- lmtp_tmle(
  data       = lmtp_data,
  trt        = c("A_1", "A_2"),
  outcome    = "Y",
  baseline   = NULL,
  time_vary  = list(c("L_1"), c("L_2")),
  shift      = shift_zero,
  outcome_type = "continuous",
  learners_outcome = c("SL.glm"),
  learners_trt     = c("SL.glm"),
  folds = 2
)

lmtp_fit

::: {.cell-output .cell-output-stdout}

      Estimate: 0.118
    Std. error: 0.025

:::

The lmtp package handles the bookkeeping for time-varying covariates and treatments transparently. Its output includes the TMLE point estimate, IPW estimate, and bootstrap-style confidence intervals.

10.5.2 Contrasts between policies

LMTP shines when comparing two policies (e.g. “set to 0” vs “set to 1”):

shift_one <- function(data, trt) {
  a <- data[[trt]]
  rep(1, length(a))
}

lmtp_one <- lmtp_tmle(
  data       = lmtp_data,
  trt        = c("A_1", "A_2"),
  outcome    = "Y",
  baseline   = NULL,
  time_vary  = list(c("L_1"), c("L_2")),
  shift      = shift_one,
  outcome_type = "continuous",
  learners_outcome = c("SL.glm"),
  learners_trt     = c("SL.glm"),
  folds = 2
)

contrast <- lmtp_contrast(lmtp_one, ref = lmtp_fit)
contrast

::: {.cell-output .cell-output-stdout}


  shift   ref estimate std.error conf.low conf.high p.value
1   1.1 0.118    0.985    0.0349    0.917      1.05  <0.001

:::

The contrast estimate corresponds to the ATE under “always treat” vs “never treat” — but computed within the LMTP framework, with valid inference and explicit positivity handling.

10.6 Choosing among the three

Method Strengths Weaknesses
GPS (Hirano-Imbens) Conceptually transparent; widely understood Parametric assumptions on GPS; sensitive to bandwidth
Doubly-robust (Kennedy 2017) Robust to misspecification; asymptotically efficient More complex; nuisance models need careful tuning
LMTP Handles continuous + multivalued + time-varying uniformly; respects positivity More setup; requires defining a shift function

In practice:

  • For simple cross-sectional continuous treatment with good overlap, use Hirano-Imbens GPS or the Kennedy DR estimator.
  • For time-varying or modified-policy questions, use LMTP.
  • For multivalued treatments, generalize the propensity score to a multinomial model and use the same DR machinery.

10.7 When to use these methods

These methods are needed whenever:

  1. The treatment variable takes on more than two values (continuous, ordinal, or categorical with > 2 categories).
  2. The research question concerns the shape of the dose-response curve, not just a single contrast.
  3. The hypothetical intervention is best expressed as a shift in the distribution of treatment rather than fixed assignment.

For strictly binary, single-time-period problems, the standard Estimation and Heterogeneous Effects methods are sufficient.

10.8 Summary

  • Continuous and multivalued treatments require estimating an entire dose-response curve \(\mu(d) = \mathbb{E}[Y(d)]\), not a single scalar.
  • Generalized propensity score (GPS) extends propensity-score methods to continuous \(D\); Hirano-Imbens (2004) is the canonical reference.
  • Doubly-robust dose-response (Kennedy 2017) combines GPS with an outcome model; causaldrf::bart_est provides a BART-based implementation with native uncertainty quantification.
  • LMTP (Diaz-Hejazi 2020, lmtp package) is the modern framework for continuous, multivalued, and time-varying treatments under modified policy interventions; respects positivity by construction.
  • See the companion blog chapter on LMTP for an extended treatment with worked time-varying examples.