library(dagitty)
library(igraph)
library(causaleffect)
library(ggplot2)21 From Graph to Estimate
The two previous chapters covered discovery — algorithms that learn a graph from data. Discovery returns a graph, but a graph is not yet an estimate. To go from the graph to an actionable number, we need two more steps:
- Identification — given the graph and a target effect, is the effect a function of the observed-data distribution? If yes, derive the identifying functional. If no, declare non-identification (or place bounds).
- Estimation — fit the identifying functional to data, ideally with a doubly-robust or efficient estimator that yields a confidence interval.
In R, dagitty handles identification via the backdoor and front-door criteria; causaleffect implements the full Pearl–Shpitser ID algorithm for acyclic directed mixed graphs (ADMGs) with hidden confounders. Estimation is done directly from the identifying formula — a step that is explicit in R, making the connection between the formula and the estimator transparent.
21.1 Backdoor: standard adjustment
The simplest case. Treatment \(A\) has measured parents, and there is no unmeasured common cause between \(A\) and \(Y\) along any backdoor path. Standard backdoor adjustment applies.
# X is a common cause of A and Y
g_bd <- dagitty("dag { X -> A; X -> Y; A -> Y }")
exposures(g_bd) <- "A"
outcomes(g_bd) <- "Y"
sets <- adjustmentSets(g_bd, exposure = "A", outcome = "Y")
cat("Adjustment sets:\n")Adjustment sets:
print(sets){ X }
dagitty finds {X} as the minimal adjustment set. Adjusting on \(X\) blocks the only open backdoor path \(A \leftarrow X \rightarrow Y\).
set.seed(1)
n <- 2000
X <- rnorm(n)
A <- as.numeric(runif(n) < 1 / (1 + exp(-X))) # logistic: A depends on X
Y <- 2 * A + X + 0.5 * rnorm(n) # true ACE = 2
data_bd <- data.frame(X = X, A = A, Y = Y)
# AIPW (doubly-robust) estimator for the backdoor-adjusted ACE
# Outcome model: E[Y | A, X]
mu1 <- predict(lm(Y ~ A + X, data = data_bd), newdata = transform(data_bd, A = 1))
mu0 <- predict(lm(Y ~ A + X, data = data_bd), newdata = transform(data_bd, A = 0))
# Propensity score: P(A = 1 | X)
ps <- predict(glm(A ~ X, data = data_bd, family = binomial), type = "response")
# AIPW influence function
aipw <- mean(
(data_bd$A / ps) * (Y - mu1) - ((1 - data_bd$A) / (1 - ps)) * (Y - mu0) +
mu1 - mu0
)
cat(sprintf("Backdoor AIPW ACE (true = 2.0): %.3f\n", aipw))Backdoor AIPW ACE (true = 2.0): 1.990
# Bootstrap 95% CI
set.seed(99)
boot_bd <- replicate(500, {
idx <- sample.int(n, replace = TRUE)
bd <- data_bd[idx, ]
m1 <- predict(lm(Y ~ A + X, data = bd), newdata = transform(bd, A = 1))
m0 <- predict(lm(Y ~ A + X, data = bd), newdata = transform(bd, A = 0))
p <- predict(glm(A ~ X, data = bd, family = binomial), type = "response")
mean((bd$A / p) * (bd$Y - m1) - ((1 - bd$A) / (1 - p)) * (bd$Y - m0) + m1 - m0)
})
ci_bd <- quantile(boot_bd, c(0.025, 0.975))
cat(sprintf("95%% CI: [%.3f, %.3f]\n", ci_bd[1], ci_bd[2]))95% CI: [1.942, 2.043]
The AIPW estimator is doubly robust: it is consistent if either the outcome model or the propensity score is correctly specified, and efficient when both are.
21.2 Front-door: p-fixable effects
Now suppose there is an unmeasured common cause \(U\) of \(A\) and \(Y\), but a measured mediator \(M\) lies on the causal path \(A \rightarrow M \rightarrow Y\), with no arrow from \(U\) to \(M\) except through \(A\). Standard backdoor adjustment fails because \(U\) is unobserved, but the front-door criterion identifies the effect.
In dagitty, the bidirected edge \(A \leftrightarrow Y\) (representing the latent \(U\)) appears as A <-> Y. For causaleffect, the same edge is encoded as two directed edges A -> Y and Y -> A both marked with description = "U" in the igraph object.
# dagitty for display and adjustment-set check
g_fd <- dagitty("dag { A -> M; M -> Y; A <-> Y }")
exposures(g_fd) <- "A"
outcomes(g_fd) <- "Y"
cat("Backdoor adjustment sets (should be empty):\n")Backdoor adjustment sets (should be empty):
print(adjustmentSets(g_fd, exposure = "A", outcome = "Y"))
# causaleffect for ID algorithm: bidirected A <-> Y = two directed edges + description "U"
g_fd_ig <- graph_from_literal(A -+ M, M -+ Y, A -+ Y, Y -+ A)
# igraph sorts edges by source vertex: A-edges come before M-edges before Y-edges.
# Resulting order: A->M(1), A->Y(2), M->Y(3), Y->A(4). Bidirected pair = edges 2 & 4.
g_fd_ig <- set_edge_attr(g_fd_ig, "description", index = c(2, 4), value = "U")
expr_fd <- causal.effect(y = "Y", x = "A", G = g_fd_ig, simp = TRUE)
cat("\nIdentifying expression:\n")
Identifying expression:
cat(expr_fd, "\n")\sum_{M}P(M|A)\left(\sum_{A}P(Y|A,M)P(A)\right)
The expression returned is the front-door formula: \(\sum_M P(M \mid A) \sum_{A'} P(Y \mid A', M)\, P(A')\). We now estimate it directly.
set.seed(2)
n <- 3000
U <- rnorm(n) # unmeasured confounder
A_fd <- as.numeric(runif(n) < 1 / (1 + exp(-U))) # A depends on U
M_fd <- as.numeric(runif(n) < 1 / (1 + exp(-(0.5 + 2 * A_fd)))) # M depends on A only
Y_fd <- 1.5 * M_fd + 2 * U + 0.5 * rnorm(n) # Y depends on M and U
# True ACE: E[Y | do(A=1)] - E[Y | do(A=0)]
# = 1.5 * (E[M | A=1] - E[M | A=0])
true_EMA1 <- mean(1 / (1 + exp(-(0.5 + 2))))
true_EMA0 <- mean(1 / (1 + exp(-0.5)))
true_ace_fd <- 1.5 * (true_EMA1 - true_EMA0)
cat(sprintf("True front-door ACE = %.3f\n", true_ace_fd))True front-door ACE = 0.453
data_fd <- data.frame(A = A_fd, M = M_fd, Y = Y_fd)Estimate the front-door functional using the plug-in formula:
front_door_ace <- function(data) {
pA <- mean(data$A)
pM_given_A <- prop.table(table(data$A, data$M), margin = 1)
pY_given_AM <- with(data, tapply(Y, list(A, M), mean))
ey_do <- function(a) {
sum(sapply(c(0, 1), function(m) {
p_m_given_a <- pM_given_A[as.character(a), as.character(m)]
inner <- sum(sapply(c(0, 1), function(ap)
pY_given_AM[as.character(ap), as.character(m)] *
ifelse(ap == 1, pA, 1 - pA)))
p_m_given_a * inner
}))
}
ey_do(1) - ey_do(0)
}
est_fd <- front_door_ace(data_fd)
set.seed(77)
boot_fd <- replicate(500, {
idx <- sample.int(nrow(data_fd), replace = TRUE)
front_door_ace(data_fd[idx, ])
})
ci_fd <- quantile(boot_fd, c(0.025, 0.975))
cat(sprintf("Front-door ACE: %.3f [95%% CI: %.3f, %.3f]\n",
est_fd, ci_fd[1], ci_fd[2]))Front-door ACE: 0.499 [95% CI: 0.432, 0.577]
# Naïve mean difference for comparison
naive_ace <- mean(Y_fd[A_fd == 1]) - mean(Y_fd[A_fd == 0])
cat(sprintf("Naïve mean difference (biased!): %.3f\n", naive_ace))Naïve mean difference (biased!): 2.083
cat(sprintf("True ACE: %.3f\n", true_ace_fd))True ACE: 0.453
Despite \(U\) being unobserved, the front-door formula recovers the causal effect. The naïve difference is contaminated by the \(U \rightarrow A\) and \(U \rightarrow Y\) paths that the front-door route closes.
21.3 When identification fails
Not every ADMG yields an identified effect. Suppose \(A\) and \(Y\) have a hidden common cause and there is no front-door mediator. The effect is not identified.
# A → Y with unmeasured confounding (A <-> Y) and no mediator
g_unident_ig <- graph_from_literal(A -+ Y, Y -+ A)
g_unident_ig <- set_edge_attr(g_unident_ig, "description",
index = 1:2, value = "U")
result <- tryCatch(
causal.effect(y = "Y", x = "A", G = g_unident_ig, simp = TRUE),
error = function(e) conditionMessage(e)
)
cat("Result from causal.effect:\n", result, "\n")Result from causal.effect:
P(Y)
causaleffect returns an error or FALSE, indicating the effect is not identifiable. The honest response in applied work is to report the non-identifiability, then either:
- Find additional variables that close the unmeasured-confounding paths
- Apply sensitivity analysis to bound the effect under maintained assumptions
- Find an instrumental variable for a LATE-style identification (IV chapter)
A hidden assumption that everyone tacitly makes — “of course the confounding is small” — is not identification. Either the graph supports the effect or it does not.
21.4 End-to-end: discover, identify, estimate
The previous chapters’ discovery algorithms (PC, GES, FCI, RFCI) return graphs that can be passed directly to the identification step. The full workflow is:
library(pcalg)
# Step 1: discover the graph (chapters on causal discovery)
suffStat <- list(C = cor(data_obs), n = nrow(data_obs))
pc_fit <- pc(suffStat, indepTest = gaussCItest,
alpha = 0.05, p = ncol(data_obs))
# Step 2: check for backdoor adjustment set using dagitty
# (convert pcalg CPDAG → dagitty for convenience)
g_dag <- pcalg2dagitty(as(pc_fit@graph, "matrix"),
colnames(data_obs), type = "cpdag")
sets <- adjustmentSets(g_dag, exposure = "A", outcome = "Y")
# Step 3: if backdoor set found, estimate with AIPW
# Step 4: if not, call causaleffect::causal.effect() for the ID expression
# Step 5: estimate the returned functional by plug-in or regressionIn practice, the discovered graph is almost never trusted blindly. The applied-research workflow is:
- Hypothesise a graph based on subject-matter knowledge
- Discover a data-driven graph using PC/GES/FCI
- Compare the two — disagreement points to substantive questions
- Identify under both graphs separately
- Estimate under both, and report the range as part of the result
If the estimate is robust to the choice of graph, the conclusion is strong. If it flips sign depending on which graph you use, the data alone is not enough to answer the question.
21.5 Why this matters
The “graph → identification → estimation” pipeline closes the loop between causal-discovery papers and applied empirical work. Practically:
- Backdoor adjustment is not always available. Many real DAGs have hidden confounders. Front-door, nested, and general ID algorithms are the recourse, but they are too complex to derive by hand for non-trivial graphs.
- The right estimator depends on the identification strategy. AIPW-for-backdoor uses a different formula than the front-door plug-in. Matching the estimator to the identification result avoids the common mistake of plugging a graph into the wrong estimator.
- Reporting becomes more disciplined. Once the workflow forces you to specify a graph, it becomes harder to wave hands about “unconfoundedness” — you have to say exactly which variables are doing the unconfounding work.
21.6 Summary
- Discovery returns a graph; estimation returns a number. The bridge is the identification step.
dagittyimplements backdoor and front-door criteria;causaleffectimplements the full Pearl–Shpitser ID algorithm for ADMGs with bidirected edges (hidden confounders).- Backdoor case:
adjustmentSets()+ AIPW estimation. - Front-door case:
causal.effect()returns the functional; estimate it by direct plug-in or regression. - Non-identified case:
causal.effect()errors — report non-identifiability, not a biased point estimate. - End-to-end workflow: hypothesise a graph, run discovery as a check, identify, estimate, and report the range across plausible graphs.