Causal Econometrics with Julia

Author

Xiang Ao

Published

July 4, 2026

Preface

Most applied questions are causal. Did a program raise earnings? Did a drug reduce mortality? Did a policy change behavior? With experimental data, a difference in means can sometimes answer the question. With observational data, it almost never can. Causal econometrics is the business of saying what would have to be true for a number we can compute to mean the causal object we want.
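One way to see why, sketched in standard potential-outcomes notation that this preface does not itself introduce (the symbols are assumptions for illustration: a binary treatment indicator $D$, potential outcomes $Y(1)$ and $Y(0)$, and observed outcome $Y = D\,Y(1) + (1-D)\,Y(0)$), is the usual decomposition of the difference in means:

```latex
\begin{aligned}
E[Y \mid D=1] - E[Y \mid D=0]
  &= E[Y(1) \mid D=1] - E[Y(0) \mid D=0] \\
  &= \underbrace{E[Y(1) - Y(0) \mid D=1]}_{\text{average effect on the treated}}
   + \underbrace{E[Y(0) \mid D=1] - E[Y(0) \mid D=0]}_{\text{selection bias}}
\end{aligned}
```

The left-hand side is the number we can compute. The first term on the right is a causal object. The second term is zero when treatment is randomized, because $D$ is then independent of $Y(0)$; with observational data nothing guarantees that, which is why an identifying assumption is needed.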

This book is a working guide in Julia. It is for applied researchers who already know regression and probability, and who want to see the methods carried out end to end. Each chapter gives the identifying assumptions and then runs code on a real or simulated dataset. Where it matters, the influence functions, score equations, and weighting schemes are written out so the link between the formula and the Julia call is visible.

Julia is not the usual choice for this material. R remains the default language of applied causal inference, and a companion R volume — Introduction to Causal Econometrics with Observational Data — covers the same ground. A messier working notebook, Topics on Econometrics and Causal Inference, holds the rougher posts that fed into both books. The Julia version exists because some of the heavier estimators — TMLE on large samples, distributional difference-in-differences, fully Bayesian g-computation — are uncomfortably slow in R, and because Julia’s type system makes it possible to write small, focused estimation packages whose code is easy to read. Several such packages were written alongside this book and are used throughout: CausalEstimate.jl for unified TMLE and AIPW, CausalGraphs.jl for graph-based identification, Lavaan.jl for structural equation modeling, Crumble.jl for causal mediation, and a handful of smaller libraries for difference-in-differences, synthetic control, shift-share IV, and regression discontinuity. They are not requirements for following the text, but readers who want to see how an estimator is actually built will find the source short enough to read.

The book is organized in the order in which an applied project usually runs. Part I is identification. Part II is estimation. Part III is designs: difference-in-differences, synthetic control, instrumental variables, regression discontinuity, and shift-share IV. The remaining parts cover longitudinal settings, survival outcomes, mediation, and causal discovery. The appendix lists the Julia packages used in the examples.

The book is meant to be read at a desk with Julia running. Source files, datasets, and a Project.toml that pins package versions are available in the book’s repository; the “Edit this page” link at the foot of each chapter goes directly to the corresponding .qmd file. Corrections and suggestions are welcome through the issues tracker.

A note on the code you see. Hidden setup chunks (package loading, helper definitions) execute when the book is rendered but are not shown. Some visible code blocks are marked eval: false: these are displayed for readability but not run, either because they only show which packages a chapter uses or because they require a tool or package outside the book’s default environment (these are flagged where they appear). A block marked eval: false therefore shows intended usage rather than executed output.
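As a rough illustration of what such a block looks like in a .qmd source file, the fragment below uses Quarto's cell-option syntax. It is a sketch, not a block copied from the book: it assumes the chapters use `{julia}` cells, and the package line is only an example of a block that shows which package a chapter uses.

````markdown
```{julia}
#| eval: false
# Displayed in the rendered chapter but not executed.
using CausalEstimate
```
````

Because the cell is not run, no output appears beneath it in the rendered chapter.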

How to cite

Ao, Xiang. Causal Econometrics with Julia. https://xiangao.github.io/causal_econometrics_julia/.

@book{ao_causal_econometrics_julia,
  author = {Ao, Xiang},
  title  = {Causal Econometrics with Julia},
  url    = {https://xiangao.github.io/causal_econometrics_julia/}
}