11 Treatment effects and matching

Published

January 10, 2019

11.1 Treatment effects in observational studies

Despite the popularity of randomized experiments in economics nowadays, most economic studies still rely on observational data. One reason is that experiments are expensive; another is that sometimes an experiment is simply not feasible. If we have observational data and would like to draw causal conclusions, there are a few different situations. The worst is an endogenous treatment, that is, a treatment affected by unobserved confounders. In that case we need instrumental variables, but instruments are hard to find and even harder to justify. The more favorable situation is when we can assume there are no unobserved confounders, that is, the treatment is conditionally independent of the potential outcomes given the other variables in the model. If that assumption holds, we can make causal inferences from observational data.
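To fix ideas, the assumptions and target quantities can be written in potential-outcomes notation. The symbols are introduced here only for illustration and are not part of the original text: Y(1) and Y(0) are a unit’s outcomes with and without treatment, T is the treatment indicator, and X collects the observed covariates.

```latex
\begin{aligned}
\text{ATE}  &= E\big[\,Y(1) - Y(0)\,\big], \\
\text{ATET} &= E\big[\,Y(1) - Y(0) \mid T = 1\,\big], \\
\text{conditional independence:}\quad & \big(Y(0),\, Y(1)\big) \perp T \mid X, \\
\text{overlap:}\quad & 0 < \Pr(T = 1 \mid X) < 1 .
\end{aligned}
```

Conditional independence is the formal version of “no unobserved confounders”; overlap says that every covariate profile has some chance of appearing in both groups.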

Stata has two sets of commands designed for treatment effects: eteffects and teffects. eteffects is for an endogenous treatment. In that case you need instruments to model the treatment in a first stage; eteffects then uses a control-function approach in the second stage to model the treatment’s effect on the outcome. In this blog, we try to understand how teffects works, that is, the case where we do not need instruments because we assume no unobserved confounders.

11.1.1 Regression adjustment (RA)

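The RA method allows the effects of all covariates to differ between the treatment and control groups. That is, it is equivalent to an outcome model in which the treatment is interacted with all other covariates: the model predicts both potential outcomes for every observation, and the treatment effect is the average difference between the two predictions.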
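The mechanics can be sketched outside Stata. The following is a minimal Python illustration under stated assumptions, not Stata’s implementation: it uses a single covariate and made-up data, fits a separate least-squares line in each treatment arm, predicts both potential outcomes for every unit, and averages the difference. The names ols_line and ra_effects are hypothetical.

```python
from statistics import mean

def ols_line(x, y):
    """Closed-form simple OLS; returns (intercept, slope)."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def ra_effects(y, t, x):
    """Regression adjustment with one covariate (toy version)."""
    fits = {}
    for arm in (0, 1):
        xs = [xi for xi, ti in zip(x, t) if ti == arm]
        ys = [yi for yi, ti in zip(y, t) if ti == arm]
        fits[arm] = ols_line(xs, ys)
    (a0, b0), (a1, b1) = fits[0], fits[1]
    # Predicted treated-minus-control difference for every unit
    diff = [(a1 + b1 * xi) - (a0 + b0 * xi) for xi in x]
    ate = mean(diff)                                      # average over everyone
    atet = mean([d for d, ti in zip(diff, t) if ti == 1])  # average over treated
    return ate, atet

# Made-up data: control outcome is 10 + 2x, treated outcome is 5 + 3x
x = [1, 2, 3, 4, 2, 3, 4, 5]
t = [0, 0, 0, 0, 1, 1, 1, 1]
y = [10 + 2 * xi if ti == 0 else 5 + 3 * xi for xi, ti in zip(x, t)]

ate, atet = ra_effects(y, t, x)
print(ate, atet)
```

With these toy numbers the effect varies with x and the treated units have larger x on average, so the ATE (-2) and the ATET (-1.5) differ.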

11.1.1.1 Example:

clear
webuse bweightex
teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke)
reg bweight i.mbsmoke##c.(prenatal1 mmarried mage fbaby)
margins r.mbsmoke


. clear

. webuse bweightex
(Hypothetical birthweight data)

. teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke)

Iteration 0:  EE criterion = 1.223e-24  
Iteration 1:  EE criterion = 1.792e-25  

Treatment-effects estimation                    Number of obs     =         60
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
------------------------------------------------------------------------------
             |               Robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATE          |
     mbsmoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -389.3099   37.00882   -10.52   0.000    -461.8458   -316.7739
-------------+----------------------------------------------------------------
POmean       |
     mbsmoke |
  Nonsmoker  |   3613.769   29.97438   120.56   0.000     3555.021    3672.518
------------------------------------------------------------------------------

. reg bweight i.mbsmoke##c.(prenatal1 mmarried mage fbaby)

      Source |       SS           df       MS      Number of obs   =        60
-------------+----------------------------------   F(9, 50)        =     18.11
       Model |  1376306.34         9  152922.927   Prob > F        =    0.0000
    Residual |  422241.592        50  8444.83184   R-squared       =    0.7652
-------------+----------------------------------   Adj R-squared   =    0.7230
       Total |  1798547.93        59  30483.8633   Root MSE        =    91.896

------------------------------------------------------------------------------
     bweight | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     mbsmoke |
     Smoker  |  -32.87702   193.2827    -0.17   0.866    -421.0967    355.3426
   prenatal1 |   -23.8431   38.31014    -0.62   0.537    -100.7913    53.10507
    mmarried |  -40.39753   50.21942    -0.80   0.425    -141.2662    60.47114
        mage |    27.9537   7.671423     3.64   0.001     12.54519    43.36221
       fbaby |  -2.030497   40.76029    -0.05   0.960    -83.89995    79.83896
             |
     mbsmoke#|
 c.prenatal1 |
     Smoker  |   8.436558   56.38855     0.15   0.882    -104.8232    121.6963
             |
     mbsmoke#|
  c.mmarried |
     Smoker  |   33.92362   61.34015     0.55   0.583     -89.2817    157.1289
             |
     mbsmoke#|
      c.mage |
     Smoker  |  -15.98608   9.006494    -1.77   0.082    -34.07616    2.103995
             |
     mbsmoke#|
     c.fbaby |
     Smoker  |   14.86782   57.58771     0.26   0.797    -100.8005    130.5362
             |
       _cons |   2976.807   151.0117    19.71   0.000     2673.491    3280.123
------------------------------------------------------------------------------

. margins r.mbsmoke

Contrasts of predictive margins                             Number of obs = 60
Model VCE: OLS

Expression: Linear prediction, predict()

------------------------------------------------
             |         df           F        P>F
-------------+----------------------------------
     mbsmoke |          1      116.06     0.0000
             |
 Denominator |         50
------------------------------------------------

------------------------------------------------------------------------
                       |            Delta-method
                       |   Contrast   std. err.     [95% conf. interval]
-----------------------+------------------------------------------------
               mbsmoke |
(Smoker vs Nonsmoker)  |  -389.3099   36.13675     -461.8927   -316.7271
------------------------------------------------------------------------

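We can see that teffects ra returns the same ATE (average treatment effect), -389.31, as the margins command after a regression in which the treatment is interacted with all other covariates. Only the standard errors differ slightly (37.01 versus 36.14), because teffects reports robust standard errors while margins here uses the OLS variance estimate.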

To estimate the ATET (or ATT, the average treatment effect on the treated):

clear
webuse bweightex
teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke), atet
reg bweight i.mbsmoke##c.(prenatal1 mmarried mage fbaby)
margins r.mbsmoke, subpop(mbsmoke)


. clear

. webuse bweightex
(Hypothetical birthweight data)

. teffects ra (bweight prenatal1 mmarried mage fbaby) (mbsmoke), atet

Iteration 0:  EE criterion = 1.159e-24  
Iteration 1:  EE criterion = 5.133e-26  

Treatment-effects estimation                    Number of obs     =         60
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
------------------------------------------------------------------------------
             |               Robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
     mbsmoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -437.4721   53.43513    -8.19   0.000     -542.203   -332.7412
-------------+----------------------------------------------------------------
POmean       |
     mbsmoke |
  Nonsmoker  |   3693.639   53.80537    68.65   0.000     3588.182    3799.095
------------------------------------------------------------------------------

. reg bweight i.mbsmoke##c.(prenatal1 mmarried mage fbaby)

      Source |       SS           df       MS      Number of obs   =        60
-------------+----------------------------------   F(9, 50)        =     18.11
       Model |  1376306.34         9  152922.927   Prob > F        =    0.0000
    Residual |  422241.592        50  8444.83184   R-squared       =    0.7652
-------------+----------------------------------   Adj R-squared   =    0.7230
       Total |  1798547.93        59  30483.8633   Root MSE        =    91.896

------------------------------------------------------------------------------
     bweight | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     mbsmoke |
     Smoker  |  -32.87702   193.2827    -0.17   0.866    -421.0967    355.3426
   prenatal1 |   -23.8431   38.31014    -0.62   0.537    -100.7913    53.10507
    mmarried |  -40.39753   50.21942    -0.80   0.425    -141.2662    60.47114
        mage |    27.9537   7.671423     3.64   0.001     12.54519    43.36221
       fbaby |  -2.030497   40.76029    -0.05   0.960    -83.89995    79.83896
             |
     mbsmoke#|
 c.prenatal1 |
     Smoker  |   8.436558   56.38855     0.15   0.882    -104.8232    121.6963
             |
     mbsmoke#|
  c.mmarried |
     Smoker  |   33.92362   61.34015     0.55   0.583     -89.2817    157.1289
             |
     mbsmoke#|
      c.mage |
     Smoker  |  -15.98608   9.006494    -1.77   0.082    -34.07616    2.103995
             |
     mbsmoke#|
     c.fbaby |
     Smoker  |   14.86782   57.58771     0.26   0.797    -100.8005    130.5362
             |
       _cons |   2976.807   151.0117    19.71   0.000     2673.491    3280.123
------------------------------------------------------------------------------

. margins r.mbsmoke, subpop(mbsmoke)

Contrasts of predictive margins                           Number of obs   = 60
Model VCE: OLS                                            Subpop. no. obs = 30

Expression: Linear prediction, predict()

------------------------------------------------
             |         df           F        P>F
-------------+----------------------------------
     mbsmoke |          1       73.30     0.0000
             |
 Denominator |         50
------------------------------------------------

------------------------------------------------------------------------
                       |            Delta-method
                       |   Contrast   std. err.     [95% conf. interval]
-----------------------+------------------------------------------------
               mbsmoke |
(Smoker vs Nonsmoker)  |  -437.4721   51.09606     -540.1015   -334.8426
------------------------------------------------------------------------

The ATET compares the two potential outcomes among the treated units only. That’s why margins needs the subpop(mbsmoke) option: it averages the contrast over the smokers alone.

11.1.2 Inverse Probability Weighting (IPW)

IPW estimators have two steps. The first step estimates the treatment model, that is, treatment as a function of some covariates, usually with a logit or probit model, and predicts each unit’s probability of treatment. The second step uses the inverse of the probability of the treatment actually received as a weight when comparing outcomes between treated and control units.

These steps produce consistent estimates of the effect parameters because the treatment is assumed to be independent of the potential outcomes after conditioning on the covariates.
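The second step can be illustrated with a small Python sketch. This is an assumption-laden toy, not what teffects does internally: the propensity scores are taken as known rather than estimated, the data are made up, and the function name ipw_ate is hypothetical. It uses normalized weights, so each arm’s estimate is a weighted mean.

```python
from statistics import mean

def ipw_ate(y, t, ps):
    """Difference of inverse-probability-weighted outcome means."""
    # Weight = 1 / Pr(treatment actually received)
    w = [1 / p if ti == 1 else 1 / (1 - p) for ti, p in zip(t, ps)]

    def wmean(arm):
        num = sum(wi * yi for wi, yi, ti in zip(w, y, t) if ti == arm)
        den = sum(wi for wi, ti in zip(w, t) if ti == arm)
        return num / den

    return wmean(1) - wmean(0)

# Made-up data with two covariate groups and a true effect of 2 everywhere.
# Group A: Pr(treated) = 0.25, untreated outcome 10.
# Group B: Pr(treated) = 0.75, untreated outcome 20.
t = [1, 0, 0, 0, 1, 1, 1, 0]
y = [12, 10, 10, 10, 22, 22, 22, 20]
ps = [0.25] * 4 + [0.75] * 4

naive = mean([yi for yi, ti in zip(y, t) if ti == 1]) - \
        mean([yi for yi, ti in zip(y, t) if ti == 0])
ipw = ipw_ate(y, t, ps)
print(naive, ipw)
```

The naive difference in means is 7 because treated units come mostly from the high-outcome group; the weighted comparison recovers the effect of 2.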

We can conduct the two steps manually, but the nice thing about Stata’s teffects is that its standard errors account for the fact that the probabilities were estimated in the first step.

11.1.2.1 Example

clear
webuse cattaneo2
teffects ipw (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu, probit)
probit mbsmoke mmarried c.mage##c.mage fbaby medu
predict ps
replace ps = 1/ps if mbsmoke==1
replace ps = 1/(1-ps) if mbsmoke==0
reg bweight mbsmoke [pweight=ps]


. clear

. webuse cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. teffects ipw (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu, probit)

Iteration 0:  EE criterion = 4.622e-21  
Iteration 1:  EE criterion = 8.453e-26  

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : inverse-probability weights
Outcome model  : weighted mean
Treatment model: probit
------------------------------------------------------------------------------
             |               Robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATE          |
     mbsmoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -230.6886   25.81524    -8.94   0.000    -281.2856   -180.0917
-------------+----------------------------------------------------------------
POmean       |
     mbsmoke |
  Nonsmoker  |   3403.463   9.571369   355.59   0.000     3384.703    3422.222
------------------------------------------------------------------------------

. probit mbsmoke mmarried c.mage##c.mage fbaby medu

Iteration 0:  Log likelihood = -2230.7484  
Iteration 1:  Log likelihood = -2042.6734  
Iteration 2:  Log likelihood = -2040.5088  
Iteration 3:  Log likelihood = -2040.5061  
Iteration 4:  Log likelihood = -2040.5061  

Probit regression                                       Number of obs =  4,642
                                                        LR chi2(5)    = 380.48
                                                        Prob > chi2   = 0.0000
Log likelihood = -2040.5061                             Pseudo R2     = 0.0853

------------------------------------------------------------------------------
     mbsmoke | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
    mmarried |  -.6484821   .0526991   -12.31   0.000    -.7517705   -.5451938
        mage |   .1744327   .0352437     4.95   0.000     .1053562    .2435092
             |
      c.mage#|
      c.mage |  -.0032559   .0006462    -5.04   0.000    -.0045224   -.0019894
             |
       fbaby |  -.2175962   .0491066    -4.43   0.000    -.3138433    -.121349
        medu |  -.0863631   .0098692    -8.75   0.000    -.1057064   -.0670198
       _cons |  -1.558255   .4511589    -3.45   0.001     -2.44251       -.674
------------------------------------------------------------------------------

. predict ps
(option pr assumed; Pr(mbsmoke))

. replace ps = 1/ps if mbsmoke==1
(864 real changes made)

. replace ps = 1/(1-ps) if mbsmoke==0
(3,778 real changes made)

. reg bweight mbsmoke [pweight=ps]
(sum of wgt is 9,193.96990537643)

Linear regression                               Number of obs     =      4,642
                                                F(1, 4640)        =      79.08
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0389
                                                Root MSE          =     573.36

------------------------------------------------------------------------------
             |               Robust
     bweight | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     mbsmoke |  -230.6886   25.94182    -8.89   0.000    -281.5469   -179.8303
       _cons |   3403.463   9.616992   353.90   0.000     3384.609    3422.317
------------------------------------------------------------------------------

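In the above example, we run teffects ipw and then replicate its two steps manually. The point estimates are identical (-230.69); only the standard error of the treatment effect differs (25.82 from teffects versus 25.94 from the weighted regression), because the manual regression treats the estimated weights as if they were known.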

11.1.3 Doubly robust estimators

The RA estimator models the outcome directly, while the IPW estimator models the treatment assignment. There is also a group of estimators called doubly robust estimators. They model both the outcome and the treatment, and only one of the two models needs to be correctly specified for the estimator to be consistent.

Stata implements the augmented-IPW (AIPW) combination proposed by Robins and Rotnitzky (1995) and the IPW-regression-adjustment (IPWRA) combination proposed by Wooldridge (2010).

The AIPW estimator augments the IPW estimator with a correction term. The term removes the bias if the treatment model is wrong and the outcome model is correct, and the term goes to 0 if the treatment model is correct and the outcome model is wrong.
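The role of the correction term can be seen in a toy Python sketch. It uses the standard textbook AIPW formula with made-up data and with propensity scores and outcome predictions supplied by hand rather than estimated; it is not a replication of teffects aipw, and the name aipw_ate is hypothetical.

```python
from statistics import mean

def aipw_ate(y, t, ps, mu0, mu1):
    """Outcome-model prediction plus an inverse-probability-weighted residual."""
    po1 = [m1 + ti * (yi - m1) / p
           for yi, ti, p, m1 in zip(y, t, ps, mu1)]
    po0 = [m0 + (1 - ti) * (yi - m0) / (1 - p)
           for yi, ti, p, m0 in zip(y, t, ps, mu0)]
    return mean(po1) - mean(po0)

# Made-up data: true effect is 2; group A has Pr(treated) = 0.25, group B 0.75
t = [1, 0, 0, 0, 1, 1, 1, 0]
y = [12, 10, 10, 10, 22, 22, 22, 20]

ps_right = [0.25] * 4 + [0.75] * 4
ps_wrong = [0.5] * 8                      # ignores the covariate
mu0_right = [10] * 4 + [20] * 4
mu1_right = [12] * 4 + [22] * 4
mu_wrong = [0] * 8                        # useless outcome model

outcome_ok = aipw_ate(y, t, ps_wrong, mu0_right, mu1_right)
treatment_ok = aipw_ate(y, t, ps_right, mu_wrong, mu_wrong)
both_wrong = aipw_ate(y, t, ps_wrong, mu_wrong, mu_wrong)
print(outcome_ok, treatment_ok, both_wrong)
```

In this example the estimate equals the true effect of 2 whenever either model is right, and falls back to the confounded difference of 7 only when both are wrong.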

The IPWRA estimator uses IPW probability weights when performing RA. The weights do not affect the accuracy of the RA estimator if the treatment model is wrong and the outcome model is correct. The weights correct the RA estimator if the treatment model is correct and the outcome model is wrong.
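A rough Python sketch of the IPWRA idea follows. It assumes one covariate, made-up data, and hand-supplied propensity scores; it fits a weighted least-squares line in each arm and then averages the predicted differences as in RA. The names wls_line and ipwra_ate are hypothetical, and this is an illustration of the principle rather than Stata’s algorithm.

```python
from statistics import mean

def wls_line(x, y, w):
    """Closed-form weighted least squares; returns (intercept, slope)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def ipwra_ate(y, t, x, ps):
    """RA in which each arm's regression is weighted by inverse probabilities."""
    fits = {}
    for arm in (0, 1):
        idx = [i for i, ti in enumerate(t) if ti == arm]
        w = [1 / ps[i] if arm == 1 else 1 / (1 - ps[i]) for i in idx]
        fits[arm] = wls_line([x[i] for i in idx], [y[i] for i in idx], w)
    (a0, b0), (a1, b1) = fits[0], fits[1]
    return mean([(a1 + b1 * xi) - (a0 + b0 * xi) for xi in x])

# Made-up data with a correctly specified (exactly linear) outcome model
x = [1, 2, 3, 4, 2, 3, 4, 5]
t = [0, 0, 0, 0, 1, 1, 1, 1]
y = [10 + 2 * xi if ti == 0 else 5 + 3 * xi for xi, ti in zip(x, t)]

# Arbitrary, deliberately unjustified propensity scores
ps_arbitrary = [0.2, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7, 0.8]
ate = ipwra_ate(y, t, x, ps_arbitrary)
print(ate)
```

Because the outcome model is exactly right here, the arbitrary weights leave the fitted lines unchanged and the estimate stays at the RA value of -2, which is the first half of the double-robustness claim.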

I have not worked out how to replicate these two commands manually in Stata, so we’ll just run one example of each.

clear
webuse cattaneo2
teffects ipwra (bweight mmarried mage prenatal1 fbaby)  (mbsmoke mmarried mage prenatal1 fbaby)
teffects aipw (bweight mmarried mage prenatal1 fbaby) (mbsmoke mmarried mage prenatal1 fbaby)


. clear

. webuse cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. teffects ipwra (bweight mmarried mage prenatal1 fbaby)  (mbsmoke mmarried mag
> e prenatal1 fbaby)

Iteration 0:  EE criterion = 9.066e-22  
Iteration 1:  EE criterion = 2.902e-26  

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : IPW regression adjustment
Outcome model  : linear
Treatment model: logit
------------------------------------------------------------------------------
             |               Robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATE          |
     mbsmoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -238.7679   24.38353    -9.79   0.000    -286.5587    -190.977
-------------+----------------------------------------------------------------
POmean       |
     mbsmoke |
  Nonsmoker  |   3402.851   9.538741   356.74   0.000     3384.155    3421.546
------------------------------------------------------------------------------

. teffects aipw (bweight mmarried mage prenatal1 fbaby) (mbsmoke mmarried mage 
> prenatal1 fbaby)

Iteration 0:  EE criterion = 2.115e-22  
Iteration 1:  EE criterion = 1.343e-26  

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : augmented IPW
Outcome model  : linear by ML
Treatment model: logit
------------------------------------------------------------------------------
             |               Robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATE          |
     mbsmoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -239.0294    24.2524    -9.86   0.000    -286.5632   -191.4955
-------------+----------------------------------------------------------------
POmean       |
     mbsmoke |
  Nonsmoker  |   3402.839   9.538926   356.73   0.000     3384.143    3421.535
------------------------------------------------------------------------------

11.2 Matching

First, let’s emphasize that matching usually does not solve the problem of endogenous treatment. Some people have the impression that matching resolves endogeneity, but in fact it only helps with selection on observables. If you have selection on unobservables, that is, unobserved confounders, matching does not help.

Matching aims to balance the distribution of covariates between the treatment and control groups. That’s all it does: it makes sure we compare apples with apples, not apples with oranges. In that sense, regression does this too, since we use regression to adjust for differences between treatment and control. However, regression sometimes relies too much on extrapolation when the covariate distributions of the treatment and control groups do not overlap. It’s not that matching can do magic on non-overlapping data, but it makes clear how bad the lack of overlap is. Running a regression blindly will not alert the researcher to the problem. Combining matching with regression is probably a better idea; that is, running the regression on the matched sample is recommended.

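Here we introduce a few popular matching methods available in Stata: nearest-neighbor matching (teffects nnmatch), propensity-score matching (teffects psmatch and psmatch2), and coarsened exact matching (cem).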

11.2.1 Propensity score matching

The ideal situation is exact matching; that is, each treated unit is matched to control units with exactly the same values of all covariates. This is often not possible as the number of covariates increases or when they are not discrete. In high dimensions (many covariates to match on), it is very hard or even impossible to find exact matches. Therefore, reducing the problem from high dimension to low dimension is key. Rosenbaum and Rubin (1983) proved that the propensity score provides a one-dimensional summary of the high-dimensional covariates. Therefore, we only need to find matches based on propensity scores.
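In symbols, the Rosenbaum and Rubin result can be stated as follows. The notation is an assumption made here for illustration: T is the treatment indicator, X the covariates, Y(0) and Y(1) the potential outcomes, and e(X) the propensity score.

```latex
\begin{aligned}
e(X) &= \Pr(T = 1 \mid X), \\
T &\perp X \mid e(X) \qquad \text{(balancing property)}, \\
\big(Y(0),\, Y(1)\big) \perp T \mid X
  \;&\Longrightarrow\;
\big(Y(0),\, Y(1)\big) \perp T \mid e(X), \qquad \text{given } 0 < e(X) < 1 .
\end{aligned}
```

The second line says that units with the same propensity score have the same covariate distribution in both groups; the third says that if conditioning on X removes confounding, conditioning on the scalar e(X) does too.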

In Stata, teffects psmatch estimates treatment effects after matching on propensity scores. The user-written psmatch2 command has been popular for propensity score matching too. The nice thing about these commands is that they do all the steps in one command: first they estimate the logit or probit model for the propensity score, then they match treated and control units, and finally they compare outcomes on the matched sample. The standard errors reported by teffects psmatch account for all of these steps, including the fact that the propensity score is estimated. If we do the steps manually, the standard errors will not be correct.

11.2.1.1 Example

clear
webuse cattaneo2
teffects psmatch (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu), atet
psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, logit ties
reg bweight mbsmoke   [aweight=_weight]


. clear

. webuse cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. teffects psmatch (bweight) (mbsmoke mmarried c.mage##c.mage fbaby medu), atet

Treatment-effects estimation                   Number of obs      =      4,642
Estimator      : propensity-score matching     Matches: requested =          1
Outcome model  : matching                                     min =          1
Treatment model: logit                                        max =         74
------------------------------------------------------------------------------
             |              AI robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
     mbsmoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -236.7848   26.57789    -8.91   0.000    -288.8765    -184.693
------------------------------------------------------------------------------

. psmatch2 mbsmoke mmarried c.mage##c.mage fbaby medu, logit ties

Logistic regression                                     Number of obs =  4,642
                                                        LR chi2(5)    = 375.00
                                                        Prob > chi2   = 0.0000
Log likelihood = -2043.2504                             Pseudo R2     = 0.0841

------------------------------------------------------------------------------
     mbsmoke | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
    mmarried |  -1.145706   .0918962   -12.47   0.000     -1.32582    -.965593
        mage |    .321518   .0638472     5.04   0.000     .1963798    .4466563
             |
      c.mage#|
      c.mage |  -.0060368   .0011849    -5.09   0.000    -.0083592   -.0037144
             |
       fbaby |  -.3864258   .0880445    -4.39   0.000    -.5589898   -.2138618
        medu |  -.1420833   .0173215    -8.20   0.000    -.1760328   -.1081338
       _cons |  -2.950915   .8102504    -3.64   0.000    -4.538976   -1.362853
------------------------------------------------------------------------------

. reg bweight mbsmoke   [aweight=_weight]
(sum of wgt is 1,728)

      Source |       SS           df       MS      Number of obs   =     3,671
-------------+----------------------------------   F(1, 3669)      =    151.87
       Model |    51455506         1    51455506   Prob > F        =    0.0000
    Residual |  1.2431e+09     3,669  338808.237   R-squared       =    0.0397
-------------+----------------------------------   Adj R-squared   =    0.0395
       Total |  1.2945e+09     3,670  352736.492   Root MSE        =    582.07

------------------------------------------------------------------------------
     bweight | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     mbsmoke |  -236.7848   19.21387   -12.32   0.000    -274.4557   -199.1138
       _cons |   3374.444   13.58626   248.37   0.000     3347.807    3401.082
------------------------------------------------------------------------------

.

In the above example, suppose we are interested in the effect of mbsmoke on bweight. If we match smoking mothers with non-smoking mothers on marital status, age, first-baby status, and education, teffects psmatch gives us the treatment effect on the treated. We can also do this in two steps. First, psmatch2 generates _weight, which records how heavily each control unit is used as a match for treated units. Notice that to reproduce the teffects result we specify the logit and ties options: by default, teffects psmatch uses a logit model and includes all ties (control units whose propensity scores are equally close), whereas psmatch2 by default uses a probit model and keeps only one of the tied matches. Then, in the second step, we run a regression with _weight as the weight. The point estimate is the same (-236.78), but the standard errors differ (26.58 versus 19.21). We should use the standard errors reported by teffects.
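How ties and match weights interact can be shown with a toy Python sketch. It assumes made-up scores and outcomes and one-nearest-neighbor matching with replacement; the function name match_on_score and the rule of splitting a treated unit’s weight equally among tied controls are illustrative assumptions, not a description of either command’s internals.

```python
def match_on_score(score, t, y, tol=1e-12):
    """1-nearest-neighbor matching with replacement on a scalar score, keeping ties."""
    controls = [i for i, ti in enumerate(t) if ti == 0]
    treated = [i for i, ti in enumerate(t) if ti == 1]
    weight = {j: 0.0 for j in controls}   # how heavily each control is used
    effects = []
    for i in treated:
        dmin = min(abs(score[i] - score[j]) for j in controls)
        ties = [j for j in controls if abs(score[i] - score[j]) - dmin <= tol]
        for j in ties:
            weight[j] += 1 / len(ties)    # tied controls share one unit of weight
        matched_mean = sum(y[j] for j in ties) / len(ties)
        effects.append(y[i] - matched_mean)
    atet = sum(effects) / len(effects)
    return atet, weight

# Made-up data: two treated units followed by four controls
t     = [1,   1,   0,   0,   0,   0]
score = [0.6, 0.8, 0.5, 0.5, 0.9, 0.1]
y     = [15,  25,  10,  14,  20,  5]

atet, weight = match_on_score(score, t, y)
print(atet, weight)
```

The first treated unit is equally close to two controls, which each receive weight 0.5; the last control is never used and gets weight 0. The control weights sum to the number of treated units, and a weighted difference in means with these weights equals the matching estimate of 4.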

11.2.2 Nearest neighbor matching

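Stata’s teffects nnmatch matches on the covariates directly. By default it uses the Mahalanobis distance, and it can also require exact matches on selected variables (the ematch() option). The advantage is that it is nonparametric: no model for the treatment or the outcome has to be specified.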
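For reference, the Mahalanobis distance between the covariate vectors of units i and j is usually defined as below, where S is assumed to be the sample covariance matrix of the matching covariates (the symbols are introduced here for illustration).

```latex
d(x_i, x_j) = \sqrt{(x_i - x_j)^{\top} S^{-1} (x_i - x_j)}
```

Scaling by the inverse covariance matrix puts covariates measured in different units, such as age in years and a 0/1 marriage indicator, on a comparable footing and discounts differences along directions in which the covariates are strongly correlated.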

clear
webuse cattaneo2
teffects nnmatch  (bweight mmarried mage fbaby medu) (mbsmoke) , atet
teffects psmatch (bweight) (mbsmoke mmarried mage fbaby medu), atet


. clear

. webuse cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. teffects nnmatch  (bweight mmarried mage fbaby medu) (mbsmoke) , atet

Treatment-effects estimation                   Number of obs      =      4,642
Estimator      : nearest-neighbor matching     Matches: requested =          1
Outcome model  : matching                                     min =          1
Distance metric: Mahalanobis                                  max =         74
------------------------------------------------------------------------------
             |              AI robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
     mbsmoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |  -239.2433   25.68524    -9.31   0.000    -289.5854   -188.9011
------------------------------------------------------------------------------

. teffects psmatch (bweight) (mbsmoke mmarried mage fbaby medu), atet

Treatment-effects estimation                   Number of obs      =      4,642
Estimator      : propensity-score matching     Matches: requested =          1
Outcome model  : matching                                     min =          1
Treatment model: logit                                        max =         74
------------------------------------------------------------------------------
             |              AI robust
     bweight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
ATET         |
     mbsmoke |
    (Smoker  |
         vs  |
 Nonsmoker)  |   -245.711   26.38675    -9.31   0.000    -297.4281   -193.9939
------------------------------------------------------------------------------

.

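Notice that the two estimators are specified differently: teffects nnmatch lists the matching covariates in the outcome equation and matches on them directly, while teffects psmatch lists them in the treatment equation and matches on the propensity score estimated from a logit model. The two ATET estimates are close (-239.24 versus -245.71).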

11.2.3 Coarsened Exact Matching (CEM)

CEM is not part of teffects, but it is available in Stata through the user-written cem package.

clear
webuse cattaneo2
cem  mmarried mage fbaby medu, treatment(mbsmoke)
reg bweight mbsmoke   [aweight=cem_weights]


. clear

. webuse cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138–154)

. cem  mmarried mage fbaby medu, treatment(mbsmoke)

Matching Summary:
-----------------
Number of strata: 274
Number of matched strata: 135

              0     1
      All  3778   864
  Matched  3421   827
Unmatched   357    37


Multivariate L1 distance: .25424705

Univariate imbalance:

               L1     mean      min      25%      50%      75%      max
mmarried  6.4e-15  8.4e-15        0        0        0        0        0
    mage   .04955  -.00887        1        0        1        0       -2
   fbaby  6.1e-15  6.1e-15        0        0        0        0        0
    medu   .03809  -.03316        0        0        0        0        0

. reg bweight mbsmoke   [aweight=cem_weights]
(sum of wgt is 4,248.00000000005)

      Source |       SS           df       MS      Number of obs   =     4,248
-------------+----------------------------------   F(1, 4246)      =    121.12
       Model |  40566772.9         1  40566772.9   Prob > F        =    0.0000
    Residual |  1.4221e+09     4,246  334928.257   R-squared       =    0.0277
-------------+----------------------------------   Adj R-squared   =    0.0275
       Total |  1.4627e+09     4,247   344401.26   Root MSE        =    578.73

------------------------------------------------------------------------------
     bweight | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     mbsmoke |  -246.8017   22.42533   -11.01   0.000    -290.7671   -202.8364
       _cons |   3383.394   9.894625   341.94   0.000     3363.996    3402.793
------------------------------------------------------------------------------

.

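In the above example, we run CEM on the same covariates and then apply the resulting weights in the final regression. Here 394 of the 4,642 observations (357 nonsmokers and 37 smokers) fall in strata without a match and are dropped. The disadvantage is that the regression’s standard errors do not account for the matching step that produced those weights.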
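The weights can be understood through a small Python sketch. It assumes made-up data, a hand-picked coarsening (age in ten-year bands), and the usual CEM weighting rule in which matched treated units get weight 1 and matched controls are rescaled to the treated distribution across strata; the function name cem_weights is hypothetical and the cem package’s automatic binning is not reproduced.

```python
from collections import Counter

def cem_weights(strata, t):
    """Weights after coarsened exact matching, given each unit's stratum label."""
    n_t = Counter(s for s, ti in zip(strata, t) if ti == 1)
    n_c = Counter(s for s, ti in zip(strata, t) if ti == 0)
    # A stratum is matched only if it holds both treated and control units
    matched = {s for s in n_t if n_c[s] > 0}
    m_t = sum(n_t[s] for s in matched)    # matched treated units overall
    m_c = sum(n_c[s] for s in matched)    # matched control units overall
    weights = []
    for s, ti in zip(strata, t):
        if s not in matched:
            weights.append(0.0)           # unmatched units are dropped
        elif ti == 1:
            weights.append(1.0)
        else:
            weights.append((n_t[s] / n_c[s]) * (m_c / m_t))
    return weights

# Made-up data: (married, age, treated)
units = [(1, 23, 1), (1, 27, 0), (1, 25, 0), (0, 31, 1),
         (0, 35, 1), (0, 38, 0), (0, 45, 0), (1, 52, 1)]

# Coarsen age into ten-year bands, then match exactly on (married, band)
strata = [(married, age // 10) for married, age, _ in units]
t = [treated for _, _, treated in units]

w = cem_weights(strata, t)
print(w)
```

Two of the eight toy units sit in strata with no counterpart and receive weight 0. The remaining weights sum to the number of matched units, which is the same pattern as the “sum of wgt” line reported by the weighted regression.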

11.3 Modern matching in R: balance diagnostics and entropy weighting

The 2019 vintage of this chapter used Stata’s teffects throughout. R’s MatchIt and WeightIt packages have since become the standard for matching workflows, and cobalt provides publication-quality balance tables and plots that are now expected in applied papers.

11.3.1 Balance checking with cobalt

After any matching or weighting step, always inspect covariate balance. cobalt’s bal.tab() and love.plot() are the go-to tools:

library(MatchIt)
library(cobalt)

# Propensity score matching on the lalonde job-training data shipped with MatchIt
data("lalonde", package = "MatchIt")

# 1:1 nearest-neighbor matching on a logistic-regression propensity score.
# In MatchIt 4.0 and later, race is a single factor (black/hispan/white).
m_out <- matchit(treat ~ age + educ + race + married + nodegree +
                   re74 + re75,
                 data = lalonde, method = "nearest", distance = "glm",
                 ratio = 1)

# Balance table: standardized mean differences before (un = TRUE) and after matching
bal.tab(m_out, stats = c("mean.diffs", "variance.ratios"), un = TRUE,
        thresholds = c(m = 0.1))

# Love plot: visualize balance improvement
love.plot(m_out, stats = "mean.diffs", thresholds = c(m = 0.1),
          abs = TRUE, var.order = "unadjusted")

The love plot shows each covariate’s standardized mean difference (SMD) before and after matching, with the two samples distinguished in the legend. The vertical line at SMD = 0.1 is a conventional threshold for “balanced.” If post-matching points still lie to the right of the line, those covariates remain imbalanced and a different matching specification is needed.
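What an SMD measures can be shown with a short Python sketch. It assumes made-up data and one common convention, dividing by the pooled standard deviation of the unadjusted groups; cobalt’s own default denominator depends on the estimand, so this is an illustration rather than a reimplementation, and the name smd is hypothetical.

```python
from math import sqrt
from statistics import variance

def smd(x, t, w=None):
    """Standardized mean difference with an unweighted pooled-SD denominator."""
    if w is None:
        w = [1.0] * len(x)

    def wmean(arm):
        num = sum(wi * xi for wi, xi, ti in zip(w, x, t) if ti == arm)
        den = sum(wi for wi, ti in zip(w, t) if ti == arm)
        return num / den

    # The denominator stays fixed so that weighting cannot shrink it
    var1 = variance([xi for xi, ti in zip(x, t) if ti == 1])
    var0 = variance([xi for xi, ti in zip(x, t) if ti == 0])
    return (wmean(1) - wmean(0)) / sqrt((var1 + var0) / 2)

# Made-up covariate: three treated units followed by three controls
x = [2, 4, 6, 0, 2, 4]
t = [1, 1, 1, 0, 0, 0]

before = smd(x, t)
# Hypothetical matching weights that favor controls with larger x
after = smd(x, t, w=[1, 1, 1, 0, 1, 3])
print(before, after)
```

Here the raw groups differ by one full standard deviation, and the weights bring the difference down to 0.25, which is still above the 0.1 line.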

11.3.2 Entropy balancing

Propensity score matching discards unmatched units and can reduce effective sample size substantially. Entropy balancing (Hainmueller 2012) keeps all units and, for the ATT, reweights the control group so that its weighted covariate moments exactly match those of the treatment group, without discarding anyone.

library(WeightIt)
library(cobalt)

# Entropy balancing for the ATT: exact balance on the covariate means
# (uses the lalonde data loaded above; race is a factor in MatchIt >= 4.0)
w_out <- weightit(treat ~ age + educ + race + married + nodegree +
                    re74 + re75,
                  data = lalonde, method = "ebal", estimand = "ATT",
                  moments = 1)   # moments = 1: balance means only
                                 # moments = 2: balance means + variances

summary(w_out)

# Check balance
bal.tab(w_out, stats = "mean.diffs", thresholds = c(m = 0.1))
love.plot(w_out, thresholds = c(m = 0.1), abs = TRUE)

# Weighted outcome regression with heteroskedasticity-robust (HC3) standard errors
library(lmtest); library(sandwich)
m_eb <- lm(re78 ~ treat + age + educ + race + married + nodegree +
             re74 + re75,
           data = lalonde, weights = w_out$weights)
coeftest(m_eb, vcov = vcovHC(m_eb, type = "HC3"))["treat", ]
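The reweighting idea behind entropy balancing can also be sketched in a few lines of Python. This toy assumes a single covariate, balance on the mean only, and uniform base weights, in which case the solution has the exponential-tilting form w_j proportional to exp(lambda * x_j); lambda is found by bisection. The function name ebal_weights and the data are hypothetical, and WeightIt’s optimizer is not reproduced.

```python
from math import exp

def ebal_weights(x_control, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Control weights (summing to 1) whose weighted mean of x hits target_mean."""
    def tilted(lam):
        # Subtract the largest exponent before exponentiating to avoid overflow
        m = max(lam * xi for xi in x_control)
        raw = [exp(lam * xi - m) for xi in x_control]
        total = sum(raw)
        return [r / total for r in raw]

    def wmean(lam):
        return sum(wi * xi for wi, xi in zip(tilted(lam), x_control))

    # The weighted mean increases with lam, so bisection finds the root
    for _ in range(iters):
        mid = (lo + hi) / 2
        if wmean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted((lo + hi) / 2)

# Made-up data: controls average 3.0, but the treated group averages 4.0
x_control = [1, 2, 3, 4, 5]
treated_mean = 4.0

w = ebal_weights(x_control, treated_mean)
balanced_mean = sum(wi * xi for wi, xi in zip(w, x_control))
print(w, balanced_mean)
```

Every control keeps a positive weight, so no unit is discarded, but controls with larger x count for more. The target mean must lie strictly inside the range of the control values for a solution to exist, which is the overlap requirement in another form.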

Method	Pros	Cons
PSM (nearest neighbor)	Intuitive; respects overlap	Discards units; balance not guaranteed
CEM	Exact balance in bins	Requires discretizing continuous vars
Entropy balancing	Exact mean balance; keeps all units	Weights can be extreme; no variance balance by default
IPW (logistic)	Simple; can be combined with an outcome model for double robustness	Extreme weights if overlap is poor

For many applications, entropy balancing or IPW with trimmed weights (via WeightIt’s trim()) is preferred over PSM: both typically retain a larger effective sample size, entropy balancing guarantees mean balance by construction, and both integrate easily with doubly robust estimators.

See also the weighting chapter for IPW estimators and the OLS-ATE chapter for regression-based approaches.