Comparing coefficients across regressions is a common task, and the Chow test is one way to do it. Comparing the coefficients of regressions run on two subsets is the original Chow test.
The idea is to interact the subset indicator with all the covariates, or only with the covariate you are interested in (the treatment). If you interact the dummy only with the treatment variable, you are assuming that all other covariates have the same effect across the two subsets. This may or may not be reasonable.
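To make the two options concrete, here is one way to write them down. The symbols are my own shorthand and do not appear elsewhere in this post: \(S\) is the subset indicator, \(D\) is the treatment of interest, and \(X\) stands for the other covariates. The fully interacted model is
\[ Y = \alpha_0 + \alpha_1 S + \beta_1 (S \cdot D) + \beta_0 ((1-S) \cdot D) + \theta_1 (S \cdot X) + \theta_0 ((1-S) \cdot X) + \epsilon \]
and the hypothesis of interest is \(H_0: \beta_1 = \beta_0\). Interacting only the treatment amounts to imposing \(\theta_1 = \theta_0 = \theta\):
\[ Y = \alpha_0 + \alpha_1 S + \beta_1 (S \cdot D) + \beta_0 ((1-S) \cdot D) + \theta X + \epsilon \]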
This post is inspired by Austin Nichols (https://www.stata.com/statalist/archive/2009-11/msg01485.html). The case with overlapping samples is taken entirely from his code.
Let’s see a simple example:
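est clear
sysuse nlsw88, clear
* separate regressions for the two subsets
reg wage hours if south
est sto south
reg wage hours if !south
est sto nonsouth
* combine the two stored regressions
suest south nonsouth
est sto suest
* Chow-type pooled regression with subset-specific hours effects
gen hours1=hours*(south==1)
gen hours2=hours*(south==0)
reg wage south hours?
est sto chow
test _b[hours1]-_b[hours2]=0
esttab south nonsouth suest chow, nogaps mti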
In the above example, we are interested in the effect of hours on wage in the south and non-south subsets. The Chow test uses the entire sample, which contains both south and non-south observations, and interacts the south indicator with hours to estimate a separate hours effect for each subset. Because both subsamples enter the same regression, we can test the equality of the two coefficients directly.
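The same pooled regression can also be written with Stata's factor-variable notation instead of hand-made interaction variables. This is only a sketch of an equivalent parameterization, not the code used in this post: here the interaction coefficient is the difference between the two slopes, so the equality test becomes a test that this coefficient is zero.

```stata
* Sketch: Chow-type comparison with factor variables
sysuse nlsw88, clear

* i.south##c.hours expands to the south dummy, hours, and south x hours
regress wage i.south##c.hours

* the interaction coefficient is (slope in south) - (slope in non-south),
* so this tests equality of the two slopes
test 1.south#c.hours

* slope of hours within the south subset
lincom hours + 1.south#c.hours
```

With this parameterization the coefficient on hours is the non-south slope, and “lincom” recovers the south slope.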
The other way to do this is Stata’s “suest” command. It combines the two stored regressions with their joint variance-covariance structure, after which the difference between two coefficients can be tested. However, “suest” does not work after some commands. In my opinion, using interactions is more flexible.
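For completeness, here is a minimal sketch of the test itself after “suest”. It assumes the two regressions are stored under the names south and nonsouth as in the example above; “suest” labels the coefficient equations with the stored name followed by _mean, as its output table shows.

```stata
* Sketch: cross-model test after suest
sysuse nlsw88, clear
regress wage hours if south
estimates store south
regress wage hours if !south
estimates store nonsouth

* combine the stored results; suest reports robust standard errors
suest south nonsouth

* equality of the hours coefficient across the two equations
test [south_mean]hours = [nonsouth_mean]hours
```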
3.2 A comparison with two different outcomes
We can also use the same idea to compare the effect of some treatment on two different outcomes, if we have the same set of covariates. We just need to “stack” the two outcomes and run a pooled regression with some interactions.
Here is an example.
est clear
sysuse nlsw88, clear
reg south wage hours
est sto south
reg smsa wage hours
est sto smsa
suest south smsa
est sto suest
preserve
* stack the two outcomes into one long variable Y
gen Y1=south
gen Y2=smsa
gen id=_n
reshape long Y, i(id) j(subsample)
* outcome-specific copies of each covariate
gen wage1=wage*(subsample==1)
gen wage2=wage*(subsample==2)
gen hours1=hours*(subsample==1)
gen hours2=hours*(subsample==2)
reg Y wage? hours? subsample
test _b[wage1]-_b[wage2]=0
est sto stacked
* return to the original wide data
restore
esttab south smsa suest stacked, nogaps mti
In this example, we are interested in comparing the effect of wage on south vs. smsa (not interesting, but it serves as an example). I reshape the data to long format, stacking south and smsa as “Y”, then create interactions of the covariates with the subsample indicator, and finally regress Y on the interaction terms.
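After stacking, every person appears twice, so the two rows of the same person are unlikely to have independent errors. A hedged variation on the example above is to cluster on the person identifier, as is done in the overlapping-samples example later in this post. The point estimates do not change; only the standard errors do, and they should then be close to the robust ones from “suest”.

```stata
* Sketch: stacked regression with standard errors clustered by person
sysuse nlsw88, clear
preserve
gen Y1 = south
gen Y2 = smsa
gen id = _n
reshape long Y, i(id) j(subsample)
gen wage1  = wage*(subsample==1)
gen wage2  = wage*(subsample==2)
gen hours1 = hours*(subsample==1)
gen hours2 = hours*(subsample==2)

* each id contributes two rows, one per outcome
regress Y wage? hours? subsample, vce(cluster id)
test wage1 = wage2
restore
```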
3.3 Overlapping samples
What if we’d like to compare coefficients for two overlapping subsamples? As I mentioned, Austin Nichols gave the following example:
est clear
sysuse nlsw88, clear
ta south smsa
reg wage hours if south
est sto south
reg wage hours if smsa
est sto smsa
suest south smsa
est sto suest
preserve
expand 2
bys idcode: g n=_n
keep if (n==1&south)|(n==2&smsa)
g hours1=hours*!(n==1&south)
g hours2=hours*!(n==2&smsa)
reg wage hours? n, cl(idcode)
est sto stacked
restore
esttab south smsa suest stacked, nogaps mti
(NLSW, 1988 extract)
Lives in | Lives in SMSA
the south | Not SMSA SMSA | Total
-----------+----------------------+----------
Not south | 308 996 | 1,304
South | 357 585 | 942
-----------+----------------------+----------
Total | 665 1,581 | 2,246
Source | SS df MS Number of obs = 938
-------------+---------------------------------- F(1, 936) = 12.47
Model | 344.732583 1 344.732583 Prob > F = 0.0004
Residual | 25866.3404 936 27.634979 R-squared = 0.0132
-------------+---------------------------------- Adj R-squared = 0.0121
Total | 26211.0729 937 27.973397 Root MSE = 5.2569
------------------------------------------------------------------------------
wage | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours | .0623497 .0176532 3.53 0.000 .0277053 .0969941
_cons | 4.520583 .6957145 6.50 0.000 3.155242 5.885923
------------------------------------------------------------------------------
Source | SS df MS Number of obs = 1,578
-------------+---------------------------------- F(1, 1576) = 46.48
Model | 1594.14881 1 1594.14881 Prob > F = 0.0000
Residual | 54048.2539 1,576 34.2945773 R-squared = 0.0286
-------------+---------------------------------- Adj R-squared = 0.0280
Total | 55642.4027 1,577 35.2837049 Root MSE = 5.8562
------------------------------------------------------------------------------
wage | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours | .0953636 .0139872 6.82 0.000 .0679281 .1227991
_cons | 4.861519 .5443826 8.93 0.000 3.793729 5.929309
------------------------------------------------------------------------------
Simultaneous results for south, smsa Number of obs = 1,934
------------------------------------------------------------------------------
| Robust
| Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
south_mean |
hours | .0623497 .0174432 3.57 0.000 .0281617 .0965378
_cons | 4.520583 .6599613 6.85 0.000 3.227082 5.814083
-------------+----------------------------------------------------------------
south_lnvar |
_cons | 3.319082 .1435356 23.12 0.000 3.037758 3.600407
-------------+----------------------------------------------------------------
smsa_mean |
hours | .0953636 .0132806 7.18 0.000 .069334 .1213931
_cons | 4.861519 .4842914 10.04 0.000 3.912325 5.810713
-------------+----------------------------------------------------------------
smsa_lnvar |
_cons | 3.534987 .0910825 38.81 0.000 3.356469 3.713506
------------------------------------------------------------------------------
(2,246 observations created)
(1,969 observations deleted)
(7 missing values generated)
(7 missing values generated)
Linear regression Number of obs = 2,516
F(3, 1933) = 40.81
Prob > F = 0.0000
R-squared = 0.0399
Root MSE = 5.6403
(Std. err. adjusted for 1,934 clusters in idcode)
------------------------------------------------------------------------------
| Robust
wage | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours1 | .0953636 .0132886 7.18 0.000 .0693022 .121425
hours2 | .0623497 .0174536 3.57 0.000 .0281198 .0965796
n | .3409364 .6124145 0.56 0.578 -.8601261 1.541999
_cons | 4.179646 1.177889 3.55 0.000 1.869579 6.489713
------------------------------------------------------------------------------
----------------------------------------------------------------------------
(1) (2) (3) (4)
south smsa suest stacked
----------------------------------------------------------------------------
main
hours 0.0623*** 0.0954*** 0.0623***
(3.53) (6.82) (3.57)
hours1 0.0954***
(7.18)
hours2 0.0623***
(3.57)
n 0.341
(0.56)
_cons 4.521*** 4.862*** 4.521*** 4.180***
(6.50) (8.93) (6.85) (3.55)
----------------------------------------------------------------------------
south_lnvar
_cons 3.319***
(23.12)
----------------------------------------------------------------------------
smsa_mean
hours 0.0954***
(7.18)
_cons 4.862***
(10.04)
----------------------------------------------------------------------------
smsa_lnvar
_cons 3.535***
(38.81)
----------------------------------------------------------------------------
N 938 1578 1934 2516
----------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
3.4 IV regression
What about IV regression?
sysuse nlsw88, clear
ivregress 2sls wage (hours=union) if south
ivregress 2sls wage (hours=union) if !south
gen hours1=hours*(south==1)
gen hours2=hours*(south==0)
gen union1=union*(south==1)
gen union2=union*(south==0)
ivregress 2sls wage south (hours? = union?)
The same Chow-type approach works with IV regression, as long as we have the right interaction terms: both the endogenous variable and the instrument are interacted with the subsample indicator.
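The pooled IV regression above estimates the two coefficients but does not test them. A minimal sketch of the test follows. The “vce(robust)” option is my addition, not part of the example above; it is there because the two subsamples need not share the same error variance.

```stata
* Sketch: equality test after the pooled IV regression
sysuse nlsw88, clear
gen hours1 = hours*(south==1)
gen hours2 = hours*(south==0)
gen union1 = union*(south==1)
gen union2 = union*(south==0)

* both endogenous variables and both instruments are subset-specific
ivregress 2sls wage south (hours1 hours2 = union1 union2), vce(robust)

* H0: the effect of hours is the same in both subsets
test hours1 = hours2
```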
3.5 IV with fixed effects
However, with a fixed-effects IV regression I ran into difficulties. In this example, I use “ivreghdfe” to run an IV regression with fixed effects. We can also use “xtivreg2”, but “ivreghdfe” is supposed to be faster.
sysuse nlsw88, clear
gen hours1=hours*(south==1)
gen hours2=hours*(south==0)
gen union1=union*(south==1)
gen union2=union*(south==0)
ivreghdfe wage (hours=union) if south, a(race) cluster(race)
ivreghdfe wage (hours=union) if !south, a(race) cluster(race)
ivreghdfe wage south (hours? = union?) , a(race) cluster(race)
(NLSW, 1988 extract)
(4 missing values generated)
(4 missing values generated)
(368 missing values generated)
(368 missing values generated)
(MWFE estimator converged in 1 iterations)
IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on race
Number of clusters (race) = 3 Number of obs = 798
F( 1, 2) = 69.28
Prob > F = 0.0141
Total (centered) SS = 12186.8806 Centered R2 = -6.4527
Total (uncentered) SS = 12186.8806 Uncentered R2 = -6.4527
Residual SS = 90825.44631 Root MSE = 10.68
------------------------------------------------------------------------------
| Robust
wage | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours | 1.107099 .133012 8.32 0.014 .5347947 1.679404
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic): 1.756
Chi-sq(1) P-val = 0.1852
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic): 3.233
(Kleibergen-Paap rk Wald F statistic): 13.977
Stock-Yogo weak ID test critical values: 10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 0.000
(equation exactly identified)
------------------------------------------------------------------------------
Instrumented: hours
Excluded instruments: union
Partialled-out: _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
race | 3 3 0 *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
(MWFE estimator converged in 1 iterations)
IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on race
Number of clusters (race) = 3 Number of obs = 1079
F( 1, 2) = 1.41
Prob > F = 0.3564
Total (centered) SS = 19086.22115 Centered R2 = -2.3302
Total (uncentered) SS = 19086.22115 Uncentered R2 = -2.3302
Residual SS = 63560.97528 Root MSE = 7.682
------------------------------------------------------------------------------
| Robust
wage | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours | .7023322 .5905772 1.19 0.356 -1.838717 3.243381
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic): 0.789
Chi-sq(1) P-val = 0.3745
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic): 5.501
(Kleibergen-Paap rk Wald F statistic): 3.772
Stock-Yogo weak ID test critical values: 10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 0.000
(equation exactly identified)
------------------------------------------------------------------------------
Instrumented: hours
Excluded instruments: union
Partialled-out: _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
race | 3 3 0 *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
(MWFE estimator converged in 1 iterations)
IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on race
Number of clusters (race) = 3 Number of obs = 1877
F( 3, 2) = 1036.24
Prob > F = 0.0010
Total (centered) SS = 32142.319 Centered R2 = -3.9041
Total (uncentered) SS = 32142.319 Uncentered R2 = -3.9041
Residual SS = 157627.5718 Root MSE = 9.174
------------------------------------------------------------------------------
| Robust
wage | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours1 | 1.142036 .0650507 17.56 0.003 .8621454 1.421927
hours2 | .6880691 .5265185 1.31 0.321 -1.577357 2.953495
south | -19.74773 22.10811 -0.89 0.466 -114.8712 75.37577
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic): 1.793
Chi-sq(1) P-val = 0.1806
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic): 3.584
(Kleibergen-Paap rk Wald F statistic): 8.484
Stock-Yogo weak ID test critical values: 10% maximal IV size 7.03
15% maximal IV size 4.58
20% maximal IV size 3.95
25% maximal IV size 3.63
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Warning: estimated covariance matrix of moment conditions not of full rank.
overidentification statistic not reported, and standard errors and
model tests should be interpreted with caution.
Possible causes:
number of clusters insufficient to calculate robust covariance matrix
singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
partial option may address problem.
------------------------------------------------------------------------------
Instrumented: hours1 hours2
Included instruments: south
Excluded instruments: union1 union2
Partialled-out: _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
race | 3 3 0 *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
The third regression fails to replicate the coefficients of the first two. Why? Because the fixed effects also need to be interacted with the subsample indicator; otherwise the race effects are forced to be the same in both subsamples.
Here is another try:
sysuse nlsw88, clear
gen hours1=hours*(south==1)
gen hours2=hours*(south==0)
gen union1=union*(south==1)
gen union2=union*(south==0)
gen race1=race*(south==1)
gen race2=race*(south==0)
ivreghdfe wage (hours=union) if south, a(race) cluster(race)
ivreghdfe wage (hours=union) if !south, a(race) cluster(race)
ivregress 2sls wage south (hours? = union?) i.race?, cluster(race)
(NLSW, 1988 extract)
(4 missing values generated)
(4 missing values generated)
(368 missing values generated)
(368 missing values generated)
(MWFE estimator converged in 1 iterations)
IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on race
Number of clusters (race) = 3 Number of obs = 798
F( 1, 2) = 69.28
Prob > F = 0.0141
Total (centered) SS = 12186.8806 Centered R2 = -6.4527
Total (uncentered) SS = 12186.8806 Uncentered R2 = -6.4527
Residual SS = 90825.44631 Root MSE = 10.68
------------------------------------------------------------------------------
| Robust
wage | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours | 1.107099 .133012 8.32 0.014 .5347947 1.679404
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic): 1.756
Chi-sq(1) P-val = 0.1852
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic): 3.233
(Kleibergen-Paap rk Wald F statistic): 13.977
Stock-Yogo weak ID test critical values: 10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 0.000
(equation exactly identified)
------------------------------------------------------------------------------
Instrumented: hours
Excluded instruments: union
Partialled-out: _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
race | 3 3 0 *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
(MWFE estimator converged in 1 iterations)
IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on race
Number of clusters (race) = 3 Number of obs = 1079
F( 1, 2) = 1.41
Prob > F = 0.3564
Total (centered) SS = 19086.22115 Centered R2 = -2.3302
Total (uncentered) SS = 19086.22115 Uncentered R2 = -2.3302
Residual SS = 63560.97528 Root MSE = 7.682
------------------------------------------------------------------------------
| Robust
wage | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours | .7023322 .5905772 1.19 0.356 -1.838717 3.243381
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic): 0.789
Chi-sq(1) P-val = 0.3745
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic): 5.501
(Kleibergen-Paap rk Wald F statistic): 3.772
Stock-Yogo weak ID test critical values: 10% maximal IV size 16.38
15% maximal IV size 8.96
20% maximal IV size 6.66
25% maximal IV size 5.53
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments): 0.000
(equation exactly identified)
------------------------------------------------------------------------------
Instrumented: hours
Excluded instruments: union
Partialled-out: _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
race | 3 3 0 *|
-----------------------------------------------------+
* = FE nested within cluster; treated as redundant for DoF computation
note: 3.race1 omitted because of collinearity.
note: 3.race2 omitted because of collinearity.
Instrumental-variables 2SLS regression Number of obs = 1,877
Wald chi2(7) = 101696.08
Prob > chi2 = 0.0000
Root MSE = 9.0693
(Std. err. adjusted for 3 clusters in race)
------------------------------------------------------------------------------
| Robust
wage | Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
hours1 | 1.107099 .1085357 10.20 0.000 .8943732 1.319825
hours2 | .7023322 .4819806 1.46 0.145 -.2423324 1.646997
south | -21.52299 22.35225 -0.96 0.336 -65.33259 22.28662
|
race1 |
1 | 3.054296 .1451752 21.04 0.000 2.769758 3.338834
2 | 1.987268 .176538 11.26 0.000 1.64126 2.333277
3 | 0 (omitted)
|
race2 |
1 | -.4816127 .434723 -1.11 0.268 -1.333654 .3704287
2 | -2.065527 .7694573 -2.68 0.007 -3.573635 -.5574181
3 | 0 (omitted)
|
_cons | -17.01633 18.01689 -0.94 0.345 -52.32879 18.29614
------------------------------------------------------------------------------
Endogenous: hours1 hours2
Exogenous: south 1.race1 2.race1 1.race2 2.race2 union1 union2
This works: the coefficients on hours1 and hours2 now match the two subsample regressions. Basically, we include dummies that are interactions of the subsample indicator with the fixed-effect dummies. This is not practical if we have a lot of fixed-effect units.
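But we can trick “ivreghdfe” by absorbing the two interacted variables, race1 and race2, as if they were two-way fixed effects. This way we can test whether the effect of hours differs between the two samples:
sysuse nlsw88, clear
gen hours1=hours*(south==1)
gen hours2=hours*(south==0)
gen union1=union*(south==1)
gen union2=union*(south==0)
gen race1=race*(south==1)
gen race2=race*(south==0)
ivreghdfe wage south (hours? = union?) , a(race1 race2) cluster(race)
test _b[hours1]-_b[hours2]=0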
(NLSW, 1988 extract)
(4 missing values generated)
(4 missing values generated)
(368 missing values generated)
(368 missing values generated)
(MWFE estimator converged in 2 iterations)
IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity and clustering on race
Number of clusters (race) = 3 Number of obs = 1877
F( 2, 2) = 13010.62
Prob > F = 0.0001
Total (centered) SS = 31273.10174 Centered R2 = -3.9367
Total (uncentered) SS = 31273.10174 Uncentered R2 = -3.9367
Residual SS = 154386.4216 Root MSE = 9.089
------------------------------------------------------------------------------
| Robust
wage | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
hours1 | 1.107099 .1331773 8.31 0.014 .5340838 1.680115
hours2 | .7023322 .5914076 1.19 0.357 -1.84229 3.246954
south | 0 (omitted)
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic): 0.000
Chi-sq(1) P-val = 1.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic): 3.793
(Kleibergen-Paap rk Wald F statistic): 0.000
Stock-Yogo weak ID test critical values: 10% maximal IV size 7.03
15% maximal IV size 4.58
20% maximal IV size 3.95
25% maximal IV size 3.63
Source: Stock-Yogo (2005). Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Warning: estimated covariance matrix of moment conditions not of full rank.
overidentification statistic not reported, and standard errors and
model tests should be interpreted with caution.
Possible causes:
number of clusters insufficient to calculate robust covariance matrix
singleton dummy variable (dummy with one 1 and N-1 0s or vice versa)
partial option may address problem.
------------------------------------------------------------------------------
Collinearities detected among instruments: 1 instrument(s) dropped
Instrumented: hours1 hours2
Included instruments: south
Excluded instruments: union1 union2
Partialled-out: _cons
nb: total SS, model F and R2s are after partialling-out;
any small-sample adjustments include partialled-out
variables in regressor count K
------------------------------------------------------------------------------
Absorbed degrees of freedom:
-----------------------------------------------------+
Absorbed FE | Categories - Redundant = Num. Coefs |
-------------+---------------------------------------|
race1 | 4 0 4 |
race2 | 4 2 2 |
-----------------------------------------------------+
( 1) hours1 - hours2 = 0
F( 1, 2) = 0.31
Prob > F = 0.6325
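A possibly simpler variant of the same trick is to build one fixed effect per race-by-south cell and absorb that single variable. This is a sketch only: the variable name race_south is made up for illustration, and it assumes the user-written “ivreghdfe” and its dependencies are installed. With only three clusters, the same warnings about the covariance matrix should be expected.

```stata
* Sketch: absorb one fixed effect per race-by-south cell
sysuse nlsw88, clear
gen hours1 = hours*(south==1)
gen hours2 = hours*(south==0)
gen union1 = union*(south==1)
gen union2 = union*(south==0)

* race_south is a hypothetical name for the interacted fixed effect
egen race_south = group(race south)

* south is collinear with race_south, so it is left out of the regressors
ivreghdfe wage (hours1 hours2 = union1 union2), absorb(race_south) cluster(race)
test hours1 = hours2
```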
3.6 Chow test with different covariates
What if we want to compare coefficients across equations with different covariates? We can still do a Chow test.
Say you have
\[ Y_1 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon_1\]
\[ Y_2 = \gamma_0 + \gamma_1 X_1 + \epsilon_2\]
One way to think of the second equation is that it still contains \(X_2\), but there \(X_2\) is just a constant \(C\), so its term is absorbed into the constant term.
\[ Y_2 = \gamma_0 + \gamma_1 X_1 + \gamma_2 C + \epsilon_2\]
So we can simply replace \(X_2\) with a constant in the sample for \(Y_2\) and still do a Chow test.
sysuse nlsw88, clear
reg south wage hours tenure
est sto south
reg smsa wage hours
est sto smsa
suest south smsa
est sto suest
preserve
gen Y1=south
gen Y2=smsa
gen id=_n
reshape long Y, i(id) j(subsample)
gen wage1=wage*(subsample==1)
gen wage2=wage*(subsample==2)
gen hours1=hours*(subsample==1)
gen hours2=hours*(subsample==2)
* tenure is not in the smsa equation: set it to a constant there
replace tenure=1 if subsample==2
gen tenure1= tenure*(subsample==1)
gen tenure2= tenure*(subsample==2)
reg Y wage? hours? tenure? subsample
est sto chow
test _b[hours1]-_b[hours2]=0
* return to the original wide data
restore
esttab suest chow, nogaps mti
The Chow-type “stacking” method generates the same point estimates as “suest”. For the standard errors, you can use the “robust” or “cluster” option of the “reg” command.
3.7 Conclusion
For testing cross-equation hypotheses, we can use “suest” or the Chow-type “stacking” method. Sometimes people use “sureg”, but I prefer not to. It is a GLS estimator for all the equations together, so it relies on assumptions about the error terms of the whole system; for example, it assumes homoskedasticity. If the assumptions hold, GLS is more efficient; if not, the results can be misleading. If the equations have the same covariates, “sureg” returns the same coefficient estimates as the single-equation regressions. If the covariates differ, its estimates differ from the single-equation ones.
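The claim about identical covariates is easy to check. The sketch below is illustrative only and reuses the south and smsa outcomes from the earlier examples; the scalar names b_ols and b_sur are made up. It compares the OLS coefficient on wage with the one from “sureg”, then reruns “sureg” with tenure added to the first equation only.

```stata
* Sketch: sureg versus single-equation OLS
sysuse nlsw88, clear

* single-equation estimate
regress south wage hours
scalar b_ols = _b[wage]

* same covariates in both equations: sureg reproduces the OLS coefficients
sureg (south wage hours) (smsa wage hours)
scalar b_sur = [south]_b[wage]
display "OLS: " b_ols "   sureg: " b_sur

* different covariates: the estimates need no longer match OLS
* (the estimation sample also changes, because tenure has missing values)
sureg (south wage hours tenure) (smsa wage hours)
```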
The nice thing about the Chow test is that it is very flexible and does not rely on Stata’s internal functions, so you can do it in R or other programs. It usually works for complicated models too, as long as the single-equation model can be estimated. “suest” needs stored results from each model beforehand, and these may not be available for every command, even within Stata.
---title: "Chow test and more"date: "2022-04-06"---## Chow testComparing coefficients across regressions is common. Chow test is one of them. If you'd like to compare coefficients of regressions for two subsets, that's the original Chow test. The idea is to interact the subset indicator with all the covariates or only the covariate you are interested (treatment). If you only interact the dummy with the treatment variable, then you are assuming all other covariates have the same effect across the two subsets. This may or may not be reasonable.This post is inspired by Austin Nicholas (https://www.stata.com/statalist/archive/2009-11/msg01485.html). The case with overlapping samples is all from his code.Let's see a simple example:```{r}#| label: setup#| include: falselibrary(Statamarkdown)stataexe <-find_stata()#stataexe <- "/usr/local/bin/stata"knitr::opts_chunk$set(engine.path=list(stata=stataexe))``````{stata}*| label: stata1*| echo: true*| cache: true*| collectcode: trueest clearsysuse nlsw88, clearreg wage hours if southest sto southreg wage hours if !southest sto nonsouthsuest south nonsouthest sto suestgen hours1=hours*(south==1)gen hours2=hours*(south==0)reg wage south hours?est sto chowtest _b[hours1]-_b[hours2]=0esttab south nonsouth suest chow, nogaps mti```In the above example, we are interested in the effect of hours on wage for south and non-south subsets. The Chow test is to use the entire sample which has both south and nonsouth data. Then use the interaction of south indicator and hours to find the effect of hours for south and nonsouth. By including these two subsamples in the same regression, we can test the equality of the two coefficients. The other way to do this is to use Stata's "suest" command. This command basically take the two regressions and the variance covariance structure; then a test of the difference between two coefficients can be done. However, "suest" does not work for some commands. 
In my opinion, using interaction can be more flexible.## A comparison with two different outcomesWe can also use the same idea to compare the effect of some treatment on two different outcomes, if we have the same set of covariates. We just need to "stack" the two outcomes and run a pooled regression with some interactions.Here is an example.```{stata}*| label: stata2*| echo: true*| cache: true*| collectcode: trueest clearsysuse nlsw88, clearreg south wage hours est sto southreg smsa wage hours est sto smsasuest south smsaest sto suestpreservegen Y1=southgen Y2=smsagen id=_nreshape long Y, i(id) j(subsample)gen wage1=wage*(subsample==1)gen wage2=wage*(subsample==2)gen hours1=hours*(subsample==1)gen hours2=hours*(subsample==2)reg Y wage? hours? subsampletest _b[wage1]-_b[wage2]=0est sto stackedesttab south smsa suest stacked, nogaps mti```In this example, we are interested in comparing the effect of wage on south vs. smsa (not interesting, but just as an example). What I did is to reshape it to long format, stacking south and smsa as "Y". Then creat interaction of other covariates with subsample indicator. Then run the regression with Y on the interaction terms. ## Overlapping samplesWhat if we'd like to compare coefficients for two overlapping subsamples? As I mentioned, Austin Nichols gave the following example: ```{stata}*| label: stata3*| echo: true*| cache: true*| collectcode: trueest clearsysuse nlsw88, clearta south smsareg wage hours if southest sto southreg wage hours if smsaest sto smsasuest south smsaest sto suestpreserveexpand 2bys idcode: g n=_nkeep if (n==1&south)|(n==2&smsa)g hours1=hours*!(n==1&south)g hours2=hours*!(n==2&smsa)reg wage hours? 
n, cl(idcode)est sto stackedrestoreesttab south smsa suest stacked, nogaps mti```## IV regressionWhat about IV regression?```{stata}*| label: stata4*| echo: true*| cache: true*| collectcode: truesysuse nlsw88, clearivregress 2sls wage (hours=union) if southivregress 2sls wage (hours=union) if !southgen hours1=hours*(south==1)gen hours2=hours*(south==0)gen union1=union*(south==1)gen union2=union*(south==0)ivregress 2sls wage south (hours? = union?)```We can see same Chow kind of test works, with IV regression, if we have the right interaction terms.## IV with fixed effectsHowever, when doing with a fixed effet IV, I seem to have difficulties. In this example, I use "reghdfe" to do an IV regression with fixed effect. We can also use "xtivreg2", but "ivreghdfe" is supposed to be faster.```{stata}*| label: stata5*| echo: true*| cache: true*| collectcode: truesysuse nlsw88, cleargen hours1=hours*(south==1)gen hours2=hours*(south==0)gen union1=union*(south==1)gen union2=union*(south==0)ivreghdfe wage (hours=union) if south, a(race) cluster(race)ivreghdfe wage (hours=union) if !south, a(race) cluster(race)ivreghdfe wage south (hours? = union?) , a(race) cluster(race) ```We can see I failed to replicate the first two regressions in the third regression. Why? Because we'll need the fixed effect to be interacted with the subsample indicator to make it right.Here is another try:```{stata}*| label: stata6*| echo: true*| cache: true*| collectcode: truesysuse nlsw88, cleargen hours1=hours*(south==1)gen hours2=hours*(south==0)gen union1=union*(south==1)gen union2=union*(south==0)gen race1=race*(south==1)gen race2=race*(south==0)ivreghdfe wage (hours=union) if south, a(race) cluster(race) ivreghdfe wage (hours=union) if !south, a(race) cluster(race)ivregress 2sls wage south (hours? = union?) i.race?, cluster(race) ```This works. Basically we use dummies which are interactions of subsample indicator and the fixed effect dummies. 
This would not be practical if we have a lot of fixed effect units. But we can get "ivreghdfe" to do the same thing by absorbing the two interacted variables through its two-way fixed effect option. This way we can test whether the effect of hours differs between the two samples.

```{stata}
*| label: stata7
*| echo: true
*| cache: true
*| collectcode: true
sysuse nlsw88, clear
gen hours1=hours*(south==1)
gen hours2=hours*(south==0)
gen union1=union*(south==1)
gen union2=union*(south==0)
gen race1=race*(south==1)
gen race2=race*(south==0)
ivreghdfe wage south (hours? = union?) , a(race1 race2) cluster(race)
test _b[hours1]-_b[hours2]=0
```

## Chow test with different covariates

What if we want to compare coefficients across equations with different covariates? We can still do a Chow test.

Say you have

$$ Y_1 = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon_1$$

$$ Y_2 = \gamma_0 + \gamma_1 X_1 + \epsilon_2$$

One way to think of the second equation is that it still contains $X_2$, but $X_2$ is just a constant there, so it is absorbed into the constant term.

$$ Y_2 = \gamma_0 + \gamma_1 X_1 + \gamma_2 C + \epsilon_2$$

So we can replace $X_2$ with a constant in the sample for $Y_2$ and still do a Chow test.

```{stata}
*| label: stata8
*| echo: true
*| cache: true
*| collectcode: true
sysuse nlsw88, clear
reg south wage hours tenure
est sto south
reg smsa wage hours
est sto smsa
suest south smsa
est sto suest
preserve
gen Y1=south
gen Y2=smsa
gen id=_n
reshape long Y, i(id) j(subsample)
gen wage1=wage*(subsample==1)
gen wage2=wage*(subsample==2)
gen hours1=hours*(subsample==1)
gen hours2=hours*(subsample==2)
replace tenure=1 if subsample==2
gen tenure1= tenure*(subsample==1)
gen tenure2= tenure*(subsample==2)
reg Y wage? hours? tenure? subsample
est sto chow
test _b[hours1]-_b[hours2]=0
esttab suest chow, nogaps mti
```

We can see that the Chow-type "stacking" method gives the same point estimates as "suest".
For standard errors, you can use the "robust" or "cluster" option of the "reg" command.

## Conclusion

For testing cross-equation hypotheses, we can use "suest" or the Chow-type "stacking" method. Sometimes people use "sureg". I prefer not to use "sureg". It is a GLS estimator for all the equations together, so it relies on assumptions about the error terms of the whole system. It assumes homoscedasticity, for example. If that holds, GLS is more efficient; if not, its standard errors can be misleading. If the equations have the same covariates, it returns the same coefficient estimates as the single-equation regressions. If the covariates differ, its estimates differ from the single-equation ones.

The nice thing about the Chow test is that it is very flexible and does not rely on Stata's internal functions. You can do it in R or other programs. It usually works for complicated models too, as long as the single-equation model works. "suest" needs stored estimation results, which may not be available for every command, even within Stata.
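
To make the remark about standard errors concrete, here is a minimal sketch of the stacked two-outcome regression with standard errors clustered on the person identifier. It reuses the variable names from the earlier example (Y, id, subsample, wage1, and so on). Clustering on id is my assumption about the natural choice here, since each woman contributes two rows after the reshape. I would expect the resulting standard errors to be close to the "suest" ones rather than identical, because the two commands may apply different small-sample adjustments.

```{stata}
*| echo: true
* Sketch only: stacked regression with cluster-robust standard errors.
sysuse nlsw88, clear
gen Y1=south
gen Y2=smsa
gen id=_n
* after the reshape, each id appears twice: once per outcome
reshape long Y, i(id) j(subsample)
gen wage1=wage*(subsample==1)
gen wage2=wage*(subsample==2)
gen hours1=hours*(subsample==1)
gen hours2=hours*(subsample==2)
* cluster on id so the two rows of the same person may be correlated
reg Y wage? hours? subsample, vce(cluster id)
test _b[wage1]-_b[wage2]=0
```

The point estimates are unchanged by the vce() option; only the standard errors and the test statistic are affected.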