7.1 Panel data
When we have panel data (repeated observations over time, or observations clustered at a higher level), we usually think of two choices: random effects or fixed effects. Economists usually prefer fixed effect models, since they wipe out all time-invariant unit-level heterogeneity. Economists do not like random effect models because they rest on a strong assumption: the random effects need to be uncorrelated with the other covariates in the model. To see this, suppose we have
\[ y_{it} = \beta_0 + \beta_1 x_{it} + c_i + \epsilon_{it} \]
with individuals \(i=1, ... , n\) measured at times \(t=1, ..., T\). Here \(c_i\) is the unobserved time-invariant individual effect. The difference between fixed and random effects is in how they handle \(c_i\).
Fixed effect models for a linear model can be implemented in one of two ways: include a dummy for each individual, or run OLS on de-meaned \(y\) and \(x\). The two methods give the same coefficient estimates. In non-linear models things are more difficult: except for the Poisson model, non-linear models with individual dummies suffer from the “incidental parameter” problem. The gold standard is a conditional likelihood (conditional logit, for example), which “absorbs” the fixed effects in the likelihood function, so it is not necessary to estimate them. Unfortunately, most non-linear models do not have such a nice conditional likelihood. In that case we can only hope the bias is small (it does get smaller with a deeper panel, that is, with more observations per individual).
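The equivalence of the two linear implementations is easy to check on simulated data. The sketch below is my own illustration, not part of the original analysis: the data-generating values and variable names are made up, and only base R is used.

```r
# Simulated balanced panel: 50 individuals, 6 periods (all values are made up)
set.seed(1)
n   <- 50
n_t <- 6
id  <- rep(seq_len(n), each = n_t)
c_i <- rnorm(n)[id]                   # unobserved individual effect
x   <- 0.8 * c_i + rnorm(n * n_t)     # x is correlated with c_i
y   <- 1 + 0.5 * x + c_i + rnorm(n * n_t)

# Method 1: OLS with one dummy per individual
fit_dummy <- lm(y ~ x + factor(id))

# Method 2: OLS on individual-demeaned y and x (no intercept needed)
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)
fit_within <- lm(y_dm ~ 0 + x_dm)

b_dummy  <- unname(coef(fit_dummy)["x"])
b_within <- unname(coef(fit_within)["x_dm"])
print(c(dummy = b_dummy, within = b_within))
```

The point estimates agree. The naive standard errors from the de-meaned regression do not, because that regression does not count the n estimated individual means in its degrees of freedom.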
Random effect models treat \(c_i\) as part of the error term. This brings the biggest drawback: the covariates have to be uncorrelated with the error term for the estimator to be consistent. In the equation above, \(x\) therefore has to be uncorrelated with \(c_i\), which economists in general do not consider realistic.
7.2 Time-invariant variables
Sometimes people are interested in the effect of time-invariant variables \(z_i\), which gives the model
\[ y_{it} = \beta_0 + \beta_1 x_{it} + c_i + \gamma z_i + \epsilon_{it} \]
Fixed effect models cannot handle this: \(\gamma\) is not identified because \(z_i\) is perfectly collinear with \(c_i\). A random effect model can still be estimated, treating \(z_i\) simply as another covariate.
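The collinearity can be seen directly in a small simulation. This is a hedged sketch with made-up data and names (base R only): once individual dummies are in the model, `lm` cannot estimate the coefficient on a time-invariant covariate, and de-meaning that covariate leaves nothing but zeros.

```r
# Simulated panel with a time-invariant covariate z (all values are made up)
set.seed(2)
n   <- 40
n_t <- 5
id  <- rep(seq_len(n), each = n_t)
z   <- rbinom(n, 1, 0.5)[id]          # constant within each individual
x   <- rnorm(n * n_t)
y   <- 1 + 0.5 * x + 0.3 * z + rnorm(n)[id] + rnorm(n * n_t)

# z enters after the dummies, so lm() drops it as the aliased column
fit <- lm(y ~ x + factor(id) + z)
gamma_hat <- unname(coef(fit)["z"])
print(gamma_hat)                      # NA: gamma is not identified

# De-meaning removes z entirely
z_dm <- z - ave(z, id)
print(range(z_dm))
```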
7.3 Between-within model
We are usually told to run a Hausman test to decide between a fixed effect and a random effect model. The basic idea is that the random effect estimator is more efficient if its assumptions are satisfied; if they are not, the fixed effect estimator is still consistent. The Hausman test compares the two sets of estimates. If the difference is small, stick with random effects. If it is big, fixed effects should be preferred since that estimator remains consistent.
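For readers who want to see the test in practice, here is a minimal sketch. It assumes the “plm” package and its bundled Grunfeld data, neither of which is used elsewhere in this post; I chose them only because they make a short, self-contained example.

```r
library(plm)

# Grunfeld investment data shipped with plm (not the data used in this post)
data("Grunfeld", package = "plm")

fe <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "within")
re <- plm(inv ~ value + capital, data = Grunfeld,
          index = c("firm", "year"), model = "random")

# Hausman test: compares the fixed effect and random effect estimates
ht <- phtest(fe, re)
print(ht)
```

A small p-value points to a systematic difference between the two sets of estimates, which is evidence against the random effect assumption.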
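However, there is a between-within (BW) model that can incorporate both. Neuhaus and Kalbfleisch (1998) (https://www.ncbi.nlm.nih.gov/pubmed/9629647) introduced the BW estimator,
\[ y_{it} = \beta_0 + \beta_1 (x_{it} - \bar x_i) + \beta_2 \bar x_i + c_i + \gamma z_i + \epsilon_{it} \]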
It can be shown that \(\beta_1\) is the same as the one in the fixed effect model. It is the effect of the within-individual deviation of \(x\) on the within-individual deviation of \(y\). \(\beta_2\) is the effect of the mean of \(x\) on the mean of \(y\), that is, the “between” effect. \(\gamma\) is the effect of the time-invariant variable on the mean of \(y\).
The other specification of the BW estimator is
\[ y_{it} = \beta_0 + \beta_1 x_{it} + \beta_2 \bar x_i + c_i + \gamma z_i + \epsilon_{it} \]
This is just a transformation of the original specification; it is the same model. \(\beta_1\) is exactly the same as before, while \(\beta_2\) becomes the difference between the “between” and “within” effects (between minus within). This is called the “contextual model”, and \(\beta_2\) is the “contextual” effect. See Bell and Jones (2015) (https://doi.org/10.1017/psrm.2014.7). In this specification, a test of \(\beta_2 = 0\) is actually similar to a Hausman test: it shows whether the “between” and “within” effects differ.
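To see why the two specifications are the same model, expand the within term of the first one. The tilde on the second coefficient is my own notation, used only to keep the two versions of \(\beta_2\) apart:

```latex
\begin{aligned}
y_{it} &= \beta_0 + \beta_1 (x_{it} - \bar x_i) + \beta_2 \bar x_i + c_i + \gamma z_i + \epsilon_{it} \\
       &= \beta_0 + \beta_1 x_{it} + (\beta_2 - \beta_1) \bar x_i + c_i + \gamma z_i + \epsilon_{it} \\
       &= \beta_0 + \beta_1 x_{it} + \tilde\beta_2 \bar x_i + c_i + \gamma z_i + \epsilon_{it},
\qquad \tilde\beta_2 = \beta_2 - \beta_1 .
\end{aligned}
```

So the coefficient on \(\bar x_i\) in the contextual specification is the between effect minus the within effect, and it is zero exactly when the two coincide.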
One advantage of the BW model is that it delivers the fixed effect estimates within a random effect estimation, so including time-invariant covariates becomes possible. A second advantage is that it extends to more complicated models, such as cross-level interactions, random slopes, or other multi-level models.
The actual implementation of the simplest form of BW is easy: simply estimate either of the two equations above as a random effect model.
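As a rough illustration of that recipe, the sketch below builds the mean and deviation variables by hand and fits both specifications as random-intercept models. It assumes the “lme4” package, and the simulated data and variable names are made up for the example.

```r
library(lme4)

# Simulated balanced panel where x is correlated with c_i (values are made up)
set.seed(3)
n   <- 100
n_t <- 6
id  <- rep(seq_len(n), each = n_t)
c_i <- rnorm(n)[id]
z   <- rbinom(n, 1, 0.5)[id]          # time-invariant covariate
x   <- 0.8 * c_i + rnorm(n * n_t)
y   <- 1 + 0.5 * x + 0.3 * z + c_i + rnorm(n * n_t)

d <- data.frame(id = factor(id), y = y, x = x, z = z)
d$x_mean <- ave(d$x, d$id)            # individual mean of x
d$x_dev  <- d$x - d$x_mean            # within-individual deviation
d$y_dev  <- d$y - ave(d$y, d$id)

# Between-within specification and contextual specification
bw  <- lmer(y ~ x_dev + x_mean + z + (1 | id), data = d)
ctx <- lmer(y ~ x + x_mean + z + (1 | id), data = d)

# Plain fixed effect (within) estimator for comparison
fe <- lm(y_dev ~ 0 + x_dev, data = d)

b_within_bw <- unname(fixef(bw)["x_dev"])
b_within_fe <- unname(coef(fe)["x_dev"])
b_between   <- unname(fixef(bw)["x_mean"])
b_context   <- unname(fixef(ctx)["x_mean"])
print(c(bw = b_within_bw, fe = b_within_fe,
        between = b_between, contextual = b_context))
```

The coefficient on `x_dev` matches the fixed effect estimate, and the contextual coefficient equals the between coefficient minus the within coefficient, while `z` stays in the model.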
7.4 BW model in R
R has a package “panelr”(https://panelr.jacob-long.com/articles/wbm.html) that implements various kinds of BW models. Let’s see an example.
library(panelr)
data("WageData")
wages <- panel_data(WageData, id = id, wave = t)
model1 <- wbm(lwage ~ wks + union + ms + occ | blk + fem, data = wages)
summary(model1)
MODEL INFO:
Entities: 595
Time periods: 1-7
Dependent variable: lwage
Model type: Linear mixed effects
Specification: within-between
MODEL FIT:
AIC = 2036.78, BIC = 2119.13
Pseudo-R² (fixed effects) = 0.27
Pseudo-R² (total) = 0.69
Entity ICC = 0.57
WITHIN EFFECTS:
------------------------------------------
Est. S.E. t val. p
----------- ------- ------ -------- ------
wks 0.00 0.00 1.06 0.29
union 0.06 0.03 2.53 0.01
ms -0.08 0.03 -2.57 0.01
occ -0.08 0.02 -3.32 0.00
------------------------------------------
BETWEEN EFFECTS:
-------------------------------------------------
Est. S.E. t val. p
------------------ ------- ------ -------- ------
(Intercept) 6.30 0.20 30.85 0.00
imean(wks) 0.01 0.00 2.25 0.02
imean(union) 0.15 0.03 4.67 0.00
imean(ms) 0.17 0.05 3.07 0.00
imean(occ) -0.41 0.03 -13.31 0.00
blk -0.15 0.05 -2.81 0.00
fem -0.32 0.06 -4.96 0.00
-------------------------------------------------
p values calculated using df = 4153
RANDOM EFFECTS:
------------------------------------
Group Parameter Std. Dev.
---------- ------------- -----------
id (Intercept) 0.2992
Residual 0.2589
------------------------------------
Let’s compare this with another popular package “lfe”.
library(lfe)
model2 <- felm(lwage ~ wks + union + ms + occ | id, data = wages)
summary(model2)
Call:
felm(formula = lwage ~ wks + union + ms + occ | id, data = wages)
Residuals:
Min 1Q Median 3Q Max
-1.89500 -0.16174 0.00652 0.17060 1.94521
Coefficients:
Estimate Std. Error t value Pr(>|t|)
wks 0.001083 0.001019 1.063 0.287816
union 0.064320 0.025378 2.534 0.011305 *
ms -0.082905 0.032226 -2.573 0.010132 *
occ -0.077507 0.023359 -3.318 0.000916 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2589 on 3566 degrees of freedom
Multiple R-squared(full model): 0.7304 Adjusted R-squared: 0.6852
Multiple R-squared(proj model): 0.006509 Adjusted R-squared: -0.1601
F-statistic(full model):16.16 on 598 and 3566 DF, p-value: < 2.2e-16
F-statistic(proj model): 5.841 on 4 and 3566 DF, p-value: 0.0001106
We can see that the two give the same fixed effect estimates. “panelr” in addition estimates the effects of “blk” and “fem”, which are time-invariant. But “lfe” has an advantage: it allows you to estimate fixed effect models with clustered standard errors, which I wish “panelr” could do too.
model3 <- felm(lwage ~ wks + union + ms + occ | id | 0 | id, data = wages)
summary(model3)
Call:
felm(formula = lwage ~ wks + union + ms + occ | id | 0 | id, data = wages)
Residuals:
Min 1Q Median 3Q Max
-1.89500 -0.16174 0.00652 0.17060 1.94521
Coefficients:
Estimate Cluster s.e. t value Pr(>|t|)
wks 0.001083 0.001331 0.814 0.4160
union 0.064320 0.040936 1.571 0.1167
ms -0.082905 0.047399 -1.749 0.0808 .
occ -0.077507 0.031320 -2.475 0.0136 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2589 on 3566 degrees of freedom
Multiple R-squared(full model): 0.7304 Adjusted R-squared: 0.6852
Multiple R-squared(proj model): 0.006509 Adjusted R-squared: -0.1601
F-statistic(full model, *iid*):16.16 on 598 and 3566 DF, p-value: < 2.2e-16
F-statistic(proj model): 3.456 on 4 and 594 DF, p-value: 0.008358
7.5 BW model in Stata
Stata has no built-in command for the BW estimator, but we can do it with “xtreg”. First, the fixed effect model:
webuse nlswork
xtset idcode
xtreg ln_w age, fe cluster(idcode)
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)
Panel variable: idcode (unbalanced)
Fixed-effects (within) regression Number of obs = 28,510
Group variable: idcode Number of groups = 4,710
R-squared: Obs per group:
Within = 0.1026 min = 1
Between = 0.0877 avg = 6.1
Overall = 0.0774 max = 15
F(1, 4709) = 884.05
corr(u_i, Xb) = 0.0314 Prob > F = 0.0000
(Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
| Robust
ln_wage | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
age | .0181349 .0006099 29.73 0.000 .0169392 .0193306
_cons | 1.148214 .0177153 64.81 0.000 1.113483 1.182944
-------------+----------------------------------------------------------------
sigma_u | .40635023
sigma_e | .30349389
rho | .64192015 (fraction of variance due to u_i)
------------------------------------------------------------------------------
We then generate the mean of age and run a BW estimation.
webuse nlswork
xtset idcode
* center is a user-written command (ssc install center)
bysort idcode: center age, prefix(d) mean(m)
xtreg ln_w age mage i.race, re cluster(idcode)
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)
Panel variable: idcode (unbalanced)
(generated variables: dage mage)
Random-effects GLS regression Number of obs = 28,510
Group variable: idcode Number of groups = 4,710
R-squared: Obs per group:
Within = 0.1026 min = 1
Between = 0.1040 avg = 6.1
Overall = 0.0950 max = 15
Wald chi2(4) = 1335.89
corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000
(Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
| Robust
ln_wage | Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
age | .0181349 .00061 29.73 0.000 .0169394 .0193304
mage | .0044231 .0012736 3.47 0.001 .001927 .0069192
|
race |
Black | -.1190245 .0127419 -9.34 0.000 -.1439981 -.094051
Other | .0974999 .0617365 1.58 0.114 -.0235014 .2185012
|
_cons | 1.037566 .0323185 32.10 0.000 .9742232 1.100909
-------------+----------------------------------------------------------------
sigma_u | .36581626
sigma_e | .30349389
rho | .59231394 (fraction of variance due to u_i)
------------------------------------------------------------------------------
In this BW model, the coefficient on age is the fixed effect estimate, .0181. The coefficient on mage (.0044) is the “contextual effect” of age, that is, the additional effect of the individual mean of age on logged wage beyond the within effect. The between effect is therefore .0044+.0181=.0225 (.0226 with the unrounded coefficients). We also get estimates for the time-invariant covariate race. The advantage of using xtreg is that clustered standard errors are implemented.
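The same arithmetic with the unrounded coefficients, as a small R sketch. The numbers are copied from the xtreg output above and the variable names are mine.

```r
# Coefficients copied from the random effect xtreg output above
b_within  <- 0.0181349   # coefficient on age (within effect)
b_context <- 0.0044231   # coefficient on mage (contextual effect)

# Between effect = within effect + contextual effect
b_between <- b_within + b_context
print(round(b_between, 4))
```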
7.6 BW model in non-linear models
Paul Allison in his blog (https://statisticalhorizons.com/between-within-contextual-effects) mentioned using the BW model for a binary outcome. I have not dug into the literature to see how large the bias of the BW model can be compared to, say, a conditional logit model. But if OLS is a good linear approximation of a logit model, the BW model could be a good approximation for a binary outcome with panel data.
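For concreteness, a BW logit could be set up along the following lines. This is only a sketch under my own assumptions: it uses `glmer` from “lme4” on simulated data, it is not taken from Allison's post, and it says nothing about how large the remaining bias is.

```r
library(lme4)

# Simulated binary panel where x is correlated with c_i (values are made up)
set.seed(4)
n   <- 300
n_t <- 5
id  <- rep(seq_len(n), each = n_t)
c_i <- rnorm(n)[id]
x   <- 0.8 * c_i + rnorm(n * n_t)
y   <- rbinom(n * n_t, 1, plogis(-0.5 + 0.7 * x + c_i))

d <- data.frame(id = factor(id), y = y, x = x)
d$x_mean <- ave(d$x, d$id)            # individual mean of x
d$x_dev  <- d$x - d$x_mean            # within-individual deviation

# Random-intercept logit with within and between terms
bw_logit <- glmer(y ~ x_dev + x_mean + (1 | id), data = d, family = binomial)
print(fixef(bw_logit))
```

Unlike the linear case, the coefficient on `x_dev` here is not guaranteed to equal the conditional logit estimate, which is exactly the open question about bias raised above.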
---
title: "Fixed or Random Effect, or Both?"
date: "2019-05-23"
---