9  Panel Data Models

9.1 Background

Panel data are repeated observations on the same units (called cross-sectional units) over time. Panel data models are used because they have advantages over both cross-sectional and time-series models. Their major advantage over cross-sectional models is that they control for individual heterogeneity.

For example, Baltagi and Levin (1992) study cigarette demand across 46 states for the years 1963-88. Consumption is modeled as a function of lagged consumption, price, and income. These variables vary across both states and time. However, other relevant variables may be time-invariant (\(z_i\)) or state-invariant (\(w_t\)). Examples of \(z_i\) are religion and education. For example, Utah, as a predominantly Mormon state, has very low cigarette demand for religious reasons, and this is generally considered to change little, if at all, over time. Examples of \(w_t\) include cigarette commercials on national TV or radio, which are the same for every state in a given year. Panel data models can control for individual (cross-sectional) heterogeneity, whereas cross-sectional models cannot. In many (or most) social science studies, there are “unobserved effects” embedded in the error term; in other words, we have an “omitted variable” problem. If these effects are correlated with the regressors and are not controlled for, the estimates are generally biased and inconsistent.

Panel data are also more informative, with more variability and less collinearity among variables. Time-series data often suffer from multicollinearity, while panel data gain additional variability from the cross-sectional units.

The basic unobserved effects model can be written as:

$$ y_{it}={\bf x_{it} \beta} + c_i + u_{it} $$

where \(i\) indexes “units” (people, firms, households, etc.) and \(t\) indexes time periods.

Compared with regular cross-sectional models, there is an extra term \(c_i\), which can be treated as a random effect or a fixed effect. Traditionally, the distinction is whether \(c_i\) is treated as a random variable or as a parameter to be estimated. Modern econometrics tends to distinguish the two by the correlation between \(c_i\) and \({\bf {x_{it}}}\). If there is no correlation between \(c_i\) and \({\bf {x_{it}}}\), it is a random effect; otherwise, it is a fixed effect.
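To make the role of this correlation concrete, here is a minimal simulation sketch using only the Python standard library. It is an illustration under stated assumptions, not part of the original analysis: there is a single regressor, all shocks are standard normal, and the hypothetical parameter `rho` controls how strongly \(x_{it}\) loads on \(c_i\). The function names are made up for this example.

```python
import random

def simulate_panel(n_units=200, n_periods=5, beta=1.0, rho=0.8, seed=0):
    """Return rows (i, t, x_it, y_it) drawn from y_it = beta*x_it + c_i + u_it.

    rho is an illustrative knob: x_it = rho*c_i + noise, so rho = 0 means
    c_i is uncorrelated with x_it (the random effect case).
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_units):
        c_i = rng.gauss(0.0, 1.0)            # unobserved, time-invariant effect
        for t in range(n_periods):
            x_it = rho * c_i + rng.gauss(0.0, 1.0)
            u_it = rng.gauss(0.0, 1.0)       # idiosyncratic error
            rows.append((i, t, x_it, beta * x_it + c_i + u_it))
    return rows

def pooled_ols_slope(rows):
    """Slope from regressing y on x with an intercept, ignoring the panel structure."""
    xs = [r[2] for r in rows]
    ys = [r[3] for r in rows]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# c_i is left in the error term in both regressions; only rho differs.
slope_correlated = pooled_ols_slope(simulate_panel(rho=0.8))
slope_uncorrelated = pooled_ols_slope(simulate_panel(rho=0.0))
print("pooled OLS slope, c_i correlated with x:  ", round(slope_correlated, 3))
print("pooled OLS slope, c_i uncorrelated with x:", round(slope_uncorrelated, 3))
```

Under these simulated assumptions, the pooled OLS slope should land near the true value of 1 when `rho = 0`. When `rho = 0.8`, it should drift toward \(1 + \rho/(\rho^2+1) \approx 1.49\), which is the omitted-variable bias from leaving \(c_i\) in the error term.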

In some cases it is hard to justify the assumption that \(c_i\) and \({\bf {x_{it}}}\) are uncorrelated, which is one reason economists tend to use fixed-effect models. However, fixed-effect models have their own limitations. One of them is that it is hard to justify that ALL of the cross-sectional units’ characteristics other than those already in the model do not change over time.

9.2 Random Effect Methods

A random effect model puts \(c_i\) into the error term and then estimates the model by FGLS (feasible GLS). The random effect model is sometimes called the “error-components model” because the overall error term is split into an individual-level error term (\(c_i\)) and an individual-time-level error term (\(u_{it}\)).

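Under a random effect model, with \(\sigma_c^2\) and \(\sigma_u^2\) denoting the variances of \(c_i\) and \(u_{it}\), the variance-covariance matrix of the composite error term \(c_i + u_{it}\) for each unit becomes: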

\[ \bf \Omega= \begin{bmatrix} \sigma_{c}^2 + \sigma_u^2 & \sigma_{c}^2 & \cdots & \sigma_{c}^2 \\ \sigma_{c}^2 & \sigma_{c}^2 + \sigma_u^2 & \cdots & \sigma_{c}^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{c}^2 & \sigma_{c}^2 & \cdots & \sigma_{c}^2 + \sigma_u^2 \end{bmatrix} \]
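As a small illustration of this structure, the sketch below builds \(\bf \Omega\) for a single unit observed over \(T\) periods. It assumes the usual error-components setup (\(c_i\) and \(u_{it}\) uncorrelated, \(u_{it}\) homoskedastic and serially uncorrelated); the function name and the numeric values are purely illustrative.

```python
def re_error_covariance(n_periods, sigma_c2, sigma_u2):
    """T x T covariance matrix of the composite error c_i + u_it for one unit.

    Every entry contains sigma_c2 (the shared c_i); the diagonal adds sigma_u2.
    """
    return [
        [sigma_c2 + (sigma_u2 if s == t else 0.0) for s in range(n_periods)]
        for t in range(n_periods)
    ]

# Illustrative values only: T = 3, sigma_c^2 = 0.5, sigma_u^2 = 1.0
omega = re_error_covariance(n_periods=3, sigma_c2=0.5, sigma_u2=1.0)
for row in omega:
    print(row)
# [1.5, 0.5, 0.5]
# [0.5, 1.5, 0.5]
# [0.5, 0.5, 1.5]
```

The constant off-diagonal entries mean that any two errors from the same unit have the same correlation, \(\sigma_c^2/(\sigma_c^2+\sigma_u^2)\), no matter how far apart in time they are.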

The random effect estimator is:

\[ \bf \hat \beta_{RE} = (\sum_{i=1}^N X_i' \hat \Omega^{-1} X_i)^{-1} (\sum_{i=1}^N X_i' \hat \Omega^{-1} y_i)\]

It is a special case of GLS. Before calculating \(\hat \beta_{RE}\), \(\sigma_c^2\) and \(\sigma_u^2\) need to be estimated. This is generally done using the residuals from pooled OLS estimation.
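The two-step procedure can be sketched as follows, using only the Python standard library. This is an illustration under assumptions that are not in the text: a simulated balanced panel with an intercept and one regressor, \(c_i\) independent of \(x_{it}\), and variance components estimated from pooled OLS residuals in the usual textbook way (overall residual variance, and the average cross-product of residuals within a unit). Instead of inverting \(\hat{\bf \Omega}\) numerically, it uses the closed-form inverse of \(\sigma_u^2 I + \sigma_c^2 J\), where \(J\) is a matrix of ones. All function names are hypothetical.

```python
import random

def simulate_re_panel(n_units=300, n_periods=4, beta0=2.0, beta1=1.0, seed=1):
    """Simulate y_it = beta0 + beta1*x_it + c_i + u_it with c_i independent of x_it."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        c_i = rng.gauss(0.0, 1.0)                              # true sigma_c^2 = 1
        xs = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
        ys = [beta0 + beta1 * x + c_i + rng.gauss(0.0, 1.0) for x in xs]  # sigma_u^2 = 1
        panel.append((xs, ys))
    return panel

def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ coef = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def gls(panel, weight):
    """(sum_i X_i' W X_i)^{-1} (sum_i X_i' W y_i), with X_i = [1, x_i]."""
    xwx = [[0.0, 0.0], [0.0, 0.0]]
    xwy = [0.0, 0.0]
    for xs, ys in panel:
        rows = [(1.0, x) for x in xs]
        for t in range(len(xs)):
            for s in range(len(xs)):
                w = weight[t][s]
                for j in range(2):
                    xwy[j] += rows[t][j] * w * ys[s]
                    for k in range(2):
                        xwx[j][k] += rows[t][j] * w * rows[s][k]
    return solve_2x2(xwx, xwy)

def variance_components(panel, coef, n_params=2):
    """Estimate (sigma_c^2, sigma_u^2) from residuals v_it = y_it - b0 - b1*x_it."""
    n, T = len(panel), len(panel[0][0])
    resid = [[y - coef[0] - coef[1] * x for x, y in zip(xs, ys)] for xs, ys in panel]
    sigma_v2 = sum(v * v for r in resid for v in r) / (n * T - n_params)
    cross = sum(r[t] * r[s] for r in resid for t in range(T) for s in range(t + 1, T))
    sigma_c2 = cross / (n * T * (T - 1) / 2 - n_params)   # can be negative in small samples
    return sigma_c2, sigma_v2 - sigma_c2

def omega_inverse(T, sigma_c2, sigma_u2):
    """Closed-form inverse of sigma_u2*I + sigma_c2*J."""
    shrink = sigma_c2 / (sigma_u2 + T * sigma_c2)
    return [[((1.0 if s == t else 0.0) - shrink) / sigma_u2 for s in range(T)]
            for t in range(T)]

panel = simulate_re_panel()
T = len(panel[0][0])
identity = [[1.0 if s == t else 0.0 for s in range(T)] for t in range(T)]

beta_pols = gls(panel, identity)                           # step 1: pooled OLS
sigma_c2_hat, sigma_u2_hat = variance_components(panel, beta_pols)
beta_re = gls(panel, omega_inverse(T, sigma_c2_hat, sigma_u2_hat))  # step 2: FGLS

print("pooled OLS [intercept, slope]:", beta_pols)
print("estimated sigma_c^2, sigma_u^2:", sigma_c2_hat, sigma_u2_hat)
print("random effects [intercept, slope]:", beta_re)
```

Passing the identity matrix as the weight reproduces pooled OLS, which makes explicit that the random effect estimator differs from pooled OLS only through the weighting by \(\hat{\bf \Omega}^{-1}\).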

9.3 Fixed Effect Methods

Random effect methods assume that \(c_i\) is orthogonal to \(\bf x_{it}\). In many applications, the whole point of using panel data is to allow \(c_i\) to be correlated with \(\bf x_{it}\). In those cases we need fixed-effect models.

For the same unobserved effects model as above, fixed-effect methods eliminate \(c_i\) using the fixed effects transformation, or “within transformation”. The FE transformation “de-means” each observation by subtracting the group mean; that is, it subtracts the following group-mean equation from the original model equation:

\[ \bar y_i={\bf \bar x_i \beta} + c_i + \bar u_i \]

where \(\bar y_i\), etc., denote group means. For example, if we have 50 states with state-level data on cigarette consumption across years, then the groups are states, and we have 50 group means, each being a state's average cigarette consumption across years. Because \(c_i\) does not vary over time, its group mean is \(c_i\) itself.
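A minimal sketch of the de-meaning step, using only the Python standard library, is given below. It rests on illustrative assumptions: a simulated balanced panel with one regressor, and a \(c_i\) that is deliberately correlated with \(x_{it}\) (the loading of 0.8 is arbitrary). The function names are hypothetical. After de-meaning within each group, OLS without an intercept on the transformed data gives the within estimate of the slope.

```python
import random

def simulate_fe_panel(n_units=200, n_periods=5, beta=1.0, seed=2):
    """Simulate y_it = beta*x_it + c_i + u_it with x_it correlated with c_i."""
    rng = random.Random(seed)
    panel = {}
    for i in range(n_units):
        c_i = rng.gauss(0.0, 1.0)
        xs = [0.8 * c_i + rng.gauss(0.0, 1.0) for _ in range(n_periods)]
        ys = [beta * x + c_i + rng.gauss(0.0, 1.0) for x in xs]
        panel[i] = (xs, ys)
    return panel

def demean(values):
    """Subtract the group mean from every observation in one group."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def within_slope(panel):
    """OLS slope (no intercept) on the group-demeaned data."""
    sxy = 0.0
    sxx = 0.0
    for xs, ys in panel.values():
        x_dm = demean(xs)
        y_dm = demean(ys)        # c_i is constant within the group, so it drops out here
        sxy += sum(x * y for x, y in zip(x_dm, y_dm))
        sxx += sum(x * x for x in x_dm)
    return sxy / sxx

panel = simulate_fe_panel()
beta_fe = within_slope(panel)
print("within (fixed-effect) slope estimate:", round(beta_fe, 3))
```

In this simulated setting the within estimate should be close to the true slope of 1 even though \(c_i\) is correlated with the regressor, because only the variation around each group's own mean is used.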

The reason for “demeaning” is to remove \(c_i\) from the final estimation. The “demeaned” regression equation is the one actually used in estimation: