More models
We cover a couple of additional models.
Threats to identification in the linear model
We want to think about which estimand a linear model recovers in a couple of interesting cases.
Omitted variable bias
We consider the following model:

$$ Y_i = X'_i \beta + Z'_i \gamma + \epsilon_i $$

together with $E[\epsilon_i \mid X_i, Z_i] = 0$, and we ask: what does the regression coefficient of $Y_i$ on $X_i$ alone recover?
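As a quick numerical check (a sketch with made-up coefficients, not part of the formal argument), the short regression of $Y_i$ on $X_i$ alone recovers $\beta + \gamma \,\mathrm{Cov}(X_i, Z_i)/\mathrm{Var}(X_i)$ rather than $\beta$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True model: Y_i = beta * X_i + gamma * Z_i + eps_i, with X and Z correlated.
beta, gamma = 2.0, 1.5
Z = rng.normal(size=n)
X = 0.8 * Z + rng.normal(size=n)       # X is correlated with the omitted Z
eps = rng.normal(size=n)
Y = beta * X + gamma * Z + eps

# Short regression of Y on X alone: slope = Cov(X, Y) / Var(X).
b_short = np.cov(X, Y)[0, 1] / np.var(X)

# Omitted variable bias formula: beta + gamma * Cov(X, Z) / Var(X).
b_pred = beta + gamma * np.cov(X, Z)[0, 1] / np.var(X)
print(b_short, b_pred)  # both close to each other, well away from beta = 2.0
```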
Measurement error bias
We consider the following model:

$$ Y_i = X^{*\prime}_i \beta + \epsilon_i $$

where we only observe $X_i = X^*_i + u_i$, together with $E[\epsilon_i \mid X^*_i, u_i] = 0$ and $E[u_i \mid X^*_i] = 0$, and we ask: what does the regression coefficient of $Y_i$ on $X_i$ alone recover?
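A short simulation (a sketch with illustrative variances, not part of the formal argument) shows the classical result: OLS on the noisy regressor is attenuated toward zero by the reliability ratio $\mathrm{Var}(X^*_i)/(\mathrm{Var}(X^*_i) + \mathrm{Var}(u_i))$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# True model: Y_i = beta * X*_i + eps_i, but we only observe X_i = X*_i + u_i.
beta = 2.0
sigma2_x, sigma2_u = 1.0, 0.5
X_star = rng.normal(scale=np.sqrt(sigma2_x), size=n)
u = rng.normal(scale=np.sqrt(sigma2_u), size=n)   # classical measurement error
eps = rng.normal(size=n)
Y = beta * X_star + eps
X = X_star + u

# OLS of Y on the noisy X: slope = Cov(X, Y) / Var(X).
b_ols = np.cov(X, Y)[0, 1] / np.var(X)

# Attenuation: beta times the reliability ratio Var(X*) / (Var(X*) + Var(u)).
b_pred = beta * sigma2_x / (sigma2_x + sigma2_u)
print(b_ols, b_pred)  # both close to 1.33, biased toward zero from beta = 2.0
```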
Instrumental variable
Perhaps we are willing to assume that for a given variable $Z_i$ we have $E[U_i \mid Z_i] = 0$, even if the same is not true for $X_i$. In this case, with $Y_i = X'_i \beta + U_i$, we can identify $\beta$ from

$$ \beta = \left( E[Z_i X'_i] \right)^{-1} E[Z_i Y_i] $$

assuming that $E[Z_i X'_i]$ is square and invertible. We then define the instrumental variable estimator as

$$ \beta_n^{IV} = (Z'_n X_n)^{-1} Z'_n Y_n. $$
This can address both of the issues mentioned earlier. Two important assumptions:
- exclusion restriction: E[U_i | Z_i] = 0
- relevance: E[Z_i X'_i] is invertible
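Both assumptions can be checked in a small simulation (a sketch with made-up coefficients): the endogeneity of $X_i$ biases OLS, while the sample analogue $(Z'_n X_n)^{-1} Z'_n Y_n$ recovers $\beta$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Endogenous regressor: X and U share the common shock V, but the
# instrument Z satisfies exclusion (E[U | Z] = 0) and relevance.
beta = 2.0
Z = rng.normal(size=n)
V = rng.normal(size=n)
X = 0.7 * Z + V                     # relevance: Z shifts X
U = 0.9 * V + rng.normal(size=n)    # endogeneity: Cov(X, U) != 0
Y = beta * X + U

Zn = np.column_stack([np.ones(n), Z])   # instrument matrix (with constant)
Xn = np.column_stack([np.ones(n), X])   # regressor matrix (with constant)

b_ols = np.linalg.solve(Xn.T @ Xn, Xn.T @ Y)   # (X'X)^{-1} X'Y
b_iv = np.linalg.solve(Zn.T @ Xn, Zn.T @ Y)    # (Z'X)^{-1} Z'Y
print(b_ols[1], b_iv[1])  # OLS slope biased upward, IV slope near beta = 2.0
```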
We can show consistency and asymptotic normality. But what about unbiasedness?
TBD in class!
2SLS
In the more general case, for instance when one has more instruments than regressors, we introduce the 2SLS estimator:
$$ \beta_n^{2SLS} = (X'_n P_n X_n)^{-1} X'_n P_n Y_n $$
where $P_n = Z_n (Z'_n Z_n)^{-1} Z'_n$. We see that $P_n P_n = P_n$ and $P'_n = P_n$, so we can write

$$ \beta_n^{2SLS} = \left( (P_n X_n)' (P_n X_n) \right)^{-1} (P_n X_n)' Y_n, $$

and so this is like regressing $Y_n$ on $P_n X_n$, where $P_n X_n = Z_n (Z'_n Z_n)^{-1} Z'_n X_n$ is the predicted value from the regression of $X_n$ on $Z_n$.
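A short simulation (a sketch with illustrative coefficients) confirms the equivalence of the closed-form estimator and the two-stage recipe, without ever forming the $n \times n$ matrix $P_n$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Overidentified case: one endogenous regressor X, two instruments in Z.
beta = 2.0
Z = rng.normal(size=(n, 2))
V = rng.normal(size=n)
X = Z @ np.array([0.6, 0.4]) + V          # relevance: Z shifts X
U = 0.8 * V + rng.normal(size=n)          # endogeneity through V
Y = beta * X + U

# Closed form (X' P X)^{-1} X' P Y, computed via
# X' P X = (Z'X)' (Z'Z)^{-1} (Z'X) so that P is never built explicitly.
ZX, ZY, ZZ = Z.T @ X, Z.T @ Y, Z.T @ Z
b_2sls = (ZX @ np.linalg.solve(ZZ, ZY)) / (ZX @ np.linalg.solve(ZZ, ZX))

# Two-stage recipe: regress X on Z, then regress Y on the fitted values.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # P_n X_n
b_two_step = (X_hat @ Y) / (X_hat @ X_hat)
print(b_2sls, b_two_step)  # identical, and close to beta = 2.0
```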