
« Last post by **Youngmin Kim** on *April 10, 2017, 06:41:35 PM* »
1. Let's start from var(beta1_hat) = E[(beta1_hat - E[beta1_hat])^2]: as you noticed, this is just the variance of the estimator beta1_hat (by the "definition" of variance). No assumptions are required so far.

2. Adding assumptions 1 and 2 gives us a crucial property of beta1_hat, unbiasedness: E[beta1_hat] = beta1. Furthermore, if we also have assumption 3 (homoskedasticity), the OLS estimator turns out to be the "best" one (a.k.a. efficiency; this is why OLS is referred to as "BLUE": the best linear unbiased estimator). This is guaranteed by the Gauss-Markov theorem.

Now, armed with assumptions 1, 2, and 3, the variance of beta1_hat (starting from the general form var(beta1_hat) = E[(beta1_hat - E[beta1_hat])^2]) can be written more specifically as follows:

var(beta1_hat) = (sigma^2/n)/(var(X1)*(1-R1^2))

where sigma^2 is "estimated" as 1/(n-k-1) * (Σ u_i^2). Be aware that (1) u_i is the "residual" after the regression has been run, and (2) k is the number of regressors, i.e., the number of X variables in the model.

Unlike the previous general formula, this specific form has clear takeaways: the variance of the estimator beta1_hat depends on (A) the variance of the unobservables (constant across individuals under assumption 3: homoskedasticity), (B) the number of observations in hand, (C) the variance of X1, and (D) the degree of collinearity of X1 with the other observables (the other X's) we explicitly include in the regression model.
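To see the formula in action, here is a minimal numerical sketch with simulated data (the data-generating process, coefficients, and variable names are my own illustration, not from the lecture). It computes var(beta1_hat) two ways: via the formula above, and via the usual matrix expression sigma^2_hat * (X'X)^{-1}; the two agree up to floating point because var(X1)*(1-R1^2) is exactly (1/n)*Σ r1_i^2:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 2                       # n observations, k regressors (X1, X2)
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)  # X2 correlated with X1, so R1^2 > 0
u = rng.normal(scale=1.5, size=n)   # homoskedastic errors
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u

X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k - 1)   # 1/(n-k-1) * sum(u_i^2)

# Auxiliary regression of X1 on the other regressors gives r1 and R1^2
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
R1_sq = 1 - (r1 @ r1) / (n * x1.var())     # R^2 of that auxiliary regression

# Formula from the post: (sigma^2/n) / (var(X1)*(1-R1^2))
var_formula = (sigma2_hat / n) / (x1.var() * (1 - R1_sq))

# Matrix form: element of sigma^2_hat * (X'X)^{-1} corresponding to beta1
var_matrix = sigma2_hat * np.linalg.inv(X.T @ X)[1, 1]
print(var_formula, var_matrix)   # identical up to floating point
```

Playing with the simulated parameters also illustrates the takeaways: raising the error scale, shrinking n, shrinking the spread of x1, or tightening the link between x1 and x2 all inflate var(beta1_hat).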

Finally, what if we only have assumptions 1 and 2 (i.e., assumption 3 breaks down: heteroskedasticity)? It turns out that the same generic variance of beta1_hat can then be estimated as:

var(beta1_hat) = ( 1/(n-k-1) * ( Σ[r1_i^2*u_i^2] / n) ) / ( 1/n * Σ[r1_i^2] )^2

As you can clearly see, this explicitly accounts for the fact that sigma_i differs across individuals i (the breakdown of assumption 3): roughly speaking, when estimating the variance in this case, we not only use u_i (the residual), but also additional individual-level information from r1_i (the residual from regressing X1 on the other X's; more on this below) to back out heteroskedasticity-adjusted standard errors of the estimator beta1_hat.
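Here is the same kind of sketch for the robust formula, again with a simulated example of my own (not from the lecture). The errors are built so that their spread depends on x1, and the post's formula is checked against the equivalent "sandwich" matrix expression (X'X)^{-1} X'diag(u_i^2)X (X'X)^{-1} scaled by n/(n-k-1):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 2
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
u = rng.normal(size=n) * (0.5 + np.abs(x1))  # error spread depends on x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u

X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat                     # u_i (residuals)

Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]  # r1_i

# Formula from the post (note the n-k-1 degrees-of-freedom adjustment)
num = (1 / (n - k - 1)) * np.sum(r1**2 * resid**2) / n
den = (np.sum(r1**2) / n) ** 2
var_robust = num / den

# Same quantity via the sandwich matrix form, element for beta1
XtX_inv = np.linalg.inv(X.T @ X)
meat = (X * resid[:, None]**2).T @ X         # X' diag(u_i^2) X
sandwich = n / (n - k - 1) * (XtX_inv @ meat @ XtX_inv)[1, 1]
print(var_robust, sandwich)   # agree up to floating point
```

The key difference from the classical formula is visible in `num`: each r1_i^2 is weighted by its own u_i^2 instead of by one common sigma^2_hat.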

Lastly, what happens to the "heteroskedasticity-adjusted" var(beta1_hat) if we put assumption 3 back? (That is, sigma_i = sigma truly holds for all individuals i.)

Well, it requires some algebra, but let me provide the intuition (I think this is enough):

(1) denominator: 1/n * Σ[r1_i^2] = var(X1)*(1-R1^2), since r1 is the residual from regressing X1 on X2, ... , Xk (the pure variation in X1, after partialling out the correlation of X1 with the other controls). The mean of r1_i is zero, so this maps directly onto the generic variance formula mentioned above.

(2) numerator: if we truly have homoskedasticity, so that u_i^2 and r1_i^2 are unrelated across i, the numerator converges to (you don't need to know exactly what this means; just bear with me for the intuition) (1/n) * ( 1/(n-k-1) * Σ[u_i^2] ) * ( 1/n * Σ[r1_i^2] ).

(3) But notice that 1/n * Σ[r1_i^2] appears twice (squared) in the denominator, so one factor cancels against the numerator.

(4) Finally, we have: var(beta1_hat) = ( 1/(n-k-1) * (Σ u_i^2) / n )/(var(X1)*(1-R1^2))

Now compare this with the formula for var(beta1_hat) under assumptions 1, 2, and 3: it is the same, but only under the circumstance that assumption 3 actually holds in our data.
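The collapse of the robust formula to the classical one can also be checked numerically. In this sketch (again my own simulated example) the errors are genuinely homoskedastic, and the two variance estimates come out nearly identical, as steps (1)-(4) predict:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 2000, 2
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
u = rng.normal(size=n)              # homoskedastic: one sigma for everyone
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u

X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k - 1)

Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]

# Classical formula, using 1/n * sum(r1^2) = var(X1)*(1-R1^2)
var_classical = (sigma2_hat / n) / (r1 @ r1 / n)
# Robust formula from the post
var_robust = (np.sum(r1**2 * resid**2) / (n * (n - k - 1))) / (np.sum(r1**2) / n) ** 2
print(var_classical, var_robust)    # nearly identical under homoskedasticity
```

With heteroskedastic errors instead (e.g., scaling u by a function of x1), the same two lines would diverge.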

What if we use the homoskedasticity formula for var(beta1_hat) even though assumption 3 no longer holds? The "estimated" variance of beta1_hat is then incorrect, which leads to problems for hypothesis testing, confidence intervals, etc.
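A small Monte Carlo experiment makes this concrete (simulated design of my own, not from the lecture): with heteroskedastic errors, the classical standard error systematically understates the true sampling spread of beta1_hat, while the robust one tracks it:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, reps = 200, 2, 1000
b1_draws, se_classical, se_robust = [], [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)
    u = rng.normal(size=n) * (0.5 + np.abs(x1))   # heteroskedastic errors
    y = 1.0 + 2.0 * x1 - 1.0 * x2 + u
    X = np.column_stack([np.ones(n), x1, x2])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    Z = np.column_stack([np.ones(n), x2])
    r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
    ssr1 = r1 @ r1
    sigma2 = resid @ resid / (n - k - 1)
    b1_draws.append(beta[1])
    se_classical.append(np.sqrt(sigma2 / ssr1))
    se_robust.append(np.sqrt(n / (n - k - 1) * np.sum(r1**2 * resid**2) / ssr1**2))

emp_sd = np.std(b1_draws)    # "true" sampling spread of beta1_hat across reps
print(emp_sd, np.mean(se_classical), np.mean(se_robust))
```

In this design the classical SE is too small, so t-statistics built from it are too large and confidence intervals too narrow; you would reject true null hypotheses more often than the nominal 5%.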

I strongly encourage you to read lecture note 15 for further details on heteroskedasticity.

Hope this helps.