Chapter 4: Multiple Regression: Estimation and Hypothesis Testing
Multiple Regression Analysis: Estimation and Inference
y = b0 + b1x1 + b2x2 + ... + bkxk + u

Parallels with Simple Regression
Yi = b0 + b1Xi1 + b2Xi2 + ... + bkXik + ui
b0 is still the intercept. b1 through bk are all called slope parameters, also called partial regression coefficients: any coefficient bj denotes the change in Y with a change in Xj, holding all the other independent variables fixed. u is still the error term (or disturbance). OLS still minimizes the sum of squared residuals, so there are now k + 1 first-order conditions.

Obtaining OLS Estimates
The estimated equation is called the OLS regression line or the sample regression function (SRF). The estimated equation is not the true equation; the true equation is the population regression line, which we do not know and can only estimate. Using a different sample, we would therefore get a different estimated equation.
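As a sketch of these ideas, the following simulated example (variable names and the true parameters 1.0, 2.0, -0.5 are illustrative assumptions, not from the slides) fits a two-regressor model by least squares and checks one consequence of the k + 1 first-order conditions: the OLS residuals sum to zero.

```python
import numpy as np

# Minimal OLS sketch on simulated data (assumed setup, not the slides' data).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])        # design matrix with intercept
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)    # minimizes sum of squared residuals
residuals = y - X @ b_hat

print(b_hat)               # estimates near the true (1.0, 2.0, -0.5)
print(residuals.sum())     # approximately zero (a first-order condition)
```

A new sample (a different seed) gives different estimates: the SRF changes with the sample while the population regression line does not.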
The population regression line is E(Y|X1, ..., Xk) = b0 + b1X1 + ... + bkXk.

Interpreting Multiple Regression

An Example (Wooldridge, p. 76)
The determination of wage (dollars per hour):
wage: hourly wage
educ: years of education
exper: years of labor market experience
tenure: years with the current employer
The relationship between wage and educ, exper, tenure:
wage = b0 + b1educ + b2exper + b3tenure + u
The estimated equation:
wage = -2.873 + 0.599educ + 0.022exper + 0.169tenure

A "Partialling Out" Interpretation
Regressing Y on X1 and X2 gives the same effect of X1 as regressing Y on the residuals from a regression of X1 on X2. This means only the part of Xi1 that is uncorrelated with Xi2 is being related to Yi, so we are estimating the effect of X1 on Y after X2 has been "partialled out."

The wage determination again
Starting from the estimated equation above, first regress educ on exper and tenure to partial out their effects, then regress wage on the residuals from that regression. Do we get the same result?
educ = 13.575 - 0.0738exper + 0.048tenure, with residuals denoted resid
wage = 5.896 + 0.599resid
The coefficient on resid is the same as the coefficient on educ in the first estimated equation (0.599).
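The partialling-out result can be checked numerically. This sketch uses simulated data (the coefficients and seed are arbitrary assumptions, not the wage sample), comparing the full-regression coefficient on x1 with the coefficient from regressing y on the residuals of x1 on x2:

```python
import numpy as np

# "Partialling out" on simulated data: the slope on x1 in the full
# regression equals the slope from regressing y on the part of x1
# that is uncorrelated with x2.
rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)       # x1 is correlated with x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_full = ols(np.column_stack([ones, x1, x2]), y)    # full regression

g = ols(np.column_stack([ones, x2]), x1)            # regress x1 on x2
resid = x1 - np.column_stack([ones, x2]) @ g        # part of x1 uncorrelated with x2
b_partial = ols(np.column_stack([ones, resid]), y)  # regress y on the residuals

print(b_full[1], b_partial[1])   # the two slope coefficients coincide
```

The equality is exact (this is the Frisch-Waugh result), not an approximation, which is why the wage example reproduces 0.599 to the last digit.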
Goodness-of-Fit: R2
How do we judge how well our sample regression line fits our sample data? We can compute the fraction of the total sum of squares (TSS) that is explained by the model, and call this the R-squared of the regression:
R2 = ESS/TSS = 1 - RSS/TSS

More about R-squared
R2 can never decrease when another independent variable is added to a regression, and it usually increases. Because R2 will usually increase with the number of independent variables, it is not a good way to compare models.
An Example
The wage determination model can be used to show that adding another new independent variable increases the value of R2.

Adjusted R-Squared
R2 is simply an estimate of how much of the variation in Y is explained by X1, X2, ..., Xk. Recall that R2 will always increase as more variables are added to the model. The adjusted R2 takes the number of variables in a model into account, and it may decrease:
adj-R2 = 1 - (1 - R2)(n - 1)/(n - k - 1)

Adjusted R-Squared (cont.)
Most packages will give you both R2 and adj-R2. You can compare the fit of two models (with the same Y) by comparing the adj-R2:
wage = -3.391 + 0.644educ + 0.070exper, adj-R2 = 0.2222
wage = -2.222 + 0.569educ + 0.190tenure, adj-R2 = 0.2992
You cannot use the adj-R2 to compare models with different dependent variables (e.g. Y vs. log(Y)):
wage = -3.391 + 0.644educ + 0.070exper, adj-R2 = 0.2222
log(wage) = 0.404 + 0.087educ + 0.026exper, adj-R2 = 0.3059
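A small simulated sketch of both formulas (illustrative data, not the wage sample): fit a model, add an irrelevant regressor, and confirm that R2 cannot fall while adjusted R2 penalizes the extra variable.

```python
import numpy as np

# R2 and adjusted R2 on simulated data; "noise" is an irrelevant regressor.
rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
noise = rng.normal(size=n)               # does not belong in the true model
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

def fit_r2(y, *cols):
    X = np.column_stack([np.ones(len(y))] + list(cols))
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = np.sum((y - X @ b) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1 - rss / tss
    k = X.shape[1] - 1                   # number of slope parameters
    adj = 1 - (1 - r2) * (len(y) - 1) / (len(y) - k - 1)
    return r2, adj

r2_small, adj_small = fit_r2(y, x1)
r2_big, adj_big = fit_r2(y, x1, noise)
print(r2_big >= r2_small)    # True: R2 never decreases when a variable is added
```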
Because the variances of the dependent variables differ, the comparison between these adjusted R2 values makes no sense.

Assumptions for Unbiasedness
The population model is linear in parameters: Y = b0 + b1X1 + b2X2 + ... + bkXk + u.
We can use a sample of size n, {(Xi1, Xi2, ..., Xik, Yi): i = 1, 2, ..., n}, from the population model, so that the sample model is Yi = b0 + b1Xi1 + b2Xi2 + ... + bkXik + ui.
Zero conditional mean: E(u|X1, X2, ..., Xk) = 0, implying that all of the explanatory variables are exogenous. Writing X = (X1, X2, ..., Xk), this is E(u|X) = 0, which reduces to E(u) = 0 and Cov(u, Xj) = 0 if the independent variables are not random variables.
The new additional assumption (no perfect collinearity): none of the Xs is constant, and there are no exact linear relationships among them.

About multicollinearity
This assumption does allow the independent variables to be correlated; they just cannot be perfectly linearly correlated.
Valid, student performance: colGPA = b0 + b1hsGPA + b2ACT + b3skipped + u
Valid, consumption function: consum = b0 + b1inc + b2inc^2 + u
But the following is invalid: log(consum) = b0 + b1log(inc) + b2log(inc^2) + u. Since log(inc^2) = 2log(inc) is an exact linear function of log(inc), we cannot estimate the regression coefficients b1, b2.

Unbiasedness of OLS estimation
Under the assumptions above, E(bj-hat) = bj, j = 0, 1, ..., k.
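A minimal sketch of why the invalid specification fails (the inc values are simulated for illustration): because log(inc^2) = 2 log(inc), the design matrix loses a rank, and no unique coefficient estimates exist.

```python
import numpy as np

# Perfect collinearity: log(inc**2) equals 2*log(inc), so the third
# column is an exact linear function of the second.
rng = np.random.default_rng(3)
inc = rng.uniform(1.0, 10.0, size=50)
log_inc = np.log(inc)
X = np.column_stack([np.ones(50), log_inc, 2 * log_inc])
print(np.linalg.matrix_rank(X))   # 2: only two linearly independent columns
```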
Too Many or Too Few Variables
What happens if we include variables in our specification that don't belong? There is no effect on the remaining parameter estimates, and OLS remains unbiased.
What if we exclude a variable from our specification that does belong? OLS will usually be biased.

Omitted Variable Bias
Suppose the true model is Y = b0 + b1X1 + b2X2 + u, but we omit X2 and obtain tilde-b1 from regressing Y on X1 alone. Writing tilde-d1 for the slope from regressing X2 on X1, the bias is E(tilde-b1) - b1 = b2 * tilde-d1. There are two cases where the estimated parameter is unbiased:
If b2 = 0, so that X2 does not appear in the true model;
If tilde-d1 = 0, then tilde-b1 is unbiased for b1.

An example
The estimated equation: wage = -3.391 + 0.644educ + 0.070exper
The correlation between educ and exper: corr(educ, exper) = -0.2295
Estimating again without exper: wage = -0.905 + 0.541educ

Omitted Variable Bias Summary
Two cases where the bias is equal to zero:
b2 = 0, that is, X2 doesn't really belong in the model;
X1 and X2 are uncorrelated in the sample.
If the correlations of X2 with X1 and of X2 with Y have the same direction, the bias will be positive; if they have opposite directions, the bias will be negative.

The More General Case
When there are multiple regressors in the estimated model, deriving the omitted variable bias is more difficult, because correlation between a single explanatory variable and the error generally results in all of the OLS estimators being biased.
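The bias formula can be illustrated by simulation (the parameter values b1 = 2, b2 = 3 and the x1-x2 relation are arbitrary assumptions, not from the slides): omitting x2 shifts the slope on x1 toward b1 + b2*d1.

```python
import numpy as np

# Omitted variable bias by simulation: the short regression that omits
# x2 centers near b1 + b2*d1, where d1 is the slope of x2 on x1.
rng = np.random.default_rng(4)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)                 # d1 is about 0.5
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

def slope(x, y):
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

b1_tilde = slope(x1, y)    # short regression omitting x2
d1 = slope(x1, x2)
print(b1_tilde)            # near 2.0 + 3.0 * 0.5 = 3.5, not the true 2.0
```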
Variance of the OLS Estimators
We now know that the sampling distribution of our estimator is centered around the true parameter. How spread out is this distribution? The variance is much easier to analyze under an additional assumption, so assume
Var(u|X1, X2, ..., Xk) = s^2, or Var(u) = s^2 (homoskedasticity).

Variance of OLS (cont.)
Yi = b0 + b1Xi1 + b2Xi2 + ... + bkXik + ui. Let X stand for (X1, X2, ..., Xk). Assuming Var(u|X) = s^2 also implies Var(Y|X) = s^2; if the Xs are not random variables, these can be rewritten as Var(u) = s^2 and Var(Y) = s^2. Under these assumptions,
Var(bj-hat) = s^2 / [TSSj (1 - Rj^2)],
where TSSj is the total sample variation in Xj and Rj^2 is the R-squared from regressing Xj on all the other independent variables. The factor 1/(1 - Rj^2) is the variance inflation factor (VIF).

Components of OLS Variances
The error variance: a larger s^2 implies a larger variance for the OLS estimators.
The total sample variation: a larger TSSj implies a smaller variance for the estimators.
Linear relationships among the independent variables: a larger Rj^2 implies a larger variance for the estimators. If Rj^2 → 1, the variances of the estimated parameters become infinite; Rj^2 → 1 means there is multicollinearity among the independent variables.

Error Variance Estimate (cont.)
df = n - (k + 1), or df = n - k - 1. The degrees of freedom (df) equal the (number of observations) - (number of estimated parameters). Note the difference between sd(), which depends on the unknown s, and se(), which uses the estimate of s.

Assumption for Serial Correlation
There is no serial correlation (autocorrelation) between any ui and uj: Cov(ui, uj) = 0, i ≠ j.
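A sketch of the variance-inflation idea with simulated, nearly collinear regressors (the 0.9/0.1 mixing weights are illustrative assumptions):

```python
import numpy as np

# VIF_j = 1/(1 - Rj^2), where Rj^2 comes from regressing xj on the
# other regressors. Here x1 is nearly a linear function of x2.
rng = np.random.default_rng(5)
n = 1000
x2 = rng.normal(size=n)
x1 = 0.9 * x2 + 0.1 * rng.normal(size=n)

def r_squared(y, X):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = np.sum((y - X @ b) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1 - rss / tss

r2_1 = r_squared(x1, np.column_stack([np.ones(n), x2]))
vif_1 = 1 / (1 - r2_1)
print(vif_1)   # large: the variance of x1's coefficient is badly inflated
```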
The three assumptions for unbiasedness, plus the homoskedasticity assumption and this no-serial-correlation assumption, are known as the Gauss-Markov assumptions.

The Gauss-Markov Theorem
Given our five Gauss-Markov assumptions, it can be shown that OLS is "BLUE":
Best: smallest variance
Linear
Unbiased
Estimator
Thus, if the assumptions hold, use OLS.

The Multiple Regression Model in Matrix Form
In matrix form the model is Y = Xb + u, and the OLS estimator is b-hat = (X'X)^(-1)X'Y.

Multiple Regression Analysis: Inference

Assumptions of the Classical Linear Model (CLM)
So far we know that, given the Gauss-Markov assumptions, OLS is BLUE. In order to do classical hypothesis testing, we need to add another assumption (beyond the Gauss-Markov assumptions): assume that u is independent of X1, X2, ..., Xk and that u is normally distributed with zero mean and variance s^2: u ~ Normal(0, s^2).

CLM Assumptions (cont.)
Under the CLM assumptions, we can summarize the population assumptions as follows: Y|X ~ Normal(b0 + b1X1 + ... + bkXk, s^2). For now we just assume normality, though clearly that is sometimes not the case; large samples will let us drop normality.
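A quick numerical check of the matrix formula on simulated data (an illustrative design, not from the slides): solving the normal equations X'Xb = X'Y reproduces the least-squares fit.

```python
import numpy as np

# Matrix-form OLS: b_hat = (X'X)^(-1) X'y, computed by solving the
# normal equations rather than forming the inverse explicitly.
rng = np.random.default_rng(6)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

b_normal = np.linalg.solve(X.T @ X, X.T @ y)     # normal equations
b_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]   # library least squares
print(np.allclose(b_normal, b_lstsq))            # True
```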
[Figure: the homoskedastic normal distribution with a single explanatory variable: normal densities f(y|x) of equal spread centered on the population regression line E(y|x) = b0 + b1x, shown at values x1 and x2.]

Normal Sampling Distributions
Under the CLM assumptions, (bj-hat - bj)/sd(bj-hat) ~ Normal(0, 1); replacing sd with its estimate se gives a t distribution with n - k - 1 degrees of freedom.

The t Test
Knowing the sampling distribution for the standardized estimator allows us to carry out hypothesis tests. Start with a null hypothesis, for example H0: bj = 0. If we accept the null, then we accept that Xj has no effect on Y, controlling for the other Xs.

t Test: One-Sided Alternatives
Besides our null H0, we need an alternative hypothesis H1 and a significance level. H1 may be one-sided or two-sided: H1: bj > 0 and H1: bj < 0 are one-sided; H1: bj ≠ 0 is a two-sided alternative. If we want to have only a 5% probability of rejecting H0 if it is really true, then we say our significance level is 5%.

One-Sided Alternatives (cont.)
Having picked a significance level a, we look up the (1 - a)th percentile in a t distribution with n - k - 1 df and call this c, the critical value. For H1: bj > 0, we can reject the null hypothesis if the t statistic is greater than the critical value; if the t statistic is less than the critical value, then we fail to reject the null.
[Figure: the one-sided rejection region for yi = b0 + b1xi1 + ... + bkxik + ui, H0: bj = 0 against H1: bj > 0: reject when t > c.]
An Example: Hourly Wage Equation
Wage determination (Wooldridge, p. 123):
log(wage) = 0.284 + 0.092educ + 0.0041exper + 0.022tenure
           (0.104) (0.007)     (0.0017)      (0.003)
n = 526, R2 = 0.316
Is the return to exper, controlling for educ and tenure, zero in the population, against the alternative that it is positive?
H0: bexper = 0 vs. H1: bexper > 0
The t statistic is t = 0.0041/0.0017 ≈ 2.41. The degrees of freedom: df = n - k - 1 = 526 - 3 - 1 = 522. The critical value at 5% is 1.645, and the t statistic is larger than the critical value, i.e., 2.41 > 1.645. That is, we reject the null hypothesis: bexper really is positive.

Another Example: Student Performance and School Size
Does school size have an effect on student performance?
math10: math test scores, revealing student performance
totcomp: average annual teacher compensation
staff: the number of staff per one thousand students
enroll: student enrollment, revealing school size
The model equation: math10 = b0 + b1totcomp + b2staff + b3enroll + u
H0: b3 = 0, H1: b3 < 0. The t statistic is greater than the 5% critical value of -1.645, so we cannot reject the null hypothesis.

One-Sided vs Two-Sided
Because the t distribution is symmetric, testing H1: bj < 0 is straightforward: the critical value is -c, we reject the null if the t statistic is less than -c, and if the t statistic is greater than -c, then we fail to reject the null. For a two-sided test, we set the critical value based on a/2 and reject H0: bj = 0 if the absolute value of the t statistic exceeds c.
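The arithmetic of the hourly-wage example can be sketched directly (the coefficient, standard error, and the one-sided 5% critical value 1.645 are taken from the slides and hard-coded, not computed from a t table):

```python
# One-sided t test from the hourly-wage example: t = b_hat / se(b_hat).
b_exper, se_exper = 0.0041, 0.0017
t_stat = b_exper / se_exper
critical_value = 1.645            # one-sided 5% critical value, df = 522
reject = t_stat > critical_value
print(round(t_stat, 2), reject)   # 2.41 True
```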
[Figure: the two-sided rejection regions for yi = b0 + b1Xi1 + ... + bkXik + ui, H0: bj = 0 against H1: bj ≠ 0: reject in each tail beyond -c and c, each with area a/2; fail to reject in the middle region of probability 1 - a.]

Summary for H0: bj = 0
Unless otherwise stated, the alternative is assumed to be two-sided. If we reject the null, we typically say "Xj is statistically significant at the 100a% level." If we fail to reject the null, we typically say "Xj is statistically insignificant at the 100a% level."

An Example: Determinants of College GPA (Wooldridge, p. 128)
Variables:
colGPA: college GPA
skipped: the average number of lectures missed per week
ACT: achievement test score
hsGPA: high school GPA
The estimated model:
colGPA = 1.39 + 0.412hsGPA + 0.015ACT - 0.083skipped
        (0.33) (0.094)      (0.011)    (0.026)
n = 141, R2 = 0.234
H0: bskipped = 0, H1: bskipped ≠ 0
df: n - k - 1 = 137; the critical value is t137 = 1.96. The t statistic is |-0.083/0.026| ≈ 3.19 > 1.96, so we reject the null hypothesis: bskipped is significantly different from zero.
Testing Other Hypotheses
A more general form of the t statistic recognizes that we may want to test something like H0: bj = aj. In this case, the appropriate t statistic is t = (bj-hat - aj)/se(bj-hat).

An Example: Campus Crime and Enrollment (Wooldridge, p. 129)
Variables:
crime: the annual number of crimes on college campuses
enroll: student enrollment, revealing the size of the college
The regression model: log(crime) = b0 + b1log(enroll) + u
Is b1 = 1? That is, H0: b1 = 1 against H1: b1 > 1.
log(crime) = -6.63 + 1.27log(enroll)
            (1.03)  (0.11)
n = 97, R2 = 0.585
df: n - k - 1 = 95; the critical value at 5% is t95 = 1.645. The t statistic is (1.27 - 1)/0.11 ≈ 2.45 > 1.645, so we reject the null hypothesis: the evidence supports b1 > 1.

Confidence Intervals
Another way to use classical statistical testing is to construct a confidence interval, using the same critical value as was used for a two-sided test. A 100(1 - a)% confidence interval is defined as bj-hat ± c * se(bj-hat), where c is the (1 - a/2)th percentile of the t(n - k - 1) distribution.
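A sketch of the general t statistic and the interval formula, using the campus-crime numbers from the slides (the two-sided critical value 1.96 for large df is hard-coded, not computed from a t table):

```python
# General t statistic t = (b_hat - a) / se(b_hat), testing H0: b1 = 1,
# plus a 95% confidence interval for b1.
b1_hat, se_b1 = 1.27, 0.11
t_stat = (b1_hat - 1) / se_b1
print(round(t_stat, 2))           # 2.45

c = 1.96                          # two-sided 5% critical value, large df
ci = (b1_hat - c * se_b1, b1_hat + c * se_b1)
print(round(ci[0], 3), round(ci[1], 3))   # the interval lies entirely above 1
```

That the interval excludes 1 is the confidence-interval counterpart of rejecting H0: b1 = 1 in a two-sided test.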
Computing p-values for t tests
An alternative to the classical approach is to ask, "What is the smallest significance level at which the null would be rejected?" So, compute the t statistic, and then look up what percentile it is in the appropriate t distribution: this is the p-value. The p-value is the probability we would observe a t statistic as extreme as we did, if the null were true.

Stata and p-values, t tests, etc.
Most computer packages will compute the p-value for you, assuming a two-sided test. If you really want a one-sided alternative, just divide the two-sided p-value by 2. Stata provides the t statistic, p-value, and 95% confidence interval for H0: bj = 0 for you, in columns labeled "t", "P>|t|", and "95% Conf. Interval", respectively.

Testing a Linear Combination
Suppose that, instead of testing whether b1 is equal to a constant, you want to test if it is equal to another parameter, that is, H0: b1 = b2, or b1 - b2 = 0. Use the same basic procedure for forming a t statistic: t = (b1-hat - b2-hat)/se(b1-hat - b2-hat).

Testing a Linear Combo (cont.)
To use the formula, we need s12 = Cov(b1-hat, b2-hat), which standard output does not have. Many packages will have an option to get it, or will just perform the test for you. In Stata, after
reg Y X1 X2 ... Xk
you would type
test X1 = X2
to get a p-value for the test. More generally, you can always restate the problem to get the test you want.

Example
Suppose you are interested in the effect of campaign expenditures on outcomes. The model is
voteA = b0 + b1log(expendA) + b2log(expendB) + b3prtystrA + u
H0: b1 = -b2, or H0: q1 = b1 + b2 = 0. Since b1 = q1 - b2, substitute in and rearrange:
voteA = b0 + q1log(expendA) + b2[log(expendB) - log(expendA)] + b3prtystrA + u
This is the same model as originally, but now you get a standard error for b1 + b2 = q1 directly from the basic regression. Any linear combination of parameters could be tested in a similar manner. Other examples of hypotheses about a single linear combination of parameters: b1 = 1 + b2; b1 = 5b2; b1 = -1/2 b2; etc.

Multiple Linear Restrictions
Everything we have done so far involves testing a single linear restriction (e.g. b1 = 0 or b1 = b2). However, we may want to jointly test multiple hypotheses about our parameters.
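The substitution trick can be verified on simulated data (the coefficients b1 = 2, b2 = -1.5 are illustrative assumptions): regressing y on x1 and (x2 - x1) makes the coefficient on x1 equal b1 + b2, exactly as in the voteA reparameterization.

```python
import numpy as np

# Reparameterization for testing b1 + b2 = 0: since
# b0 + b1*x1 + b2*x2 = b0 + (b1 + b2)*x1 + b2*(x2 - x1),
# the slope on x1 in the second form is q1 = b1 + b2.
rng = np.random.default_rng(7)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b = ols(np.column_stack([ones, x1, x2]), y)        # original parameterization
q = ols(np.column_stack([ones, x1, x2 - x1]), y)   # reparameterized model

print(np.isclose(q[1], b[1] + b[2]))   # True: coefficient on x1 is b1 + b2
```

In practice the point of the reparameterization is the standard error: se(q1-hat) comes straight out of the standard regression output, with no need for Cov(b1-hat, b2-hat).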