Chapter 9 Heteroskedasticity: Testing and Correction

Contents
What is heteroskedasticity?
Why worry about heteroskedasticity?
How to test for heteroskedasticity?
Corrections for heteroskedasticity

What is heteroskedasticity?

Recall that the assumption of homoskedasticity implies that, conditional on the explanatory variables, the variance of the unobserved error $u$ is constant:
$\mathrm{Var}(u \mid X) = \sigma^2$ (homoskedasticity)
If this is not true, that is, if the variance of $u$ differs across values of the $X$s, then the errors are heteroskedastic:
$\mathrm{Var}(u_i \mid X_i) = \sigma_i^2$ (heteroskedasticity)

[Figure: Example of homoskedasticity]
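The contrast between the two definitions can be sketched with simulated data. The snippet below is only an illustration: the error scales (a constant 2.0 in one case, 0.4 times X in the other), the sample size, and the seed are assumptions chosen for the demonstration, not values from the slides. It compares the sample variance of the errors for low and high values of X.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary fixed seed, for reproducibility only
n = 5000
x = rng.uniform(1.0, 10.0, size=n)

# Homoskedastic errors: the spread does not depend on x.
u_homo = rng.normal(0.0, 2.0, size=n)
# Heteroskedastic errors: sd(u_i) = 0.4 * x_i (an assumed functional form).
u_hetero = rng.normal(0.0, 1.0, size=n) * 0.4 * x


def variance_by_half(u, x):
    """Sample variance of u for observations with x below / above the median of x."""
    low = x <= np.median(x)
    return u[low].var(ddof=1), u[~low].var(ddof=1)


homo_low, homo_high = variance_by_half(u_homo, x)
het_low, het_high = variance_by_half(u_hetero, x)

print(f"homoskedastic:   var(low x) = {homo_low:.2f}, var(high x) = {homo_high:.2f}")
print(f"heteroskedastic: var(low x) = {het_low:.2f}, var(high x) = {het_high:.2f}")
```

In the homoskedastic case the two variances should be close to each other, while in the heteroskedastic case the variance for high X should be several times larger.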
[Figure: Example of heteroskedasticity]

Examples

Generally, cross-section data are more prone to heteroskedasticity, because different individuals have different characteristics.
Consider a cross-section study of family income and expenditures. It seems plausible that low-income families spend at a rather steady rate, while the spending patterns of high-income families are relatively volatile.
If we examine the sales of a cross section of firms in one industry, the error terms associated with very large firms might have larger variances than those associated with smaller firms; sales of larger firms might be more volatile than sales of smaller firms.

[Figure: Patterns of heteroskedasticity]

[Slide: The relation between R&D expenditure and sales]

[Figure: Scatter plot of R&D expenditure against sales]

Why worry about heteroskedasticity?

The consequences of heteroskedasticity

OLS estimates are still unbiased and consistent, even if we do not assume homoskedasticity. Take the simple regression as an example:
$Y = \beta_0 + \beta_1 X + u$
We know the OLS estimator of $\beta_1$ is
$\hat{\beta}_1 = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2} = \beta_1 + \frac{\sum_i (X_i - \bar{X})\,u_i}{\sum_i (X_i - \bar{X})^2}$
so $E(\hat{\beta}_1 \mid X) = \beta_1$ follows from $E(u_i \mid X) = 0$ alone; no assumption about the variance of $u_i$ is needed.

The $R^2$ and adjusted $R^2$ are unaffected by heteroskedasticity. Because RSS and TSS are not affected by heteroskedasticity, neither are $R^2$ and adjusted $R^2$.
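The unbiasedness claim can be checked with a small Monte Carlo experiment. The sketch below is an illustration under assumed values (true coefficients 1 and 2, error standard deviation equal to X, 50 observations, 5000 replications); none of these come from the slides. It also records the conventional OLS standard error of the slope, so that it can be compared with the actual sampling variability of the estimates.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary fixed seed
n, reps = 50, 5000
beta0, beta1 = 1.0, 2.0          # assumed true parameters
x = np.linspace(1.0, 10.0, n)    # regressor held fixed across replications
xc = x - x.mean()
sst_x = np.sum(xc ** 2)

# Each row is one simulated sample; sd(u_i) = x_i is an assumed heteroskedastic form.
u = rng.normal(size=(reps, n)) * x
y = beta0 + beta1 * x + u

# OLS slope and intercept for every replication.
slopes = (y - y.mean(axis=1, keepdims=True)) @ xc / sst_x
intercepts = y.mean(axis=1) - slopes * x.mean()

# Conventional standard error of the slope, which assumes homoskedasticity.
resid = y - intercepts[:, None] - slopes[:, None] * x
sigma2_hat = np.sum(resid ** 2, axis=1) / (n - 2)
conventional_se = np.sqrt(sigma2_hat / sst_x)

mean_slope = slopes.mean()                     # should be close to beta1
empirical_sd = slopes.std(ddof=1)              # actual sampling variability
mean_conventional_se = conventional_se.mean()  # what OLS output would report on average

print(f"mean slope estimate:          {mean_slope:.3f}")
print(f"empirical sd of the slope:    {empirical_sd:.3f}")
print(f"average conventional std err: {mean_conventional_se:.3f}")
```

Under this design the average slope estimate stays near the true value, while the conventional standard error tends to understate the spread of the estimates. The direction of that gap depends on how the error variance relates to X, so other designs may overstate it instead.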
The standard errors of the estimates are biased if we have heteroskedasticity.

The OLS estimates are not efficient; that is, their variances are no longer the smallest among linear unbiased estimators.
If the standard errors are biased, we cannot use the usual t statistics or F statistics for drawing inferences. That is, the t test, the F test, and the confidence intervals based on them are not valid. In short, when heteroskedasticity is present we cannot use the t test and F test as usual; otherwise we will get misleading results.

Summary of the consequences of heteroskedasticity

OLS estimates are still unbiased and consistent.
The $R^2$ and adjusted $R^2$ are unaffected by heteroskedasticity.
The standard errors of the estimates are biased.
The OLS estimates are not efficient.
Hence the t test, the F test, and the confidence intervals are not valid.

How to test for heteroskedasticity?

Residual plot

In OLS estimation we use the residual $e_i$ to estimate the random error term $u_i$; therefore, we can check whether $u_i$ is heteroskedastic by examining $e_i$. We plot $e_i^2$ against $X$.

[Figure: Residual plot, cont.]

If there is more than one independent variable, we should plot the squared residuals against each independent variable separately.
There is a shortcut when there is more than one independent variable: plot the residuals against the fitted values $\hat{Y}$, because $\hat{Y}$ is just a linear combination of all the $X$s.

[Figure: Residual plot, example 9.2]

Park test

If heteroskedasticity is present, the variance of the error term $u_i$, $\sigma_i^2$, may be correlated with some of the independent variables. Therefore, we can test whether $\sigma_i^2$ is related to any of the explanatory variables. If it is, there is heteroskedasticity; if it is not, there is no heteroskedasticity. For example, for the simple regression model the auxiliary regression is
$\ln(\sigma_i^2) = \beta_0 + \beta_1 \ln(X_i) + v_i$
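A minimal sketch of this auxiliary regression on simulated data follows. Because $\sigma_i^2$ is unobserved, the sketch uses the squared OLS residuals in its place. The data-generating process (error standard deviation proportional to X), the helper name `ols`, and the seed are assumptions made for illustration.

```python
import numpy as np


def ols(X, y):
    """OLS of y on X (X must already contain a constant column).

    Returns the coefficients, their conventional standard errors, and the residuals.
    """
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, np.sqrt(np.diag(cov)), resid


rng = np.random.default_rng(1)        # arbitrary fixed seed
n = 500
x = rng.uniform(1.0, 10.0, size=n)
u = rng.normal(size=n) * x            # assumed: sd(u_i) proportional to x_i
y = 1.0 + 2.0 * x + u

const = np.ones(n)
# Step 1: original regression, keep the residuals e_i.
_, _, e = ols(np.column_stack([const, x]), y)
# Step 2: auxiliary regression of ln(e_i^2) on ln(x_i).
coef, se, _ = ols(np.column_stack([const, np.log(x)]), np.log(e ** 2))

park_slope = coef[1]
park_t = coef[1] / se[1]
print(f"auxiliary slope = {park_slope:.3f}, t statistic = {park_t:.2f}")
```

With the variance proportional to the square of X, the auxiliary slope should come out near 2 and its t statistic should be far above conventional critical values.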

Procedure of the Park test

First, regress the dependent variable ($Y$) on the independent variables ($X$s).
Obtain the residuals of this first regression, $e_i$, and compute $e_i^2$.
Then run a new regression with $\ln(e_i^2)$ as the dependent variable and the logs of the original independent variables as explanatory variables:
$\ln(e_i^2) = \beta_0 + \beta_1 \ln(X_i) + v_i$
Then test $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$. If we cannot reject the null hypothesis, there is no evidence of heteroskedasticity, and we treat the errors as homoskedastic.

Park test: example

Take example 9.2.
First, regress R&D expenditure (rdexp) on sales (sales):
rdexp = 192.91 + 0.0319 sales
se = (991.01) (0.0083)
n = 18, $R^2$ = 0.4783, adjusted $R^2$ = 0.4457, F(1,16) = 14.67
Second, obtain the residuals ($e_i$) of the regression.
Third, regress $\ln(e_i^2)$ on $\ln(sales)$:
$\ln(e_i^2)$ = 1.216 $\ln(sales)$
se = (0.057), p = (0.000)
$R^2$ = 0.9637, adjusted $R^2$ = 0.9615
Finally, test whether the slope of the second regression equals zero. Given the p-value of the slope and a 5% significance level, we reject the null hypothesis. Therefore, heteroskedasticity is present in the first regression.
Note: the Park test is not a good test for heteroskedasticity because of the particular specification of its auxiliary regression, whose error term may itself be heteroskedastic.

Glejser test

The essence of the Glejser test is the same as that of the Park test, but Glejser suggests the following regressions to detect heteroskedasticity of $u$:
$|e_i| = \beta_0 + \beta_1 X_i + v_i$
$|e_i| = \beta_0 + \beta_1 \sqrt{X_i} + v_i$
$|e_i| = \beta_0 + \beta_1 (1/X_i) + v_i$
Again, we test $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$. If we reject the null hypothesis, there is evidence of heteroskedasticity; otherwise, homoskedasticity is not rejected.

Glejser test: example 9.2

First, regress R&D expenditure (rdexp) on sales (sales):
rdexp = 192.91 + 0.0319 sales
se = (991.01) (0.0083)
n = 18, $R^2$ = 0.4783, adjusted $R^2$ = 0.4457, F(1,16) = 14.67
Second, obtain the residuals ($e_i$) of the regression.
Third, regress $|e_i|$ on 1/sales:
$|e_i|$ = 2273.65 - 1992500 (1/sales)
se = (604.69) (12300000)
p = (0.002) (0.125)
Finally, test whether the slope is zero. The p-value of the slope is larger than the 5% significance level, so we cannot reject the null hypothesis; this test finds no evidence of heteroskedasticity.

The White test

The White test is a more general test, which allows for nonlinearities by using the squares and cross products of all the $X$s. For example, with $k = 3$:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + u$
$e^2 = \delta_0 + \delta_1 X_1 + \delta_2 X_2 + \delta_3 X_3 + \delta_4 X_1^2 + \delta_5 X_2^2 + \delta_6 X_3^2 + \delta_7 X_1 X_2 + \delta_8 X_1 X_3 + \delta_9 X_2 X_3 + v$
Use an F or LM statistic to test whether all the $X_j$, $X_j^2$, and $X_j X_h$ are jointly significant, that is, test $H_0: \delta_1 = \delta_2 = \dots = \delta_9 = 0$ against $H_1$: $H_0$ is not true. If we reject $H_0$, heteroskedasticity is present.

To test $H_0: \delta_1 = \delta_2 = \dots = \delta_9 = 0$, we can use the F test learned in chapter 4. Let $R^2$ denote the goodness of fit of the auxiliary regression:
$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}$
We can also use the LM test:
$LM = nR^2 \sim \chi^2(k)$
where $n$ is the number of observations and $k$ is the number of restrictions.

The White test: example 9.2

First, regress R&D expenditure (rdexp) on sales (sales) and profits (profits):
rdexp = -13.93 + 0.0126 sales + 0.2398 profits
se = (991.997) (0.018) (0.1986)
p = (0.989) (0.496) (0.246)
n = 18, $R^2$ = 0.5245, adjusted $R^2$ = 0.4611, F = 8.27
Second, obtain the residuals $e$ from the regression above.
Third, regress $e^2$ on sales, profits, sales$^2$, profits$^2$, and sales×profits:
$e^2$ = 693735.5 + 135.00 sales - 1965.7 profits - 0.0027 sales$^2$ - 0.116 profits$^2$ + 0.050 sales×profits
n = 18, $R^2$ = 0.8900, F(5,12) = 19.42, Prob > F = 0.0000
Finally, test $H_0: \delta_1 = \delta_2 = \delta_3 = \delta_4 = \delta_5 = 0$. The p-value of the F test is 0.0000, so we reject $H_0$. Also, $LM = nR^2 = 18 \times 0.89 = 16.02 > \chi^2_{0.05}(5) = 11.07$, so the LM test rejects $H_0$ as well. So heteroskedasticity is present in the first regression.

Alternate form of the White test

The general form can get unwieldy pretty quickly.
Consider that the fitted values from OLS, $\hat{Y}$, are a function of all the $X$s.
Thus $\hat{Y}^2$ is a function of the squares and cross products, and $\hat{Y}$ and $\hat{Y}^2$ can proxy for all of the $X_j$, $X_j^2$, and $X_j X_h$.
So regress the squared residuals on $\hat{Y}$ and $\hat{Y}^2$ and use the $R^2$ to form an F or LM statistic.
Note that we are now testing only 2 restrictions.

The procedure of the special case of the White test

Regress $Y$ on $X_1, X_2, \dots, X_k$ and obtain the residuals $e_i$.
Calculate $\hat{Y}$ and $\hat{Y}^2$ (in Stata: predict ybar, xb and gen ybarsq = ybar^2).
Regress $e^2$ on $\hat{Y}$ and $\hat{Y}^2$, and test the joint hypothesis that the coefficients on both regressors are zero.
Use the F statistic or the LM test to test the null hypothesis of homoskedasticity.

Example: White test in the wage determination equation

First, estimate the model by OLS without considering heteroskedasticity:
$\widehat{wage}$ = -2.87 + 0.599 educ + 0.022 exper + 0.139 tenure
Calculate the residuals of the regression, $e_i$, and the fitted values of wage, $\widehat{wage}$, and from them $e_i^2$ and $\widehat{wage}^2$.
Regress $e_i^2$ on $\widehat{wage}$ and $\widehat{wage}^2$:
$e_i^2$ = 7.36 - 2.86 $\widehat{wage}$ + 0.49 $\widehat{wage}^2$
se = (5.62) (1.76) (0.125)
n = 526, $R^2$ = 0.0984, F = 28.55, Prob > F = 0.000
Test $H_0: \delta_1 = \delta_2 = 0$. F test: F = 28.55, Prob > F = 0.000. LM test: $LM = nR^2 = 526 \times 0.0984 \approx 51.76 > 5.99 = \chi^2_{0.05}(2)$. Both tests reject $H_0$.
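The LM version of the White test, in both the general form and the fitted-value form, can be sketched as follows. This is an illustration on simulated data, not a reproduction of the examples above: the data-generating process, the function names `r_squared` and `white_lm`, and the seed are assumptions. The critical values 11.07 and 5.99 are the 5% chi-square values for 5 and 2 degrees of freedom quoted in the examples.

```python
from itertools import combinations

import numpy as np


def r_squared(X, y):
    """R^2 from an OLS regression of y on X (X must contain a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


def white_lm(X, y, use_fitted=False):
    """Return (LM statistic, number of restrictions) for the White test.

    X is an (n, k) array of regressors WITHOUT a constant column.
    use_fitted=False: regress e^2 on levels, squares and cross products.
    use_fitted=True:  regress e^2 on the fitted values and their squares.
    """
    n, k = X.shape
    const = np.ones((n, 1))
    Z = np.hstack([const, X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fitted = Z @ coef
    e2 = (y - fitted) ** 2
    if use_fitted:
        aux = np.column_stack([fitted, fitted ** 2])
    else:
        cross = [X[:, i] * X[:, j] for i, j in combinations(range(k), 2)]
        aux = np.column_stack([X, X ** 2] + cross)
    q = aux.shape[1]                          # number of restrictions tested
    lm = n * r_squared(np.hstack([const, aux]), e2)
    return lm, q


rng = np.random.default_rng(7)                # arbitrary fixed seed
n = 500
x1 = rng.uniform(1.0, 10.0, size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * x1                   # assumed: sd(u_i) proportional to x1_i
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u
X = np.column_stack([x1, x2])

lm_full, q_full = white_lm(X, y)
lm_special, q_special = white_lm(X, y, use_fitted=True)
print(f"general form: LM = {lm_full:.1f} with {q_full} restrictions (5% critical value 11.07)")
print(f"special form: LM = {lm_special:.1f} with {q_special} restrictions (5% critical value 5.99)")
```

With two regressors the general form tests 5 restrictions and the special form tests 2, matching the counts in the examples above. For this strongly heteroskedastic design both statistics should exceed their critical values.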

Corrections for heteroskedasticity

Known variances

Suppose the variances are known: $\mathrm{Var}(u_i \mid X) = \sigma_i^2$.
The original model is
$Y_i = \beta_0 + \beta_1 X_{i1} + \dots + \beta_k X_{ik} + u_i$
Divide both sides by $\sigma_i$. The new disturbance is $u_i^* = u_i/\sigma_i$, so
$\mathrm{Var}(u_i^*) = \mathrm{Var}(u_i/\sigma_i) = \mathrm{Var}(u_i)/\sigma_i^2 = 1$
So the new model is
$Y_i/\sigma_i = \beta_0 (1/\sigma_i) + \beta_1 X_{i1}/\sigma_i + \dots + \beta_k X_{ik}/\sigma_i + u_i/\sigma_i$
that is,
$Y_i^* = \beta_0 X_{i0}^* + \beta_1 X_{i1}^* + \dots + \beta_k X_{ik}^* + u_i^*$, where $X_{i0}^* = 1/\sigma_i$.
We can estimate the new model by OLS; this is called weighted least squares (WLS).
Usually, however, we do not know the variances.
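The transformation above can be sketched in a few lines. The example assumes that $\sigma_i$ is known exactly and simply set equal to $X_i$ in simulated data; the function name `wls_known_sigma`, the true coefficients, and the seed are illustrative assumptions.

```python
import numpy as np


def wls_known_sigma(X, y, sigma):
    """WLS with known error standard deviations sigma_i.

    Every variable, including the constant, is divided by sigma_i,
    and the transformed model is then estimated by OLS.
    X is an (n, k) array of regressors WITHOUT a constant column.
    """
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])
    Z_star = Z / sigma[:, None]
    y_star = y / sigma
    coef, *_ = np.linalg.lstsq(Z_star, y_star, rcond=None)
    return coef


rng = np.random.default_rng(3)          # arbitrary fixed seed
n = 500
x = rng.uniform(1.0, 10.0, size=n)
sigma = x                                # assumed known: sd(u_i) = x_i
y = 1.0 + 2.0 * x + rng.normal(size=n) * sigma

coef_wls = wls_known_sigma(x[:, None], y, sigma)
print("WLS estimates (intercept, slope):", np.round(coef_wls, 3))
```

Dividing by $\sigma_i$ and running OLS gives the same estimates as minimizing the sum of squared residuals weighted by $1/\sigma_i^2$, which is why the procedure is called weighted least squares.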

Case where the form is known up to a multiplicative constant

Suppose the heteroskedasticity can be modeled as $\mathrm{Var}(u \mid X) = \sigma^2 h(X)$, where the trick is to figure out what $h(X) \equiv h_i$ looks like.
$E(u_i/\sqrt{h_i} \mid X) = 0$, because $h_i$ is only a function of $X$, and $\mathrm{Var}(u_i/\sqrt{h_i} \mid X) = \sigma^2$, because we know $\mathrm{Var}(u \mid X) = \sigma^2 h_i$.
So, if we divide the whole equation by $\sqrt{h_i}$, we have a model whose error is homoskedastic.

Case 1: $h(X) = X$

The simple regression model is
$Y_i = \beta_0 + \beta_1 X_i + u_i$
We know $u_i$ is heteroskedastic and its variance is $\mathrm{Var}(u \mid X_i) = \sigma^2 h(X_i) = \sigma^2 X_i$.
Then we divide both sides of the original model by $\sqrt{X_i}$ and get a new model:
$Y_i/\sqrt{X_i} = \beta_0/\sqrt{X_i} + \beta_1 X_i/\sqrt{X_i} + u_i/\sqrt{X_i}$
which can be rewritten as
$Y_i/\sqrt{X_i} = \beta_0 (1/\sqrt{X_i}) + \beta_1 \sqrt{X_i} + v_i$ (*)
$\mathrm{Var}(v_i) = \mathrm{Var}(u_i/\sqrt{X_i}) = \mathrm{Var}(u_i)/X_i = \sigma^2$, which is homoskedastic. Therefore, the new equation (*) can be estimated by OLS.

Example 9.6 (textbook 2e, p. 233)

We have shown that heteroskedasticity is present in the R&D expenditure model. Now we assume that the variance of the error term changes with the independent variable sales, that is,
$\mathrm{Var}(u_i) = \sigma^2\, sales_i$
The original model is
$rdexp_i = \beta_0 + \beta_1\, sales_i + u_i$
The transformed model is
$rdexp_i/\sqrt{sales_i} = \beta_0 (1/\sqrt{sales_i}) + \beta_1 \sqrt{sales_i} + v_i$, where $v_i = u_i/\sqrt{sales_i}$.

The estimate of the transformed model is
rdexp/$\sqrt{sales}$ = -246.73 (1/$\sqrt{sales}$) + 0.0368 $\sqrt{sales}$
that is,
rdexp = -246.73 + 0.0368 sales
se = (381.16) (0.0071)
t = (-0.65) (5.17)
n = 18, $R^2$ = 0.6923, adjusted $R^2$ = 0.6538, F = 18.00
WLS command (Stata): reg rdexp sales [aweight=1/sales]
The estimate of the original model is
rdexp = 192.91 + 0.0319 sales
se = (991.01) (0.0083)
t = (0.19) (3.83)
n = 18, $R^2$ = 0.4783, adjusted $R^2$ = 0.4457, F(1,16) = 14.67
Compare the results of the two estimations. What do you find?

Case 2: $h(X) = X^2$

The simple regression model is
$Y_i = \beta_0 + \beta_1 X_i + u_i$
We know $u_i$ is heteroskedastic and its variance is $\mathrm{Var}(u \mid X_i) = \sigma^2 h(X_i) = \sigma^2 X_i^2$.
Then we divide both sides of the original model by $X_i$ and get a new model:
$Y_i/X_i = \beta_0/X_i + \beta_1 X_i/X_i + u_i/X_i$
which can be rewritten as
$Y_i/X_i = \beta_0 (1/X_i) + \beta_1 + v_i$ (*)
$\mathrm{Var}(v_i) = \mathrm{Var}(u_i/X_i) = \mathrm{Var}(u_i)/X_i^2 = \sigma^2$, which is homoskedastic. Therefore, the new equation (*) can be estimated by OLS.

Generalized least squares

Estimating the transformed equation by OLS is an example of generalized least squares (GLS).
GLS will be BLUE in this case, because the transformed equation satisfies the Gauss-Markov assumptions.
GLS is a weighted least squares (WLS) procedure where each squared residual is weighted by the inverse of $\mathrm{Var}(u_i \mid x_i)$.

More on WLS

A similar weighting arises when we are using per capita data at the city, county, state, or country level. If the individual-level equation satisfies the Gauss-Markov assumptions, then the error in the per capita equation has a variance proportional to one over the size of the population. Therefore, weighted least squares with weights equal to the population is appropriate.

Summary of WLS

WLS is great if we know what $\mathrm{Var}(u_i \mid x_i)$ looks like.
In most cases, we will not know the form of the heteroskedasticity.
One example where we do is when the data are aggregated but the model is at the individual level.
In that case we want to weight each aggregate observation by the inverse of the number of individuals.

Feasible GLS

More typical is the case where you do not know the form of the heteroskedasticity. In this case, you need to estimate $h(x_i)$.
Typically, we start with the assumption of a fairly flexible model, such as
$\mathrm{Var}(u \mid x) = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \dots + \delta_k x_k)$
Since we do not know the $\delta$s, we must estimate them.

Feasible GLS (continued)

Our assumption implies that
$u^2 = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \dots + \delta_k x_k)\, v$
where $E(v \mid x) = 1$. Then, if $E(v) = 1$,
$\ln(u^2) = \alpha_0 + \delta_1 x_1 + \dots + \delta_k x_k + e$
where $E(e) = 0$ and $e$ is independent of $x$.
Now, we know that $\hat{u}$ is an estimate of $u$, so we can estimate this by OLS.

Feasible GLS (continued)

Now, an estimate