R Software Computation Problems: Statistics Course Slides (.ppt, 25 pages)
Example 4.15 (p. 179): interval estimate for one normal population

To estimate the weight a of an object, it was weighed 10 times, giving (in kg) 10.1, 10, 9.8, 10.5, 9.7, 10.1, 9.9, 10.2, 10.3, 9.9. Assuming the measurements follow a normal distribution, find the confidence interval for a with confidence coefficient 0.95.

    x <- c(10.1, 10, 9.8, 10.5, 9.7, 10.1, 9.9, 10.2, 10.3, 9.9)
    t.test(x)

Output:

        One Sample t-test

    data:  x
    t = 131.59, df = 9, p-value = 4.296e-16
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
      9.877225 10.222775
    sample estimates:
    mean of x
        10.05

The interval estimate is (9.88, 10.22).

Example 4.18 (p. 185): interval estimate for a difference of means

Samples x1, x2, ..., x12 and y1, y2, ..., y17 are drawn at random from a production line. Both are normally distributed, with means μ1 = 501.1 and μ2 = 499.7 and standard deviations 2.4 and 4.7. Given confidence coefficient 0.95, find the interval estimate of μ1 - μ2.

    x <- rnorm(12, 501.1, 2.4)
    y <- rnorm(17, 499.7, 4.7)
    t.test(x, y)    # the two samples have different variances

Output (the samples are simulated, so the numbers change from run to run):

        Welch Two Sample t-test

    data:  x and y
    t = -0.6471, df = 25.304, p-value = 0.5234
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -3.657121  1.907620
    sample estimates:
    mean of x mean of y
     500.7888  501.6635

The interval estimate of μ1 - μ2 with confidence coefficient 0.95 is (-3.66, 1.91).

If the variances are equal, use:

    t.test(x, y, var.equal = TRUE)

Example 4.19 (p. 186): interval estimate for a difference of means with paired data

Ten patients were sampled and their hemoglobin levels recorded before and after treatment. Find the interval estimate of the change (α = 0.05).

    x <- c(11.3, 15.0, 15.0, 13.5, 12.8, 10.0, 11.0, 12.0, 13.0, 12.3)
    y <- c(14.0, 13.8, 14.0, 13.5, 13.5, 12.0, 14.7, 11.4, 13.8, 12.0)
    t.test(x - y)

Output:

        One Sample t-test

    data:  x - y
    t = -1.3066, df = 9, p-value = 0.2237
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
     -1.8572881  0.4972881
    sample estimates:
    mean of x
        -0.68

The interval estimate of the change is (-1.86, 0.50).

Example 4.22 (p. 193): one-sided confidence interval for the mean of one population

Five bulbs are drawn at random from a batch and their lifetimes measured (in hours): 1050, 1100, 1120, 1250, 1280. Assuming the lifetime is normally distributed, find the one-sided lower confidence limit for the mean lifetime at confidence level 0.95.

    x <- c(1050, 1100, 1120, 1250, 1280)
    t.test(x, alternative = "greater")

Output:

        One Sample t-test

    data:  x
    t = 26.003, df = 4, p-value = 6.497e-06
    alternative hypothesis: true mean is greater than 0
    95 percent confidence interval:
     1064.9    Inf
    sample estimates:
    mean of x
         1160

The 95% lower confidence limit for the mean lifetime is 1064.9 hours. This is a statement about the mean, not about 95% of individual bulbs.

Exercise 4.6 (p. 201)

Two rice varieties, A and B, are sown on 10 test plots, each plot planted half with A and half with B. Assume the yields X and Y are both normally distributed with equal variances. The yields of the 10 plots after harvest are given below (in kg). Find the confidence interval for the difference of expected yields μ1 - μ2 (α = 0.05).

    x <- c(140, 137, 136, 140, 145, 148, 140, 135, 144, 141)
    y <- c(135, 118, 115, 140, 128, 131, 130, 115, 131, 125)
    t.test(x, y, var.equal = TRUE)

Output:

        Two Sample t-test

    data:  x and y
    t = 4.6287, df = 18, p-value = 0.0002087
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
      7.536261 20.063739
    sample estimates:
    mean of x mean of y
        140.6     126.8

The confidence interval is (7.536261, 20.063739).

Exercise 4.7

Groups A and B produce the same kind of wire. Four wires are drawn at random from group A and five from group B. Their resistances are: A: 0.143, 0.142, 0.143, 0.137; B: 0.140, 0.142, 0.136, 0.138, 0.140. Assume both sets of resistances are normally distributed with equal but unknown variances. Find the interval estimate of μ1 - μ2 with confidence coefficient 0.95.

    x <- c(0.143, 0.142, 0.143, 0.137)
    y <- c(0.140, 0.142, 0.136, 0.138, 0.140)
    a <- rnorm(4, mean(x), var(x))
    b <- rnorm(5, mean(y), var(y))
    t.test(a, b)

Output:

        Welch Two Sample t-test

    data:  a and b
    t = 636.28, df = 5.788, p-value = 3.028e-15
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     0.002041440 0.002057343
    sample estimates:
    mean of x mean of y
    0.1412494 0.1392000

The slides report the interval (0.002041, 0.002057). Note that this code does not analyze the measurements themselves: it draws new samples with rnorm, passes var(x) where rnorm expects a standard deviation, and runs the Welch test. Under the stated assumption of equal unknown variances, the call that matches the problem is t.test(x, y, var.equal = TRUE), so the interval above describes the simulated values rather than the nine observations.

Example 5.2 (p. 209): hypothesis test for the mean of one normal population

The lifetime X (in hours) of a certain component is normally distributed with unknown mean and variance. The lifetimes of 16 components are given below. Is there reason to believe that the mean lifetime exceeds 225 hours?

    x <- c(159, 280, 101, 212, 224, 379, 179, 264, 222, 362, 168, 250, 149, 260, 485, 170)
    t.test(x, alternative = "greater", mu = 225)

Output:

        One Sample t-test

    data:  x
    t = 0.66852, df = 15, p-value = 0.257
    alternative hypothesis: true mean is greater than 225
    95 percent confidence interval:
     198.2321      Inf
    sample estimates:
    mean of x
        241.5

Since p = 0.257 > 0.05, the null hypothesis is not rejected: the data give no grounds to conclude that the mean lifetime exceeds 225 hours.

Example 5.6 (p. 221): hypothesis test for a binomial population

A batch of vegetable seeds has an average germination rate of p = 0.85. A random sample of 500 seeds is soaked with a seed-coating agent, and 445 of them germinate. Does the coating agent have an effect?

    binom.test(445, 500, p = 0.85)

Output:

        Exact binomial test

    data:  445 and 500
    number of successes = 445, number of trials = 500, p-value = 0.01207
    alternative hypothesis: true probability of success is not equal to 0.85
    95 percent confidence interval:
     0.8592342 0.9160509
    sample estimates:
    probability of success
                      0.89

Since p = 0.01207 < 0.05, the null hypothesis is rejected: the coating agent has a significant effect on the germination rate.

Exercise 5.1 (p. 249)

The mean platelet count of healthy adult men is 225 × 10^9/L. The platelet counts of 20 male paint workers are given below. Does the platelet count of paint workers differ from that of healthy adult men?

    x <- c(220, 188, 162, 230, 145, 160, 237, 188, 247, 113,
           126, 245, 164, 231, 250, 183, 190, 158, 224, 175)
    t.test(x, alternative = "two.sided", mu = 225)

Output:

        One Sample t-test

    data:  x
    t = -3.5588, df = 19, p-value = 0.002096
    alternative hypothesis: true mean is not equal to 225
    95 percent confidence interval:
     172.2743 211.3257
    sample estimates:
    mean of x
        191.8

Since p = 0.002096 < 0.05, the null hypothesis is rejected: the platelet count of paint workers differs from that of healthy adult men.

Exercise 5.3

To compare an iron supplement with diet therapy for nutritional iron-deficiency anemia, 16 patients were matched into 8 pairs by similar age, weight, course of disease and severity. One member of each pair received diet therapy and the other the iron supplement. Hemoglobin levels after three months are given in Table 5.1. Do the hemoglobin levels differ between the two treatments?

    x <- c(113, 120, 138, 120, 100, 118, 138, 123)
    y <- c(138, 116, 125, 136, 110, 132, 130, 110)
    t.test(x - y)

Output:

        One Sample t-test

    data:  x - y
    t = -0.65127, df = 7, p-value = 0.5357
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
     -15.628891   8.878891
    sample estimates:
    mean of x
       -3.375

Since p = 0.5357 > 0.05, the null hypothesis is not rejected: there is no evidence that hemoglobin levels differ between the two treatments.

Example 6.2 (p. 257): significance test of a regression equation

Find the regression equation for Example 6.1 and test it.

    x <- c(0.1, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.20, 0.21, 0.23)
    y <- c(42.0, 43.5, 45.0, 45.5, 45.0, 47.5, 49.0, 53.0, 50.0, 55.0, 55.0, 60.0)
    lm.sol <- lm(y ~ 1 + x)
    summary(lm.sol)

Output:

    Call:
    lm(formula = y ~ 1 + x)

    Residuals:
        Min      1Q  Median      3Q     Max
    -2.0431 -0.7056  0.1694  0.6633  2.2653

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   28.493      1.580   18.04 5.88e-09 ***
    x            130.835      9.683   13.51 9.50e-08 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 1.319 on 10 degrees of freedom
    Multiple R-squared:  0.9481,    Adjusted R-squared:  0.9429
    F-statistic: 182.6 on 1 and 10 DF,  p-value: 9.505e-08

The regression equation is $\hat{Y} = 28.493 + 130.835X$. Both coefficient t-tests and the overall F-test have p-values far below 0.05, so the equation passes the tests of the regression parameters and the test of the regression equation.

Example 6.4 (p. 260): prediction

For Example 6.1, find the prediction interval for Y with probability 0.95 when X = x0 = 0.16.

    new <- data.frame(x = 0.16)
    lm.pred <- predict(lm.sol, new, interval = "prediction", level = 0.95)
    lm.pred

Output:

           fit      lwr      upr
    1 49.42639 46.36621 52.48657

The predicted value is 49.43 and the prediction interval is (46.37, 52.49).

Example 6.5 (p. 261): a complete walk-through of simple linear regression (Forbes data)

    X <- matrix(c(
      194.5, 20.79, 1.3179, 131.79,
      194.3, 20.79, 1.3179, 131.79,
      197.9, 22.40, 1.3502, 135.02,
      198.4, 22.67, 1.3555, 135.55,
      199.4, 23.15, 1.3646, 136.46,
      199.9, 23.35, 1.3683, 136.83,
      200.9, 23.89, 1.3782, 137.82,
      201.1, 23.99, 1.3800, 138.00,
      201.4, 24.02, 1.3806, 138.06,
      201.3, 24.01, 1.3805, 138.05,
      203.6, 25.14, 1.4004, 140.04,
      204.6, 26.57, 1.4244, 142.44,
      209.5, 28.49, 1.4547, 145.47,
      208.6, 27.76, 1.4434, 144.34,
      210.7, 29.04, 1.4630, 146.30,
      211.9, 29.88, 1.4754, 147.54,
      212.2, 30.06, 1.4780, 147.80),
      ncol = 4, byrow = TRUE,
      dimnames = list(1:17, c("F", "h", "log", "log100")))
    forbes <- data.frame(X)
    plot(forbes$F, forbes$log100)

This produces a scatter plot.

    lm.sol <- lm(log100 ~ F, data = forbes)
    summary(lm.sol)

Output:

    Call:
    lm(formula = log100 ~ F, data = forbes)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.32261 -0.14530 -0.06750  0.02111  1.35924

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) -42.13087    3.33895  -12.62 2.17e-09 ***
    F             0.89546    0.01645   54.45  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.3789 on 15 degrees of freedom
    Multiple R-squared:  0.995,     Adjusted R-squared:  0.9946
    F-statistic:  2965 on 1 and 15 DF,  p-value: < 2.2e-16

    abline(lm.sol)

This adds the fitted regression line to the scatter plot.

    y.res <- residuals(lm.sol); plot(y.res)
    text(12, y.res[12], labels = 12, adj = 1.2)

This plots the residuals and labels the residual of observation 12.

    lm12 <- lm(log100 ~ F, data = forbes, subset = -12)
    summary(lm12)

Output:

    Call:
    lm(formula = log100 ~ F, data = forbes, subset = -12)

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.21175 -0.06194  0.01590  0.09077  0.13042

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) -41.30180    1.00038  -41.29 5.01e-16 ***
    F             0.89096    0.00493  180.73  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.1133 on 14 degrees of freedom
    Multiple R-squared:  0.9996,    Adjusted R-squared:  0.9995
    F-statistic: 3.266e+04 on 1 and 14 DF,  p-value: < 2.2e-16

Example 6.14 (p. 292)

To study its marketing strategy, a company surveyed the sales of a product. Let Y be the per-capita household purchase of the product in a region (in yuan) and X the household income (in yuan). Table 6.8 gives data for 53 households. Use these data to model the relationship between Y and X.

    X <- scan()
    679 292 1012 493 582 1156 997 2189 1097 2078
    1818 1700 747 2030 1643 414 354 1276 745 435
    540 874 1543 1029 710 1434 837 1748 1381 1428
    1255 1777 370 2316 1130 463 770 724 808 790
    783 406 1242 658 1746 468 1114 413 1787 3560
    1495 2221 1526

    Y <- scan()
    0.79 0.44 0.56 0.79 2.70 3.64 4.73 9.50 5.34 6.85
    5.84 5.21 3.25 4.43 3.16 0.50 0.17 1.88 0.77 1.39
    0.56 1.56 5.28 0.64 4.00 0.31 4.20 4.88 3.48 7.58
    2.63 4.99 0.59 8.19 4.79 0.51 1.74 4.10 3.94 0.96
    3.29 0.44 3.24 2.14 5.71 0.64 1.90 0.51 8.33 14.94
    5.11 3.85 3.93

    lm.sol <- lm(Y ~ X); summary(lm.sol)

Output:

    Call:
    lm(formula = Y ~ X)

    Residuals:
        Min      1Q  Median      3Q     Max
    -4.1399 -0.8275 -0.1934  1.2376  3.1522

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) -0.8313037  0.4416121  -1.882   0.0655 .
    X            0.0036828  0.0003339  11.030 4.11e-15 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 1.577 on 51 degrees of freedom
    Multiple R-squared:  0.7046,    Adjusted R-squared:  0.6988
    F-statistic: 121.7 on 1 and 51 DF,  p-value: 4.106e-15

    y.rst <- rstandard(lm.sol); y.fit <- predict(lm.sol)
    plot(y.rst ~ y.fit)
    abline(0.1, 0.5); abline(-0.1, -0.5)

This draws the plot of standardized residuals against fitted values.

    lm.new <- update(lm.sol, sqrt(.) ~ .); summary(lm.new)

Output:

    Call:
    lm(formula = sqrt(Y) ~ X)

    Residuals:
         Min       1Q   Median       3Q      Max
    -1.39185 -0.30576 -0.03875  0.25378  0.81027

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 5.822e-01  1.299e-01   4.481 4.22e-05 ***
    X           9.529e-04  9.824e-05   9.699 3.61e-13 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.464 on 51 degrees of freedom
    Multiple R-squared:  0.6485,    Adjusted R-squared:  0.6416
    F-statistic: 94.08 on 1 and 51 DF,  p-value: 3.614e-13

    yn.rst <- rstandard(lm.new); yn.fit <- predict(lm.new)
    plot(yn.rst ~ yn.fit)

Exercise 6.1 (p. 331)

To estimate how snowmelt on a mountain affects irrigation downstream, an observation station on the mountain measures the maximum snow depth X (in meters) and the irrigated area Y (in hectares) for the same year. Data for 10 consecutive years are given in Table 6.1.

    snow <- data.frame(
      y = c(1907, 1287, 2700, 2373, 3260, 3000, 1974, 2273, 3113, 2493),
      x = c(5.1, 3.5, 7.1, 6.2, 8.8, 7.8, 4.5, 5.6, 8.0, 6.4))

(1) Plot the data and fit the model.

    plot(y ~ x, data = snow)               # scatter plot, used to judge the relationship between the variables
    lm.sol <- lm(y ~ x - 1, data = snow)   # fit a line through the origin
    summary(lm.sol)                        # extract the model summary
    abline(lm.sol)                         # draw the fitted line

Output:

    Call:
    lm(formula = y ~ x - 1, data = snow)

    Residuals:
        Min      1Q  Median      3Q     Max
    -132.53  -53.63  -12.10   28.09  239.18

    Coefficients:
      Estimate Std. Error t value Pr(>|t|)
    x   385.51       5.09   75.74 6.17e-14 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 104.6 on 9 degrees of freedom
    Multiple R-squared:  0.9984,    Adjusted R-squared:  0.9983
    F-statistic:  5736 on 1 and 9 DF,  p-value: 6.169e-14

(4) This year's measurement is X = 7 m. Give the predicted irrigated area for this year and the corresponding interval estimate (α = 0.05).

    X <- data.frame(x = 7)
    lm.pred <- predict(lm.sol, X, interval = "prediction", level = 0.95)
    lm.pred    # prediction for the new data, with its 95% prediction interval

Output:

           fit      lwr      upr
    1 2698.603 2448.718 2948.488

Example 7.2 (p. 339): computing the ANOVA table (continuing Example 7.1)

    lamp <- data.frame(
      X = c(1600, 1610, 1650, 1680, 1700, 1700, 1780, 1500, 1640,
            1400, 1700, 1750, 1640, 1550, 1600, 1620, 1640, 1600, 1740, 1800,
            1510, 1520, 1530, 1570, 1640, 1600),
      A = factor(c(rep(1, 7), rep(2, 5), rep(3, 8), rep(4, 6))))
    lamp.aov <- aov(X ~ A, data = lamp)
    summary(lamp.aov)

Output:

                Df Sum Sq Mean Sq F value Pr(>F)
    A            3  49212   16404   2.166  0.121
    Residuals   22 166622    7574

Example 7.3 (p. 341)

The table gives the survival times (in days) of mice inoculated with 3 different strains of typhoid bacillus. Do the mean survival times differ significantly among the 3 strains?

    mouse <- data.frame(
      X = c(2, 4, 3, 2, 4, 7, 7, 2, 2, 5, 4, 5, 6, 8, 5, 10, 7,
            12, 12, 6, 6, 7, 11, 6, 6, 7, 9, 5, 5, 10, 6, 3, 10),
      A = factor(c(rep(1, 11), rep(2, 10), rep(3, 12))))
    mouse.aov <- aov(X ~ A, data = mouse)
    summary(mouse.aov)

Output:

                Df Sum Sq Mean Sq F value Pr(>F)
    A            2  94.26   47.13   8.484 0.0012 **
    Residuals   30 166.65    5.56
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value (0.0012) is well below 0.01, so the null hypothesis is rejected: the mean survival times differ significantly among the 3 strains.

Exercise 7.1 (p. 371)

Three factories produce the same part. Four items from each factory are tested; the measured strengths are given in the table.

    lamp <- data.frame(
      X = c(115, 116, 98, 83, 103, 107, 118, 116, 73, 89, 85, 97),
      A = factor(rep(1:3, c(4, 4, 4))))

(1) Run an analysis of variance to decide whether the part strength differs significantly among the 3 factories.

    lamp.aov <- aov(X ~ A, data = lamp); summary(lamp.aov)

Output:

                Df Sum Sq Mean Sq F value Pr(>F)
    A            2   1304   652.0   4.923 0.0359 *
    Residuals    9   1192   132.4
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The p-value (0.0359) is greater than 0.01, so at the 0.01 level used in the slides the null hypothesis is not rejected and the difference is not significant. At the 0.05 level the same p-value would lead to rejection.

(2) Give an interval estimate for the mean part strength of each factory.

Interval estimate for factory A:

    a <- c(115, 116, 98, 83)
    t.test(a)

Output:

        One Sample t-test

    data:  a
    t = 13.134, df = 3, p-value = 0.0009534
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
      78.04264 127.95736
    sample estimates:
    mean of x
          103

The mean is 103 and the interval estimate is (78.04264, 127.9574). In the same way, the mean for factory B is 111, with interval estimate (99.59932, 122.4007).

(3) Multiple t-tests.

    with(lamp, pairwise.t.test(X, A, p.adjust.method = "none"))
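The three parts of Exercise 7.1 can be run in one pass. The following is a minimal sketch, not the slides' own solution: the data frame name `parts` is hypothetical (the slides reuse `lamp`), the garbled `p.adjust.method` argument is assumed to have been `"none"`, and the Holm-adjusted call is an addition shown only for comparison. Columns are passed explicitly instead of via `attach()`, so an earlier global `X` cannot mask the data frame column.

```r
# Strength measurements from Exercise 7.1: 4 parts from each of 3 factories.
# `parts` is a hypothetical name; the slides call this data frame `lamp`.
parts <- data.frame(
  X = c(115, 116, 98, 83, 103, 107, 118, 116, 73, 89, 85, 97),
  A = factor(rep(1:3, each = 4))
)

# (1) One-way ANOVA.
fit <- aov(X ~ A, data = parts)
print(summary(fit))

# (2) 95% t intervals per factory; each column of `ci` is (lower, upper).
ci <- sapply(split(parts$X, parts$A), function(v) t.test(v)$conf.int)
print(ci)

# (3) Pairwise t tests with pooled SD: unadjusted (assumed to match the slides),
#     then Holm-adjusted (an addition, not in the slides).
print(pairwise.t.test(parts$X, parts$A, p.adjust.method = "none"))
print(pairwise.t.test(parts$X, parts$A, p.adjust.method = "holm"))
```

The first two columns of `ci` should reproduce the intervals quoted for factories A and B; the third column gives the interval for the remaining factory, which the slides do not report.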
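Exercise 4.7 assumes equal but unknown variances, which corresponds to a pooled-variance two-sample t interval on the observed resistances. The sketch below applies `t.test(..., var.equal = TRUE)` directly to the nine measurements and repeats the calculation by hand; the helper names `sp2`, `se` and `manual` are illustrative and do not come from the slides.

```r
# Resistance measurements from Exercise 4.7 (group A: 4 wires, group B: 5 wires).
x <- c(0.143, 0.142, 0.143, 0.137)
y <- c(0.140, 0.142, 0.136, 0.138, 0.140)

# Pooled-variance two-sample t interval on the observed data.
res <- t.test(x, y, var.equal = TRUE)
print(res$conf.int)

# The same interval by hand, to show what var.equal = TRUE computes.
n1 <- length(x)
n2 <- length(y)
sp2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)  # pooled variance
se <- sqrt(sp2 * (1 / n1 + 1 / n2))                             # SE of the mean difference
manual <- (mean(x) - mean(y)) + c(-1, 1) * qt(0.975, n1 + n2 - 2) * se
print(manual)
```

With so few observations the pooled interval is much wider than the one obtained from simulated values, and it includes zero, so these data alone do not show a difference between the two groups.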
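The interval that `t.test` reports in Example 4.15 can be checked by hand from its own output. The worked calculation below uses the standard one-sample t interval; the symbols $s$ (sample standard deviation) and $t_{0.975}(9)$ (the 0.975 quantile of the t distribution with 9 degrees of freedom, about 2.262) are notation introduced here, not taken from the slides. Because the test is against a mean of 0, the reported statistic satisfies $t = \bar{x} / (s/\sqrt{n})$, which gives the standard error.

```latex
\begin{align*}
\bar{x} \pm t_{1-\alpha/2}(n-1)\,\frac{s}{\sqrt{n}},
  \qquad n = 10,\ \bar{x} = 10.05,\ \alpha = 0.05 \\
\frac{s}{\sqrt{n}} = \frac{\bar{x}}{t} = \frac{10.05}{131.59} \approx 0.0764 \\
10.05 \pm 2.262 \times 0.0764 \approx 10.05 \pm 0.173
  \;\Longrightarrow\; (9.877,\ 10.223)
\end{align*}
```

This agrees with the reported interval (9.877225, 10.222775) up to rounding.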