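Methods of Multivariate Statistical Analysis
Chapter 4: Multivariate Analysis of Variance

Outline
  What is multivariate analysis of variance?
  Applications of multivariate analysis of variance in medicine

Classification of analysis of variance

1) By the number of variables:
   Response variables: a single response (y), or multiple responses (y1, y2, ..., yk).
   Effect factors: one factor (A), two factors (A, B), or several factors (A, B, C). Models with two or more factors may exclude or include interaction effects.
2) By the randomness of the effect factors:
   Fixed model: the levels of the effect factors are chosen deliberately.
   Random model: the levels of the effect factors are a random sample from a larger set of possible levels.
   Mixed model: the model contains both types of factor.

What is multivariate analysis of variance (MANOVA)?

MANOVA analyzes how one or more effect factors influence a set of response variables.

   Response variables                                          Effect factors
   height (y1), weight (y2), chest circumference (y3)     =    parental SES
   diastolic pressure (y1), systolic pressure (y2)        =    job position + lifestyle

Applications of MANOVA in medicine: worked examples

1. MANOVA for single-group design data
2. MANOVA for paired design data
3. MANOVA for grouped (completely randomized) design data
4. MANOVA with several factors
5. MANOVA for repeated-measures design data
6. MANOVA with covariates

[Example 4-1] MANOVA for single-group design data
To study the growth and development of children in one region in different periods, height (x1), weight (x2) and chest circumference (x3) were measured on 20 eight-year-old boys; the data are listed in Table 4-6. A large survey in the same region 10 years earlier gave mean height, weight and chest circumference of 121.57 cm, 21.54 kg and 57.98 cm. Question: do the results of the present survey agree with those of 10 years ago?

Table 4-6. Survey data on child growth and development

[SAS program]
    data eg4_1;
      input id x1 x2 x3;
      y1 = x1 - 121.57;   /* height minus the mean of 10 years ago */
      y2 = x2 - 21.54;    /* weight minus the mean of 10 years ago */
      y3 = x3 - 57.98;    /* chest circumference minus the mean of 10 years ago */
    cards;
    1 141.2 31.8 63.6
    ...
    20 121.4 19.1 56.5
    ;
    run;
    proc means;
      var y1-y3;
    run;
    proc glm;
      model y1 y2 y3 = / ss3 nouni;
      manova h=intercept / printe printh;
    run;

[SAS output]
    The MEANS Procedure
    Variable    N        Mean      Std Dev      Minimum      Maximum
    y1         20    7.170000    4.7157519    -0.170000     19.63000
    y2         20    2.525000    3.1504845    -2.740000     10.26000
    y3         20    2.365000    3.8276659    -6.780000      7.82000

    The GLM Procedure
    Number of observations  20
    Multivariate Analysis of Variance
    MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
    Statistic                      Value    F Value   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.20656246      21.77        3       17   <.0001
    Pillai's Trace            0.79343754      21.77        3       17   <.0001
    Hotelling-Lawley Trace    3.84115073      21.77        3       17   <.0001
    Roy's Greatest Root       3.84115073      21.77        3       17   <.0001

Conclusion: because P < 0.0001, the growth and development of 8-year-old boys in this region, as represented jointly by height, weight and chest circumference, differs very significantly from the survey of 10 years ago. The mean differences are all positive (7.17 cm, 2.53 kg and 2.37 cm), so height, weight and chest circumference have all increased.

[Example 4-2] MANOVA for paired design data
Nine breast cancer patients received high-dose chemotherapy. Table 4-7 lists blood urea nitrogen, BUN (mg%), and serum creatinine, Cr (mg%), measured before and after chemotherapy. Question: does the chemotherapy affect the patients' renal function?

Table 4-7. BUN and Cr of breast cancer patients before and after chemotherapy

[SAS program]
    data eg4_2;
      input id x0 x1 y0 y1;
      d1 = x1 - x0;   /* change in BUN (after minus before) */
      d2 = y1 - y0;   /* change in Cr (after minus before) */
    cards;
    1 11.7 10.6 1.3 0.8
    ...
    9 14.6 13.8 0.9 0.8
    ;
    run;
    proc means;
      var d1 d2;
    run;
    proc glm;
      model d1 d2 = / ss3 nouni;
      manova h=intercept;
    run;

[Main SAS output]
    The MEANS Procedure
    Variable    N          Mean      Std Dev       Minimum      Maximum
    d1          9    -0.1666667    1.9924859    -3.2000000    3.6000000
    d2          9    -0.1666667    0.2598076    -0.6000000    0.3000000

    The GLM Procedure
    MANOVA Test Criteria and Exact F Statistics for the Hypothesis of No Overall Intercept Effect
    Statistic                      Value    F Value   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.61026828       2.24        2        7   0.1776
    Pillai's Trace            0.38973172       2.24        2        7   0.1776
    Hotelling-Lawley Trace    0.63862358       2.24        2        7   0.1776
    Roy's Greatest Root       0.63862358       2.24        2        7   0.1776

Conclusion: P = 0.1776 > 0.05, so these data give no evidence that the chemotherapy changed BUN and Cr jointly.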
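The two intercept-only listings (Examples 4-1 and 4-2) can be cross-checked by hand. When the hypothesis has a single degree of freedom, as with MANOVA H=INTERCEPT, there is only one nonzero root, so the other three criteria are functions of Wilks' Lambda and the exact F is (1 - Lambda)/Lambda x (n - p)/p on (p, n - p) degrees of freedom, with n subjects and p response variables. The following is a minimal sketch in Python rather than SAS; the function name is illustrative and not part of any library, and the only inputs are the Lambda values printed in the listings.

```python
def one_sample_manova_from_wilks(wilks_lambda, n, p):
    """Derive the other criteria and the exact F from Wilks' Lambda.

    Assumes a one-degree-of-freedom hypothesis (for example the
    intercept test of a one-sample or paired-difference MANOVA).
    n = number of subjects, p = number of response variables.
    """
    pillai = 1.0 - wilks_lambda
    hotelling_lawley = (1.0 - wilks_lambda) / wilks_lambda
    roy = hotelling_lawley              # only one nonzero root
    num_df, den_df = p, n - p
    f_value = hotelling_lawley * den_df / num_df
    return {
        "pillai": pillai,
        "hotelling_lawley": hotelling_lawley,
        "roy": roy,
        "F": f_value,
        "num_df": num_df,
        "den_df": den_df,
    }


# Example 4-1: Lambda = 0.20656246, n = 20 boys, p = 3 responses
ex41 = one_sample_manova_from_wilks(0.20656246, n=20, p=3)
print(round(ex41["F"], 2), ex41["num_df"], ex41["den_df"])   # 21.77 3 17

# Example 4-2: Lambda = 0.61026828, n = 9 patients, p = 2 differences
ex42 = one_sample_manova_from_wilks(0.61026828, n=9, p=2)
print(round(ex42["F"], 2), ex42["num_df"], ex42["den_df"])   # 2.24 2 7
```

The P values themselves need the F distribution function, which the sketch deliberately leaves to SAS or a statistics library.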
[Example 4-3] MANOVA for grouped design data
To study the treatment of a disease, two indices were observed on 24 patients after they took one of three drugs; each drug was given to 4 men and 4 women. The data are listed in Table 4-8. Compare the effects of the drugs on the two indices.

Table 4-8. Observations after treatment with three different drugs

[SAS program]
    data eg4_3;
      input sex $ drug $ @;    /* trailing @ holds the line: each data line carries the 4 patients of one sex-by-drug cell */
      input y1 y2 @; output;
      input y1 y2 @; output;
      input y1 y2 @; output;
      input y1 y2;   output;
    cards;
    M A 5 6 5 4 9 9 7 6
    ...
    F C 14 13 12 12 12 10 8 7
    ;
    run;
    proc glm manova;
      classes drug;
      model y1 y2 = drug / nouni;
      contrast 'Drug A vs B' drug 1 -1 0;
      contrast 'Drug A vs C' drug 1 0 -1;
      contrast 'Drug B vs C' drug 0 1 -1;
      manova h=drug;
      means drug;
    run;

[Selected SAS output]
    General Linear Models Procedure
    Class Level Information
    Class    Levels    Values
    SEX           2    Female Male
    DRUG          3    A B C
    Number of observations in data set = 24

    Multivariate Analysis of Variance
    Manova Test Criteria and F Approximations for the Hypothesis of no Overall DRUG Effect
    Statistic                      Value          F   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.21763115    11.4358        4       40   0.0001
    Pillai's Trace            0.88366412     8.3115        4       42   0.0001
    Hotelling-Lawley Trace    3.12948583    14.8651        4       38   0.0001
    Roy's Greatest Root       2.97292461    31.2157        2       21   0.0001

    Manova Test Criteria and Exact F Statistics for the Hypothesis of no Overall Drug A vs B Effect
    Statistic                      Value          F   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.86446183     1.5679        2       20   0.2331
    Pillai's Trace            0.13553817     1.5679        2       20   0.2331
    Hotelling-Lawley Trace    0.15678908     1.5679        2       20   0.2331
    Roy's Greatest Root       0.15678908     1.5679        2       20   0.2331

    Manova Test Criteria and Exact F Statistics for the Hypothesis of no Overall Drug A vs C Effect
    Statistic                      Value          F   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.30389066    22.9066        2       20   0.0001
    Pillai's Trace            0.69610934    22.9066        2       20   0.0001
    Hotelling-Lawley Trace    2.29065729    22.9066        2       20   0.0001
    Roy's Greatest Root       2.29065729    22.9066        2       20   0.0001

    Manova Test Criteria and Exact F Statistics for the Hypothesis of no Overall Drug B vs C Effect
    Statistic                      Value          F   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.30799724    22.4678        2       20   0.0001
    Pillai's Trace            0.69200276    22.4678        2       20   0.0001
    Hotelling-Lawley Trace    2.24678238    22.4678        2       20   0.0001
    Roy's Greatest Root       2.24678238    22.4678        2       20   0.0001

    Level of          ------------Y1------------    ------------Y2------------
    DRUG      N            Mean            SD            Mean            SD
    A         8       5.6250000    1.84681192       5.6250000    1.76776695
    B         8       6.1250000    1.55264751       7.1250000    2.29518129
    C         8      13.2500000    2.96407056      11.3750000    2.38671921

Reading the output: the overall drug effect is significant (P = 0.0001). Drugs A and B do not differ significantly (P = 0.2331), whereas drug C differs from both A and B (P = 0.0001), with higher means on both indices.
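The F values that the listing for Example 4-3 attaches to Wilks' Lambda can be reproduced from Lambda and the degrees of freedom alone, using Rao's F approximation (exact when the smaller of p and q is at most 2). The sketch below is in Python rather than SAS; the function name and argument names are illustrative. It assumes p = 2 response variables and an error degrees of freedom of 24 - 3 = 21, which agrees with the denominator df shown for Roy's greatest root.

```python
import math


def rao_f_from_wilks(wilks_lambda, p, q, error_df):
    """Rao's F approximation for Wilks' Lambda.

    p        = number of response variables
    q        = degrees of freedom of the hypothesis (levels - 1, or 1 for a contrast)
    error_df = error degrees of freedom of the model
    Returns (F, numerator df, denominator df).
    """
    denom = p * p + q * q - 5
    t = math.sqrt((p * p * q * q - 4) / denom) if denom > 0 else 1.0
    r = error_df - (p - q + 1) / 2
    u = (p * q - 2) / 4
    num_df = p * q
    den_df = r * t - 2 * u
    root = wilks_lambda ** (1.0 / t)
    f_value = (1.0 - root) / root * den_df / num_df
    return f_value, num_df, den_df


# Overall DRUG effect: q = 2 (three drugs), listing shows F = 11.4358 on (4, 40) df
print(rao_f_from_wilks(0.21763115, p=2, q=2, error_df=21))

# Contrast 'Drug A vs B': q = 1, listing shows F = 1.5679 on (2, 20) df
print(rao_f_from_wilks(0.86446183, p=2, q=1, error_df=21))
```

The same function applies to the other Wilks' Lambda rows in this chapter once p, q and the error degrees of freedom of the model in question are supplied.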
[Example 4-4] MANOVA for factorial design data
To study the treatment of a disease, two indices were observed on 24 patients after they took one of three drugs; each drug was given to 4 men and 4 women. The data are listed in Table 4-8. Analyze the effects of sex and drug on the two indices.

Table 4-8. Observations after treatment with three different drugs

[SAS program]
    proc glm data=eg4_3 manova;
      classes sex drug;
      model y1 y2 = sex drug sex*drug / nouni;
      contrast 'Drug A vs B' drug 1 -1 0;
      contrast 'Drug A vs C' drug 1 0 -1;
      contrast 'Drug B vs C' drug 0 1 -1;
      contrast 'Drug A vs B/sex=m' drug 1 -1 0 sex*drug 1 -1 0 0 0 0;
      contrast 'Drug A vs B/sex=f' drug 1 -1 0 sex*drug 0 0 0 1 -1 0;
      contrast 'Drug A vs C/sex=m' drug 1 0 -1 sex*drug 1 0 -1 0 0 0;
      contrast 'Drug A vs C/sex=f' drug 1 0 -1 sex*drug 0 0 0 1 0 -1;
      contrast 'Drug B vs C/sex=m' drug 0 1 -1 sex*drug 0 1 -1 0 0 0;
      contrast 'Drug B vs C/sex=f' drug 0 1 -1 sex*drug 0 0 0 0 1 -1;
      manova h=sex drug sex*drug;
      means sex drug;
    run;

[Main SAS output]
    General Linear Models Procedure
    Multivariate Analysis of Variance
    Manova Test Criteria and Exact F Statistics for the Hypothesis of no Overall SEX Effect
    Statistic                      Value          F   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.60255986     5.6065        2       17   0.0135
    Pillai's Trace            0.39744014     5.6065        2       17   0.0135
    Hotelling-Lawley Trace    0.65958615     5.6065        2       17   0.0135
    Roy's Greatest Root       0.65958615     5.6065        2       17   0.0135

    Manova Test Criteria and F Approximations for the Hypothesis of no Overall DRUG Effect
    Statistic                      Value          F   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.13856520    14.3345        4       34   0.0001
    Pillai's Trace            0.98043004     8.6545        4       36   0.0001
    Hotelling-Lawley Trace    5.35805223    21.4322        4       32   0.0001
    Roy's Greatest Root       5.19267163    46.7340        2       18   0.0001

[Example 4-5] MANOVA for repeated-measures design data
To compare two treatment procedures (thoracotomy, grp=1, and thoracoscopy, grp=2), 40 patients were randomly divided into two groups, and T cells were measured before the operation, 2 days after it and 7 days after it. The results are listed in Table 4-9.

Table 4-9. T-cell measurements of thoracotomy and thoracoscopy patients

[SAS program]
    data eg4_5;
      do id = 1 to 22;
        do grp = 1 to 2;
          input d1 d2 d3;   /* before, 2 days after, 7 days after the operation */
          output;
        end;
      end;
    cards;
    74 68 71
    46 61 58
    ...
    83 87 83
    ;
    run;
    proc glm;
      class grp;
      model d1 d2 d3 = grp / nouni ss3;
      repeated time 3 contrast(1) / printe summary;
      lsmeans grp / stderr;
    run;

[SAS output]
    General Linear Models Procedure
    Class Level Information
    Class    Levels    Values
    GRP           2    1 2
    Number of observations in data set = 44
    NOTE: Observations with missing values will not be included in this analysis.
          Thus, only 41 observations can be used in this analysis.

    Test for Sphericity: Mauchly's Criterion = 0.6611213
    Chisquare Approximation = 15.725084 with 2 df    Prob > Chisquare = 0.0004
    Applied to Orthogonal Components:
    Test for Sphericity: Mauchly's Criterion = 0.9639864
    Chisquare Approximation = 1.3937694 with 2 df    Prob > Chisquare = 0.4981

    Manova Test Criteria and Exact F Statistics for the Hypothesis of no time Effect
    Statistic                      Value    F Value   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.84063314       3.60        2       38   0.0369
    Pillai's Trace            0.15936686       3.60        2       38   0.0369
    Hotelling-Lawley Trace    0.18957956       3.60        2       38   0.0369
    Roy's Greatest Root       0.18957956       3.60        2       38   0.0369

    Manova Test Criteria and Exact F Statistics for the Hypothesis of no time*grp Effect
    Statistic                      Value    F Value   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.96326566       0.72        2       38   0.4911
    Pillai's Trace            0.03673434       0.72        2       38   0.4911
    Hotelling-Lawley Trace    0.03813522       0.72        2       38   0.4911
    Roy's Greatest Root       0.03813522       0.72        2       38   0.4911

    Repeated Measures Analysis of Variance
    Tests of Hypotheses for Between Subjects Effects
    Source    DF      Type III SS     Mean Square    F Value   Pr > F
    GRP        1      22.93919944     22.93919944       0.09   0.7599
    Error     39    9447.46730463    242.24275140

    Repeated Measures Analysis of Variance
    Univariate Tests of Hypotheses for Within Subject Effects
                                                                            Adj Pr > F
    Source         DF      Type III SS    Mean Square   F Value   Pr > F     G-G      H-F
    TIME            2     169.29178823    84.64589411      3.07   0.0519   0.0538   0.0519
    TIME*GRP        2      46.49504026    23.24752013      0.84   0.4337   0.4303   0.4337
    Error(TIME)    78    2147.53748006    27.53253180
    Greenhouse-Geisser Epsilon = 0.9652
    Huynh-Feldt Epsilon = 1.0406

    Repeated Measures Analysis of Variance
    Analysis of Variance of Contrast Variables
    TIME.N represents the contrast between the nth level of TIME and the 1st
    Contrast Variable: TIME.2
    Source    DF      Type III SS     Mean Square    F Value   Pr > F
    MEAN       1      29.24868713     29.24868713       0.45   0.5080
    GRP        1      82.90722371     82.90722371       1.27   0.2675
    Error     39    2554.99521531     65.51269783
    Contrast Variable: TIME.3
    Source    DF      Type III SS     Mean Square    F Value   Pr > F
    MEAN       1     321.68887852    321.68887852       6.48   0.0150
    GRP        1       3.24985413      3.24985413       0.07   0.7994
    Error     39    1936.55502392     49.65525702

    Least Squares Means (only the rows that survive in the source are shown)
    grp    d1 LSMEAN     Standard Error    Pr > |t|
    1      71.0000000         2.3785943    ...
    grp    d2 LSMEAN     Standard Error    Pr > |t|
    1      70.4210526         2.4944288    ...
    grp    d3 LSMEAN     Standard Error    Pr > |t|
    1      73.5263158         1.9411064    <.0001
    2      73.8181818         1.8039098    <.0001

Conclusions:
1) The T-cell counts differ significantly across time points (P = 0.0369). Before the operation versus 2 days after it the difference is not significant (P = 0.5080); before the operation versus 7 days after it the difference is significant (P = 0.0150).
2) The two treatments do not differ significantly (between-subjects test of GRP, P = 0.7599).
3) The time-by-group interaction is not significant (P = 0.4911), that is, the difference between the two treatment groups is the same at every time point.
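The univariate within-subject table of Example 4-5 can be checked with simple arithmetic: each F value is the ratio of the effect mean square to the Error(TIME) mean square, and the G-G and H-F adjustments multiply both degrees of freedom by the corresponding epsilon before the P value is looked up. The sketch below is in Python rather than SAS, and the function name is illustrative. It assumes, as SAS documents for its probability calculations, that an epsilon estimate above 1 is treated as 1; this is consistent with the H-F column of the listing being identical to the unadjusted column.

```python
def within_subject_f(ss_effect, df_effect, ss_error, df_error, epsilon=1.0):
    """F ratio of a within-subject effect and its epsilon-adjusted df.

    epsilon is the Greenhouse-Geisser or Huynh-Feldt estimate;
    values above 1 are truncated to 1 (no adjustment).
    Returns (F, adjusted numerator df, adjusted denominator df).
    """
    eps = min(epsilon, 1.0)
    f_value = (ss_effect / df_effect) / (ss_error / df_error)
    return f_value, eps * df_effect, eps * df_error


# TIME effect of Example 4-5 with the Greenhouse-Geisser epsilon
f, df1, df2 = within_subject_f(169.29178823, 2, 2147.53748006, 78, epsilon=0.9652)
print(round(f, 2), round(df1, 4), round(df2, 4))   # 3.07 1.9304 75.2856

# Same effect with the Huynh-Feldt epsilon of 1.0406: the df stay at 2 and 78
print(within_subject_f(169.29178823, 2, 2147.53748006, 78, epsilon=1.0406)[1:])
```

Because the G-G epsilon is close to 1 here, the adjusted P value (0.0538) barely differs from the unadjusted one (0.0519).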
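1. Repeated-measures design

A repeated-measures design is an experimental design in which the same outcome is measured on the same subject at several time points, in order to analyze how that outcome changes over time.

Characteristics:
1) several observations are obtained from each subject;
2) the observations of the response variable at different time points are correlated.
Usage: take care to identify how many factors the design has and on how many of them the measurements are repeated.

2. Assumptions of analysis of variance for repeated-measures data, including the requirement on the covariance matrix

1) The sample is random.
2) Observations at the same level of the treatment factor are independent.
3) The measurements at every level come from a normal population.
4) The covariance matrix is spherical (sphericity).
If sphericity does not hold, the F value of the analysis of variance is biased, and too many null hypotheses that are in fact true are rejected.

3. The covariance matrix

A variance measures the variability of the measurements at one time point, whereas a covariance measures how the measurements at two different time points vary together. If the value at one time point does not influence the values at the other time points, the covariance is 0; otherwise it is not 0. The matrix formed by the variances and covariances is called the covariance matrix. With four measurement points it is a 4 x 4 matrix whose main diagonal holds the variances and whose remaining elements are the covariances.

Sphericity means that, for the orthogonal contrast variables formed from the repeated measurements, the diagonal elements (variances) are equal and the off-diagonal elements (covariances) are zero; this is why the SAS listing reports the test "Applied to Orthogonal Components". Mauchly's test is used to check sphericity. If the P value of Mauchly's test is larger than the significance level chosen by the investigator, sphericity is taken to hold. Otherwise the numerator and denominator degrees of freedom of the F statistics that involve time must be adjusted, in order to reduce the probability of a type I error.

4. Sphericity test of the covariance matrix

Sphericity test:
P > 0.05: use univariate analysis of variance, with time as an effect factor.
P < 0.05: use multivariate analysis of variance, with the observations at the individual time points as the response variables; this accounts for the correlation among the repeated observations.

Example: 20 patients received one of two treatments (A = 1, 2). After treatment the forehead temperature was measured at 30-minute intervals (T = 1, 2, 3, 4). The measurements are listed in the table below. Analyze:
(1) Do the temperatures of the two groups of patients differ significantly?
(2) The highest temperature was measured at time T2; do the temperatures at the other time points differ significantly from that at T2?

Analysis:
1) the temperature of each subject was observed 4 times;
2) the observations of the response variable (temperature) at the 4 time points are correlated;
3) this is an analysis of variance for a two-factor design with repeated measurements on one factor (time).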
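The chi-square approximation that SAS prints next to Mauchly's criterion W in the sphericity test of section 4 can be reproduced from W itself. The sketch below is in Python rather than SAS, and the function name is illustrative. It assumes the usual form chi-square = -(v - (2p^2 + p + 2)/(6p)) ln W with p(p + 1)/2 - 1 degrees of freedom, where p is the number of time points minus 1 and v is the error degrees of freedom; this form agrees with the "Applied to Orthogonal Components" values printed in this chapter's listings (Example 4-5 and the forehead-temperature example).

```python
import math


def mauchly_chi_square(w, n_levels, error_df):
    """Chi-square approximation for Mauchly's sphericity criterion W.

    n_levels = number of repeated measurements (time points)
    error_df = error degrees of freedom (subjects minus groups)
    Returns (chi-square, degrees of freedom).
    """
    p = n_levels - 1
    correction = (2 * p * p + p + 2) / (6 * p)
    chi2 = -(error_df - correction) * math.log(w)
    df = p * (p + 1) // 2 - 1
    return chi2, df


# Example 4-5: 3 time points, error df 39; the listing shows 1.3937694 with 2 df
print(mauchly_chi_square(0.9639864, n_levels=3, error_df=39))

# Forehead-temperature example: 4 time points, error df 18; the listing shows 17.631505 with 5 df
print(mauchly_chi_square(0.3484099, n_levels=4, error_df=18))
```

The P value then comes from the chi-square distribution with the returned degrees of freedom; comparing it with 0.05 gives the choice between the univariate and the multivariate analysis described above.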
[SAS program]
    data repeat;
      do a = 1 to 2;
        do id = 1 to 10;
          input t1-t4;   /* forehead temperature at the 4 time points */
          output;
        end;
      end;
    cards;
    30.9 31.7 30.9 30.9
    ...
    32.3 33.5 32.6 32.7
    ;
    run;
    proc glm;
      class a;
      model t1 t2 t3 t4 = a / nouni;
      repeated time 4 contrast(2) / printe summary;
      lsmeans a / pdiff;
    run;

[SAS output]
    Partial Correlation Coefficients (second line of each row: P value), DF = 18
               T1          T2          T3          T4
    T1   1.000000    0.597647    0.646996    0.740241
           0.0001      0.0069      0.0028      0.0003
    T2   0.597647    1.000000    0.739524    0.704796
           0.0069      0.0001      0.0003      0.0008
    T3   0.646996    0.739524    1.000000    0.925437
           0.0028      0.0003      0.0001      0.0001
    T4   0.740241    0.704796    0.925437    1.000000
           0.0003      0.0008      0.0001      0.0001

    Test for Sphericity: Mauchly's Criterion = 0.0858503
    Chisquare Approximation = 41.055558 with 5 df    Prob > Chisquare = 0.0000
    Applied to Orthogonal Components:
    Test for Sphericity: Mauchly's Criterion = 0.3484099
    Chisquare Approximation = 17.631505 with 5 df    Prob > Chisquare = 0.0034

The correlation analysis shows that the repeated measurements are highly correlated. The sphericity test (P = 0.0034) shows that these data should be analyzed with multivariate analysis of variance.

    Manova Test Criteria and Exact F Statistics for the Hypothesis of no TIME Effect
    H = Type III SS&CP Matrix for TIME    E = Error SS&CP Matrix
    S=1  M=0.5  N=7
    Statistic                    Value         F   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.102088    46.909        3       16   0.0001
    Pillai's Trace            0.897911    46.909        3       16   0.0001
    Hotelling-Lawley Trace    8.795432    46.909        3       16   0.0001
    Roy's Greatest Root       8.795432    46.909        3       16   0.0001

The multivariate analysis of variance shows that time has a significant effect on temperature.

    Manova Test Criteria and Exact F Statistics for the Hypothesis of no TIME*A Effect
    H = Type III SS&CP Matrix for TIME*A    E = Error SS&CP Matrix
    S=1  M=0.5  N=7
    Statistic                      Value          F   Num DF   Den DF   Pr > F
    Wilks' Lambda             0.90270121    0.57486        3       16   0.6398
    Pillai's Trace            0.09729879    0.57486        3       16   0.6398
    Hotelling-Lawley Trace    0.10778626    0.57486        3       16   0.6398
    Roy's Greatest Root       0.10778626    0.57486        3       16   0.6398

The multivariate analysis of variance shows that the interaction between time and treatment has no significant effect on temperature.

    Repeated Measures Analysis of Variance
    Univariate Tests of Hypotheses for Within Subject Effects
                                                                       Adj Pr > F
    Source      DF    Type III SS    Mean Square   F Value   Pr > F     G-G      H-F
    TIME         3    20.57537500     6.85845833     77.23   0.0001   0.0001   0.0001
    TIME*A       3     0.15637500     0.05212500      0.59   0.6262   0.5746   0.6046

The repeated-measures analysis of variance, with the G-G and H-F adjusted P values, leads to the same result: time has a significant effect on temperature, and the interaction between time and treatment is not significant.

    Analysis of Variance of Contrast Variables
    TIME.N represents the contrast between the nth level of TIME and the 2nd
    Contrast Variable: TIME.1
    Source    DF    Type III SS    F Value   Pr > F
    MEAN       1    31.25000000     107.92   0.0001
    A          1     0.09800000       0.34   0.5679
    Error     18     5.21200000
    Contrast Variable: TIME.3
    Source    DF    Type III SS    F Value   Pr > F
    MEAN       1    25.08800000     138.78   0.0001
    A          1     0.01800000       0.10   0.7560
    Error     18     3.25400000
    Contrast Variable: TIME.4
    Source    DF    Type III SS    F Value   Pr > F
    MEAN       1    25.31250000     123.91   0.0001
    A          1     0.04050000       0.20   0.6614
    Error     18     3.67700000

The temperature at time point 2 differs significantly from the temperatures at time points 1, 3 and 4.

    Repeated Measures Analysis of Variance
    Tests of Hypotheses for Between Subjects Effects
    Source    DF    Type III SS    F Value   Pr > F
    A          1    11.93512500      12.51   0.0024
    Error     18    17.17125000

The between-subjects test of the repeated-measures analysis shows that the treatment has a significant effect on temperature.

    Least Squares Means
    Pr > |T| tests H0: LSMEAN1 = LSMEAN2
    A     T1 LSMEAN    Pr > |T|     T2 LSMEAN    Pr > |T|     T3 LSMEAN    Pr > |T|     T4 LSMEAN    Pr > |T|
    1    31.3300000      0.0220    32.5100000      0.0137    31.3600000      0.0015    31.3400000      0.0014
    2    31.9600000                33.2800000                32.1900000                32.2000000

At every time point the temperatures of the two treatment groups differ significantly, and the P values become smaller as time passes (0.0220, 0.0137, 0.0015, 0.0014).

Notes

Medical research often requires different types of study design for different purposes, and a key question is whether the data collected under each design can be matched with an appropriate method of statistical analysis. If this is handled badly, the conclusions may be unreliable or without practical meaning, and resources are wasted. The choice of the analysis of variance method for experimental or trial data therefore deserves careful thought; it should draw on subject-matter knowledge and practical experience and should not be applied mechanically.

Summary

What is analysis of variance, and what is its basic principle?
What does analysis of variance assume about the data?
How is analysis of variance classified?
One-factor analysis of variance for completely randomized designs
Two-factor analysis of variance for randomized block designs
Multi-factor analysis of variance for factorial designs
Three-factor analysis of variance for Latin square designs
Nested designs, split-plot designs and repeated-measures designs

Thanks!