Structural Equation Modeling, Lecture 2 (35 slides)
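1. Methodological foundations of SEM
[Diagram: regression analysis, CFA model, path analysis, EFA model]

2. Contents of this lecture
- Basic methods of path analysis
- Limitations of path analysis
- Basic methods of confirmatory factor analysis
- Recommendations for confirmatory factor analysis

3. Classical regression analysis
- Assumes that the independent variables are linearly independent
- Can be used to test direct relationships between variables
[Diagram: X, Z, Y]

4. What should be done for a relationship like the following?
[Diagram: X, Z, Y]

5. Path analysis
Path analysis is a straightforward extension of multiple regression. Its aim is to provide estimates of the magnitude and significance of hypothesized causal connections between sets of variables.
Path diagram
- Input path diagram: drawn beforehand to help plan the analysis; it represents the causal connections that are predicted by the hypothesis.
- Output path diagram: represents the results of a statistical analysis and shows what was actually found.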
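Because path analysis extends multiple regression, the coefficients of a simple recursive model can be sketched as one least-squares regression per endogenous variable. The snippet below is a minimal illustration on simulated data, not the lecture's own procedure: the model (X affects Z, and both affect Y), the true coefficient values, the sample size, and the helper `ols` are all assumptions made for the example, and the coefficients are unstandardized.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical recursive model (assumed for illustration only):
#   Z = 0.5*X + e1
#   Y = 0.3*X + 0.4*Z + e2
X = rng.normal(size=n)
Z = 0.5 * X + rng.normal(size=n)
Y = 0.3 * X + 0.4 * Z + rng.normal(size=n)


def ols(y, *predictors):
    """Return the OLS slopes of y on the predictors (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


# One regression per endogenous variable.
(p_zx,) = ols(Z, X)        # path X -> Z
p_yx, p_yz = ols(Y, X, Z)  # paths X -> Y and Z -> Y

direct = p_yx              # direct effect of X on Y
indirect = p_zx * p_yz     # effect of X on Y transmitted through Z
total = direct + indirect

print(f"X->Z: {p_zx:.3f}  X->Y: {p_yx:.3f}  Z->Y: {p_yz:.3f}")
print(f"direct: {direct:.3f}  indirect: {indirect:.3f}  total: {total:.3f}")
```

With a large simulated sample the estimates should land close to the assumed values 0.5, 0.3 and 0.4. The equation-by-equation approach only applies to recursive models with uncorrelated errors.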
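6. Path analysis example
[Input path diagram: attitude toward exercise, amount of exercise, food intake, amount of weight loss; hypothesized path signs +, +, +, -]
[Output path diagram: the same variables; estimated path coefficients 0.31, 0.30, 0.15, -0.44]
What if the direction of causation is unclear? Use a double-headed arrow or a curve, labeled with the correlation coefficient (0.20 in the diagram).

7. Basic types of path analysis: recursive models
- Variables in the path model are linked only by one-way causal relationships, with no direct or indirect feedback, and all error terms are uncorrelated.
- The path diagram contains no loops, and there are no double-headed (curved) arrows between error terms.
- Assumptions:
  - Causal relationships among variables are linear and additive.
  - The error term of each endogenous variable is uncorrelated with the exogenous variables and with the error terms of the other endogenous variables.
  - Causal relationships involve no feedback.
  - Variables are quantitative.
  - Variables are measured without error.

8. Basic types of path analysis: nonrecursive models
A model is nonrecursive if it meets at least one of the following conditions:
- Any two variables in the model have direct or indirect feedback between them.
- Some variable feeds back on itself (self-feedback).
- Error terms are correlated:
  - the error term of an outcome variable is correlated with one of its causes, or
  - the error terms of different variables are correlated.
- The path diagram contains loops, or there are double-headed (curved) arrows between error terms.

9. Covariance matrix of the path model
Replacing the model-implied covariance matrix $\Sigma$ with the sample covariance matrix $S$ gives a set of covariance equations from which the model parameters can be estimated.

10. Identification of the path model as a whole
- Let $t$ be the number of model parameters, $p$ the number of endogenous variables, and $q$ the number of exogenous variables.
- Number of distinct elements in $S$: $v = (p+q)(p+q+1)/2$
- t-rule (necessary condition): $t \le v$
- Null-B rule (sufficient condition): $B = 0$
- Recursive rule (sufficient condition): recursive models are identified.

11. Identification of a single equation in a nonrecursive path model
- Order condition (necessary): the number of endogenous and exogenous variables excluded from the i-th equation must be greater than or equal to $p-1$.
- Rank condition (sufficient): delete the columns of the matrix $G$ in which row $i$ has nonzero elements; the remaining matrix $G_i$ must satisfy $\mathrm{Rank}(G_i) = p-1$.
- If every equation is identified, the nonrecursive path model is identified.

12. Example: determining identifiability
[Path diagrams: one model with X1, X2, Y1, Y2 and one with X1, X2, X3, Y1, Y2, each with path coefficients and error terms; figures not reproduced]

13. Steps of path analysis
- Model specification
- Parameter estimation
  - Recursive models: OLS
  - Nonrecursive models: ML / LS / GLS
- Model testing and evaluation
- Effect decomposition
  - Causal effect: the influence that arises because variables are causally related; it splits into direct and indirect effects.
  - Spurious effect: the part of the correlation between two endogenous variables that is due to a common cause.
  - Unanalyzed effect: the part of the correlation between an exogenous variable and an endogenous variable that remains after the direct and indirect effects are removed.

14. Example: effect decomposition
[Path diagram: X1, X2, Y1, Y2 with path coefficients and error terms; figure not reproduced]

15. Blau and Duncan's (1967) study of social stratification
According to Marxist scholars of the time, the US is a highly stratified society. Status is determined by family background and transmitted through the school system.
Questions:
- How and to what degree do the circumstances of birth condition subsequent status?
- How does status attained (whether by ascription or achievement) at one stage of the life cycle affect the prospects for a subsequent stage?

16. Path model. Stratification, US, 1962
[Path diagram with coefficients labeled a to f; figure not reproduced]
- Duncan's prestige scale (0-96) takes income, education, and occupational prestige into account.
- Explain the meaning of the coefficients a to f.
- Are there any problems?

17. Assumptions of path analysis
[Diagram: X, Y, Z]
$Y = a + bX + \delta$
$Z = c + dX + eY + \epsilon$
- Causal mechanism
- Connection
- "You seem to be free to use your own x's and y's, rather than the ones generated by Nature, as inputs."
- Potential outcome

18. Selection vs intervention
Path analysis connects two very different ideas about conditional expectation:
- selecting the subjects with X = x
- intervening to set X = x
Regression analysis is usually carried out in order to draw causal inferences without running an actual experiment. Without an experiment, however, the assumptions behind the model are open to doubt, and the resulting inferences are drawn while ignoring that doubt. This is the paradox of causal inference from regression. Path models do not infer causation from association. Rather, they assume the causal relationships and, with additional statistical assumptions, use observational data to estimate the size of the causal effects.

19. A passage from Duncan (1984)
"Coupled with downright incompetence in statistics, paradoxically, we often find the syndrome that I have come to call statisticism: the notion that computing is synonymous with doing research, the naive faith that statistics is a complete or sufficient basis for scientific methodology, the superstition that statistical formulas exist for evaluating such things as the relative merits of different substantive theories or the importance of the causes of a dependent variable; and the delusion that decomposing the covariations of some arbitrary and haphazardly assembled collection of variables can somehow justify not only a causal model but also, praise the mark, a measurement model. There should be no point in deploring such caricatures of the scientific enterprise if there were a clearly identifiable sector of social science research wherein such fallacies were clearly recognized and emphatically out of bounds."

20. Factors
Factors are influences that are not directly measured but account for commonality among a set of measurements.
[Diagrams: indicators X1, X2, X3 with common factor F, without and with unique terms u1, u2, u3]

21. Exploratory factor analysis and confirmatory factor analysis
Exploratory factor analysis (EFA)
- A linear model linking the observable random vector to latent factors.
- Goal: find a small number of factors that explain the correlations among the observed variables.
- No prior knowledge of:
  - the number of common factors
  - the factor loadings
  - the relationships among factors
- Assumes uncorrelated error terms.
Confirmatory factor analysis (CFA)
- Hypotheses are stated in advance about the number of common factors and the factor loadings.
- Relaxed assumption: $\Phi$ and $\Theta_\delta$ are symmetric matrices.

22. The relationship between EFA and CFA
- Similarities:
  - The model is the same.
  - CFA resembles the simple structure of EFA.
- Differences:
  - Purpose: EFA explores (induction); CFA tests (deduction).
  - Assumptions differ.
  - Estimation methods differ: EFA usually uses spectral decomposition; CFA usually uses MLE, GLS, and similar methods.
[Diagrams: an EFA model and a CFA model with two factors and indicators X1 to X4; figures not reproduced]

23. Identifiability of CFA
- Number of parameters: $t = qn + n(n+1)/2 + q(q+1)/2$
  - elements of $\Lambda_x$: $qn$
  - elements of $\Phi$: $n(n+1)/2$
  - elements of $\Theta_\delta$: $q(q+1)/2$
- Number of equations: $v = q(q+1)/2$
- Constraints must be imposed on the parameters in $\Sigma(\theta)$ before the CFA model is identified:
  - Reference variable solution: set at least one $\lambda_{ij} = 1$ in each column of $\Lambda_x$.
  - Standardization solution: set the variances of the latent variables equal to 1.
  - Other: set some $\lambda_{ij} = 0$; set some $\lambda_{ij}$ equal to each other; set some parameters equal to given values.
- Rules of thumb
  - For the reference variable solution:
    - each column of $\Lambda_x$ has at least one $\lambda_{ij} = 1$
    - every other row has exactly one nonzero element
    - each factor has at least three indicators
    - $\Theta_\delta$ is diagonal
    - no assumption is made about $\Phi$
  - For the standardization solution:
    - the diagonal elements of $\Phi$ are 1
    - no element of $\Lambda_x$ is fixed to 1

24. Example: a three-factor confirmatory factor model
[Path diagram: three correlated factors with indicators X1 to X8 and error terms; some loadings fixed to 1; figure not reproduced]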
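The counting argument behind CFA identification can be sketched for a model of the size shown in the three-factor example (q = 8 indicators, n = 3 factors). This is only an illustration under stated assumptions: the function names are hypothetical, and the constrained counts assume simple structure (each indicator loads on exactly one factor) and a diagonal error covariance matrix, which the slides list as rules of thumb rather than as the specification of this particular model.

```python
def n_moments(q):
    """Number of distinct elements in the q x q sample covariance matrix S."""
    return q * (q + 1) // 2


def n_params_unconstrained(q, n):
    """t = qn + n(n+1)/2 + q(q+1)/2: every loading, Phi and Theta element is free."""
    return q * n + n * (n + 1) // 2 + q * (q + 1) // 2


def n_params_simple_structure(q, n, scaling):
    """Free parameters, assuming one loading per indicator and diagonal Theta."""
    if scaling == "reference":
        loadings = q - n                 # one loading per factor fixed to 1
        factor_cov = n * (n + 1) // 2    # all variances and covariances free
    elif scaling == "standardized":
        loadings = q                     # all loadings free
        factor_cov = n * (n - 1) // 2    # variances fixed to 1, covariances free
    else:
        raise ValueError("scaling must be 'reference' or 'standardized'")
    error_variances = q
    return loadings + factor_cov + error_variances


q, n = 8, 3
v = n_moments(q)
t_unconstrained = n_params_unconstrained(q, n)
df_reference = v - n_params_simple_structure(q, n, "reference")
df_standardized = v - n_params_simple_structure(q, n, "standardized")

print(f"v = {v}, unconstrained t = {t_unconstrained}")
print(f"df (reference variable) = {df_reference}")
print(f"df (standardization)    = {df_standardized}")
```

The unconstrained count exceeds the number of equations, which is why constraints are needed. Under the assumptions above, both scaling choices leave the same degrees of freedom, and that value agrees with the Df = 17 reported in the deck's CFA output for the PAP example.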
Under the standardization solution, how should the model be specified?

25. Steps of a CFA analysis
- Model specification
- Parameter estimation
  - Choose a fit function $F(S, \Sigma(\theta))$
  - Minimize the fit function
- Hypothesis testing of the model
- Evaluation of goodness of fit
- Model modification: preferably following the principle of parsimony, removing paths rather than adding them
- Model selection

26. CFA example: Performance Assessment Program (PAP)
- Research goal: assess the effect of the PAP
- Five dimensions:
  - Teachers' support for the PAP
  - Teachers' emphasis on outcome/change in instruction and assessment
  - Teachers' familiarity with PAP
  - PAP's impact on instruction/assessment
  - PAP's impact on professional development
- Eight indicators (4-point Likert scale)
- 265 teachers were surveyed

27. EFA output: four factors
    Call: factanal(factors = 4, covmat = cov)

    Uniquenesses:
       V1    V2    V3    V4    V5    V6    V7    V8
    0.537 0.138 0.492 0.411 0.378 0.388 0.616 0.135

    Loadings (columns Factor1 to Factor4; small loadings were blank in the
    original output, so in rows with fewer than four values the column of
    each value is not recoverable):
    [1,] 0.210 0.638
    [2,] 0.897 0.207
    [3,] 0.232 0.654 0.157
    [4,] 0.198 0.737
    [5,] 0.747 0.120 0.184 0.128
    [6,] 0.612 0.313 0.340 0.157
    [7,] 0.543 0.107 0.202 0.192
    [8,] 0.299 0.137 0.869

                   Factor1 Factor2 Factor3 Factor4
    SS loadings      1.460   1.348   1.190   0.909
    Proportion Var   0.182   0.168   0.149   0.114
    Cumulative Var   0.182   0.351   0.500   0.613

    The degrees of freedom for the model is 2 and the fit was 0.0034

28. EFA output: three factors
    Call: factanal(factors = 3, covmat = cov, rotation = "varimax")

    Uniquenesses:
       V1    V2    V3    V4    V5    V6    V7    V8
    0.005 0.613 0.634 0.005 0.492 0.378 0.600 0.669

    Loadings (columns Factor1 to Factor3; small loadings were blank in the
    original output, so in rows with fewer than three values the column of
    each value is not recoverable):
    [1,] 0.996
    [2,] 0.251 0.564
    [3,] 0.366 0.476
    [4,] 0.169 0.980
    [5,] 0.660 0.207 0.173
    [6,] 0.680 0.298 0.267
    [7,] 0.598 0.122 0.165
    [8,] 0.569

                   Factor1 Factor2 Factor3
    SS loadings      1.805   1.467   1.331
    Proportion Var   0.226   0.183   0.166
    Cumulative Var   0.226   0.409   0.575

    The degrees of freedom for the model is 7 and the fit was 0.0901

29. CFA output: three factors (pp. 23)
            Estimate  Std Error  z value   Pr(>|z|)
    lam21    1.30574    0.22995   5.6783  1.3607e-08  X2 <--- f1
    phi13    8.90354    1.99314   4.4671  7.9288e-06  f3 <--> f1
    phi23    9.52657    1.58466   6.0118  1.8352e-09  f3 <--> f2
    delta1  21.22005    3.53395   6.0046  1.9177e-09  X1 <--> X1
    delta2  12.76266    5.25386   2.4292  1.5132e-02  X2 <--> X2
    delta3   9.06085    1.77062   5.1173  3.0989e-07  X3 <--> X3
    delta4  12.33760    1.88532   6.5440  5.9882e-11  X4 <--> X4
    delta5  18.25313    2.09161   8.7268  0.0000e+00  X5 <--> X5
    delta6  12.03134    2.00074   6.0135  1.8161e-09  X6 <--> X6
    delta7  30.21710    3.09192   9.7729  0.0000e+00  X7 <--> X7
    delta8  47.34513    4.49991  10.5214  0.0000e+00  X8 <--> X8
    phi11   18.37994    4.19164   4.3849  1.1604e-05  f1 <--> f1
    phi22   12.33890    2.31616   5.3273  9.9677e-08  f2 <--> f2
    phi33   17.74676    3.02662   5.8636  4.5307e-09  f3 <--> f3

    Model Chisquare = 42.394   Df = 17   Pr(>Chisq) = 0.00058827
    Chisquare (null model) = 598.95   Df = 28
    Goodness-of-fit index = 0.96173
    Adjusted goodness-of-fit index = 0.91895
    RMSEA index = 0.07522   90% CI: (0.047104, 0.10396)
    Bentler-Bonnett NFI = 0.92922
    Tucker-Lewis NNFI = 0.92675
    Bentler CFI = 0.95552
    SRMR = 0.040583
    BIC = -52.462

    Normalized Residuals
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
    -2.36e+00 -3.43e-01  5.37e-05 -6.59e-02  2.33e-01  9.25e-01

30. CFA output: three factors
            Estimate  Std Error  z value   Pr(>|z|)
    lam11    4.28726   0.488333   8.7794  0.0000e+00  X1 <--- f1
    phi13    0.49298   0.068025   7.2471  4.2566e-13  f3 <--> f1
    phi23    0.64378   0.060265  10.6825  0.0000e+00  f3 <--> f2
    delta1  21.21941   3.532160   6.0075  1.8842e-09  X1 <--> X1
    delta2  12.76251   5.251298   2.4304  1.5084e-02  X2 <--> X2
    delta3   9.06100   1.770171   5.1187  3.0763e-07  X3 <--> X3
    delta4  12.33743   1.885110   6.5447  5.9625e-11  X4 <--> X4
    delta5  18.25358   2.091432   8.7278  0.0000e+00  X5 <--> X5
    delta6  12.03125   2.000661   6.0136  1.8140e-09  X6 <--> X6
    delta7  30.21706   3.091902   9.7730  0.0000e+00  X7 <--> X7
    delta8  47.34513   4.499884  10.5214  0.0000e+00  X8 <--> X8

    Model Chisquare = 42.394   Df = 17   Pr(>Chisq) = 0.00058827
    Chisquare (null model) = 598.95   Df = 28
    Goodness-of-fit index = 0.96173
    Adjusted goodness-of-fit index = 0.91895
    RMSEA index = 0.07522   90% CI: (0.047104, 0.10396)
    Bentler-Bonnett NFI = 0.92922
    Tucker-Lewis NNFI = 0.92675
    Bentler CFI = 0.95552
    SRMR = 0.040584
    BIC = -52.462

    Normalized Residuals
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
    -2.36e+00 -3.43e-01 -2.05e-06 -6.60e-02  2.33e-01  9.25e-01

31. Data requirements for CFA
- Sample size should take into account model complexity, estimation method, distributional characteristics, and measurement scale.
  - Considering model complexity alone: at least 4 observations per parameter to be estimated.
  - Usual recommendation: a sample of at least 200 for CFA.
  - For asymptotic properties to show: 400.
  - When modification indices are used to improve the model: 800.
- Distribution: joint normality
  - If the model is correct and the sample is large enough, MLE is fairly robust.
  - Under extreme non-normality, use asymptotic distribution free (ADF) methods or Satorra-Bentler robust statistics.
- Measurement scale: continuous
- Indicator type
  - Effect / reflective
  - Cause / formative

32. 7 Recommendations for CFA
- Users should aim for samples of at least 200 and, preferably, 400 cases. If more than minimal respecification of a hypothesized model is anticipated, then a sample of at least 800 cases is necessary.
- The distributional properties of indicators should be well understood and corrective measures taken (e.g., transformations, parceling, scaled statistics) when distributions depart markedly from normality.
- At least three and, preferably, four indicators of factors should be obtained.

33. 7 Recommendations for CFA (cont.)
- Simple structure should not be assumed in all models. With sufficient indicators per factor, cross-loadings are permissible and may be an important feature of a model.
- The multifaceted nature of fit evaluation should be acknowledged by consulting two or more indicators of fit that rely on different computational logic.
- Whenever possible, multiple nested models should be posited in order to rule out parsimonious or substantively interesting alternatives to the hypothesized model.

34. 7 Recommendations for CFA (cont.)
- Respecification is not to be eschewed, but it should be undertaken in a disciplined manner with due attention to the possibility of Type I errors. Substantially respecified models should be cross-validated in an independent sample.

35. Homework
- Read: Sewall Wright, "The theory of path coefficients: a reply to Niles's criticism", Genetics, 1923, 8: 239-255.
- Install the sem package in R; practice importing data and the basic operations of sem.
- For the PAP example, specify other model forms and run the CFA analysis.
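When comparing alternative models for the PAP example, it can help to recompute a few fit indices by hand from the reported chi-square statistics. The sketch below uses common textbook definitions of RMSEA, NFI, TLI (NNFI) and CFI. The function name is hypothetical, and the use of N - 1 in the RMSEA denominator is an assumption, since software packages differ on this convention. The inputs are the values reported in the deck's CFA output (chi-square 42.394 on 17 df, null-model chi-square 598.95 on 28 df, 265 teachers).

```python
import math


def fit_indices(chisq, df, chisq_null, df_null, n_obs):
    """Recompute common SEM fit indices from model and null-model chi-squares."""
    excess = max(chisq - df, 0.0)                 # noncentrality estimate, model
    excess_null = max(chisq_null - df_null, 0.0)  # noncentrality estimate, null model

    # Assumption: N - 1 in the denominator (some programs use N instead).
    rmsea = math.sqrt(excess / (df * (n_obs - 1)))
    nfi = 1.0 - chisq / chisq_null
    tli = (chisq_null / df_null - chisq / df) / (chisq_null / df_null - 1.0)
    denom = max(excess_null, excess)
    cfi = 1.0 if denom == 0 else 1.0 - excess / denom
    return {"RMSEA": rmsea, "NFI": nfi, "TLI": tli, "CFI": cfi}


indices = fit_indices(chisq=42.394, df=17, chisq_null=598.95, df_null=28, n_obs=265)
for name, value in indices.items():
    print(f"{name}: {value:.5f}")
```

Under these definitions the results should agree with the reported RMSEA, NFI, NNFI and CFI up to rounding. Swapping in the chi-square and degrees of freedom of an alternative model gives a quick check on its output.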