market research, correlationandregression.ppt

Uploaded by: 仙人指路1688 | Document ID: 2380915 | Upload date: 2023-02-16 | Format: PPT | Pages: 54 | Size: 590.50 KB

Chapter Eighteen

Figure 18.1 Relationship of Correlation and Regression to the Previous Chapters and the Marketing Research Process
Focus of This Chapter: Correlation; Regression
Relationship to Previous Chapters: Analytical Framework and Models (Chapter 2); Data Analysis Strategy (Chapter 15); General Procedure of Hypothesis Testing (Chapter 16); Hypothesis Testing Related to Differences (Chapter 17)
Relationship to Marketing Research Process: Problem Definition; Approach to Problem; Research Design; Field Work; Data Preparation and Analysis; Report Preparation and Presentation

Figure 18.2 Correlation and Regression: An Overview
Application to Contemporary Issues: Technology; Ethics; International
Be a DM! Be an MR! Experiential Learning
Opening Vignette; What Would You Do?
Product Moment Correlation; Regression Analysis; Bivariate Regression; Multiple Regression
Supporting exhibits: Figs 18.3-18.4; Tables 18.1-18.2; Table 18.3; Figs 18.5-18.7; Table 18.4

Product Moment Correlation

The product moment correlation, r, summarizes the strength of association between two metric (interval or ratio scaled) variables, say X and Y. It is an index used to determine whether a linear, or straight-line, relationship exists between X and Y. Because it was originally proposed by Karl Pearson, it is also known as the Pearson correlation coefficient. It is also referred to as simple correlation, bivariate correlation, or merely the correlation coefficient.

r varies between -1.0 and +1.0. The correlation coefficient between two variables will be the same regardless of their underlying units of measurement.

Table 18.1 Explaining Attitude Toward Sports Cars

Figure 18.3 Plot of Attitude with Duration

(Figure 18.3 axes: Duration of Car Ownership; Attitude)

Product Moment Correlation

$$\bar{X} = (10+12+12+4+12+6+8+2+18+9+17+2)/12 = 9.333$$

$$\bar{Y} = (6+9+8+3+10+4+5+2+11+9+10+2)/12 = 6.583$$

$$\begin{aligned}
\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}) &= (10-9.33)(6-6.58)+(12-9.33)(9-6.58)+(12-9.33)(8-6.58)+(4-9.33)(3-6.58) \\
&\quad +(12-9.33)(10-6.58)+(6-9.33)(4-6.58)+(8-9.33)(5-6.58)+(2-9.33)(2-6.58) \\
&\quad +(18-9.33)(11-6.58)+(9-9.33)(9-6.58)+(17-9.33)(10-6.58)+(2-9.33)(2-6.58) \\
&= -0.3886+6.4614+3.7914+19.0814+9.1314+8.5914+2.1014+33.5714+38.3214-0.7986+26.2314+33.5714 \\
&= 179.6668
\end{aligned}$$

$$\begin{aligned}
\sum_{i=1}^{n}(X_i-\bar{X})^2 &= (10-9.33)^2+(12-9.33)^2+(12-9.33)^2+(4-9.33)^2+(12-9.33)^2+(6-9.33)^2 \\
&\quad +(8-9.33)^2+(2-9.33)^2+(18-9.33)^2+(9-9.33)^2+(17-9.33)^2+(2-9.33)^2 \\
&= 0.4489+7.1289+7.1289+28.4089+7.1289+11.0889+1.7689+53.7289+75.1689+0.1089+58.8289+53.7289 \\
&= 304.6668
\end{aligned}$$

$$\begin{aligned}
\sum_{i=1}^{n}(Y_i-\bar{Y})^2 &= (6-6.58)^2+(9-6.58)^2+(8-6.58)^2+(3-6.58)^2+(10-6.58)^2+(4-6.58)^2 \\
&\quad +(5-6.58)^2+(2-6.58)^2+(11-6.58)^2+(9-6.58)^2+(10-6.58)^2+(2-6.58)^2 \\
&= 0.3364+5.8564+2.0164+12.8164+11.6964+6.6564+2.4964+20.9764+19.5364+5.8564+11.6964+20.9764 \\
&= 120.9168
\end{aligned}$$

Thus,

$$r = \frac{179.6668}{\sqrt{(304.6668)(120.9168)}} = 0.9361$$

Decomposition of the Total Variation

Calculation of r

When it is computed for a population rather than a sample, the product moment correlation is denoted by $\rho$, the Greek letter rho. The coefficient r is an estimator of $\rho$.

The statistical significance of the relationship between two variables measured by using r can be conveniently tested. The hypotheses are:

$$H_0: \rho = 0 \qquad H_1: \rho \neq 0$$

The test statistic is:

$$t = r\left[\frac{n-2}{1-r^2}\right]^{1/2}$$

which has a t distribution with n-2 degrees of freedom. For the correlation coefficient calculated based on the data given in Table 18.1,

$$t = 0.9361\left[\frac{12-2}{1-(0.9361)^2}\right]^{1/2} = 8.414$$

and the degrees of freedom = 12-2 = 10. From the t distribution table (Table 4 in the Statistical Appendix), the critical value of t for a two-tailed test and $\alpha = 0.05$ is 2.228. Hence, the null hypothesis of no relationship between X and Y is rejected.

Figure 18.4 A Nonlinear Relationship for Which r = 0
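The calculation of r and its t statistic above can be reproduced with a short Python sketch. The two lists are the duration and attitude values recovered from the sums written out in the worked example; the function names `pearson_r` and `t_for_r` are illustrative choices, not part of the slides.

```python
from math import sqrt

# Duration of car ownership (X) and attitude (Y), as they appear in the
# sums of the worked example for Table 18.1.
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]


def pearson_r(x, y):
    """Product moment correlation of two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


def t_for_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


r = pearson_r(duration, attitude)
t = t_for_r(r, len(duration))
print(f"r = {r:.4f}, t = {t:.3f}, df = {len(duration) - 2}")
```

Up to rounding, the result should agree with the values reported above (r = 0.9361, t = 8.414). Comparing t with the tabled critical value of 2.228 is left to the reader, since the sketch does not look up critical values.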

Regression Analysis

Regression analysis is used in the following ways:
- Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.
- Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship.
- Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.
- Predict the values of the dependent variable.
- Control for other independent variables when evaluating the contributions of a specific variable or set of variables.

Regression analysis is concerned with the nature and degree of association between variables and does not imply or assume any causality.

Statistics: Bivariate Regression Analysis

Bivariate regression model. The basic regression equation is $Y_i = \beta_0 + \beta_1 X_i + e_i$, where Y = dependent or criterion variable, X = independent or predictor variable, $\beta_0$ = intercept of the line, $\beta_1$ = slope of the line, and $e_i$ is the error term associated with the i-th observation.

Coefficient of determination. The strength of association is measured by the coefficient of determination, $r^2$. It varies between 0 and 1 and signifies the proportion of the total variation in Y that is accounted for by the variation in X.

Estimated or predicted value. The estimated or predicted value of $Y_i$ is $\hat{Y}_i = a + bX_i$, where $\hat{Y}_i$ is the predicted value of $Y_i$, and a and b are estimators of $\beta_0$ and $\beta_1$, respectively.

Regression coefficient. The estimated parameter b is usually referred to as the non-standardized regression coefficient.

Scattergram. A scatter diagram, or scattergram, is a plot of the values of two variables for all the cases or observations.

Standard error of estimate. This statistic, SEE, is the standard deviation of the actual Y values from the predicted $\hat{Y}$ values.

Standard error. The standard deviation of b, $SE_b$, is called the standard error.

Standardized regression coefficient. Also termed the beta coefficient or beta weight, this is the slope obtained by the regression of Y on X when the data are standardized.

Sum of squared errors. The distances of all the points from the regression line are squared and added together to arrive at the sum of squared errors, which is a measure of total error, $\sum e_i^2$.

t statistic. A t statistic with n-2 degrees

of freedom can be used to test the null hypothesis that no linear relationship exists between X and Y, or $H_0: \beta_1 = 0$, where $t = b / SE_b$.

Conducting Bivariate Regression Analysis: Plot the Scatter Diagram

A scatter diagram, or scattergram, is a plot of the values of two variables for all the cases or observations. The most commonly used technique for fitting a straight line to a scattergram is the least-squares procedure. In fitting the line, the least-squares procedure minimizes the sum of squared errors, $\sum e_i^2$.

Fig. 18.5 Conducting Bivariate Regression Analysis

Formulate the Bivariate Regression Model

In the bivariate regression model, the general form of a straight line is:

$$Y = \beta_0 + \beta_1 X$$

where
Y = dependent or criterion variable
X = independent or predictor variable
$\beta_0$ = intercept of the line
$\beta_1$ = slope of the line

The regression procedure adds an error term to account for the probabilistic or stochastic nature of the relationship:

$$Y_i = \beta_0 + \beta_1 X_i + e_i$$

where $e_i$ is the error term associated with the i-th observation.

Figure 18.6 Bivariate Regression (labels: $\beta_0 + \beta_1 X$; Y)

Estimate the Parameters

In most cases, $\beta_0$ and $\beta_1$ are unknown and are estimated from the sample observations using the equation

$$\hat{Y}_i = a + bX_i$$

where $\hat{Y}_i$ is the estimated or predicted value of $Y_i$, and a and b are estimators of $\beta_0$ and $\beta_1$, respectively. The slope, b, may be computed as:

$$b = \frac{\sum_{i=1}^{n} X_iY_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}$$

The intercept, a, may then be calculated using:

$$a = \bar{Y} - b\bar{X}$$

For the data in Table 18.1, the estimation of parameters may be illustrated as follows:

$$\sum_{i=1}^{12} X_iY_i = (10)(6)+(12)(9)+(12)(8)+(4)(3)+(12)(10)+(6)(4)+(8)(5)+(2)(2)+(18)(11)+(9)(9)+(17)(10)+(2)(2) = 917$$

$$\sum_{i=1}^{12} X_i^2 = 10^2+12^2+12^2+4^2+12^2+6^2+8^2+2^2+18^2+9^2+17^2+2^2 = 1350$$

It may be recalled from earlier calculations of the simple correlation that $\bar{X} = 9.333$ and $\bar{Y} = 6.583$. Given n = 12, b can be calculated as:

$$b = \frac{917 - (12)(9.333)(6.583)}{1350 - (12)(9.333)^2} = 0.5897$$

$$a = \bar{Y} - b\bar{X} = 6.583 - (0.5897)(9.333) = 1.0793$$

Estimate the Standardized Regression Coefficient

Standardization is the process by which the raw data are transformed into new variables that have a mean of 0 and a variance of 1 (Chapter 13). When the data are standardized, the intercept assumes a value of 0. The term beta coefficient or beta weight is used to denote the standardized regression coefficient.

$$B_{yx} = B_{xy} = r_{xy}$$

There is a simple relationship between the standardized and non-standardized regression coefficients:

$$B_{yx} = b_{yx}(S_x / S_y)$$

Test for Significance

The statistical significance of the linear relationship between X and Y may be tested by examining the hypotheses:

$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0$$

A t statistic with n-2 degrees of freedom can be used, where

$$t = \frac{b}{SE_b}$$

$SE_b$ denotes the standard deviation of b and is called the standard error.

Using a computer program, the regression of attitude on duration of car ownership, using the data shown in Table 18.1, yielded the results shown in Table 18.2. The intercept, a, equals 1.0793, and the slope, b, equals 0.5897. Therefore, the estimated equation is:

Attitude ($\hat{Y}$) = 1.0793 + 0.5897 (Duration of Car Ownership)

The standard error, or standard deviation of b, is estimated as 0.07008, and the value of the t statistic as t = 0.5897/0.07008 = 8.414, with n-2 = 10 degrees of freedom. From Table 4 in the Statistical Appendix, we see that the critical value of t with 10 degrees of freedom and $\alpha = 0.05$ is 2.228 for a two-tailed test. Since the calculated value of t is larger than the critical value, the null hypothesis is rejected.
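The parameter estimates, the standardized coefficient, and the t test described above can be checked with a minimal Python sketch. The slides do not spell out how $SE_b$ is obtained; the sketch assumes the usual textbook formula, the standard error of estimate divided by the square root of $\sum (X_i-\bar{X})^2$. The helper names (`fit_line`, `slope_t_test`, `sample_sd`) are illustrative.

```python
from math import sqrt

# Duration of car ownership (X) and attitude (Y) from the worked example.
duration = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
attitude = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    b = (sum_xy - n * x_bar * y_bar) / (sum_xx - n * x_bar ** 2)
    a = y_bar - b * x_bar
    return a, b


def slope_t_test(x, y, a, b):
    """Return (SE_b, t) with t = b / SE_b on n - 2 degrees of freedom.

    Assumption: SE_b = SEE / sqrt(sum((x - x_bar)^2)), the standard
    formula, which the slides do not state explicitly.
    """
    n = len(x)
    x_bar = sum(x) / n
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    see = sqrt(ss_res / (n - 2))  # standard error of estimate
    se_b = see / sqrt(sum((xi - x_bar) ** 2 for xi in x))
    return se_b, b / se_b


def sample_sd(v):
    """Sample standard deviation (divisor n - 1)."""
    m = sum(v) / len(v)
    return sqrt(sum((vi - m) ** 2 for vi in v) / (len(v) - 1))


a, b = fit_line(duration, attitude)
se_b, t = slope_t_test(duration, attitude, a, b)
# Standardized (beta) coefficient: B_yx = b_yx * (S_x / S_y).
beta = b * sample_sd(duration) / sample_sd(attitude)
print(f"a = {a:.4f}, b = {b:.4f}, SE_b = {se_b:.5f}, t = {t:.3f}, beta = {beta:.4f}")
```

Up to rounding, the output should match a = 1.0793, b = 0.5897, $SE_b$ = 0.07008, and t = 8.414, and the beta coefficient should equal the simple correlation of 0.9361, as the relation $B_{yx} = r_{xy}$ requires.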

Determine the Strength and Significance of Association

The total variation, $SS_y$, may be decomposed into the variation accounted for by the regression line, $SS_{reg}$, and the error or residual variation, $SS_{error}$ or $SS_{res}$, as follows:

$$SS_y = SS_{reg} + SS_{res}$$

where

$$SS_y = \sum_{i=1}^{n}(Y_i-\bar{Y})^2 \qquad SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i-\bar{Y})^2 \qquad SS_{res} = \sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2$$

Figure 18.7 Decomposition of the Total Variation in Bivariate Regression (labels: X; Y; total variation, $SS_y$; residual variation, $SS_{res}$; explained variation, $SS_{reg}$)

The strength of association may then be calculated as:

$$r^2 = \frac{SS_{reg}}{SS_y}$$

To illustrate the calculations of $r^2$, let us consider again the effect of duration of car ownership on attitude toward sports cars. It may be recalled from earlier calculations of the simple correlation coefficient that:

$$SS_y = \sum_{i=1}^{n}(Y_i-\bar{Y})^2 = 120.9168$$

The predicted values ($\hat{Y}$) can be calculated using the regression equation:

Attitude ($\hat{Y}$) = 1.0793 + 0.5897 (Duration of Car Ownership)

For the first observation in Table 18.1, this value is: $\hat{Y}$ = 1.0793 + 0.5897 x 10 = 6.9763. For each successive observation, the predicted values are, in order, 8.1557, 8.1557, 3.4381, 8.1557, 4.6175, 5.7969, 2.2587, 11.6939, 6.3866, 11.1042, and 2.2587.

Therefore,

$$\begin{aligned}
SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i-\bar{Y})^2 &= (6.9763-6.5833)^2+(8.1557-6.5833)^2+(8.1557-6.5833)^2+(3.4381-6.5833)^2 \\
&\quad +(8.1557-6.5833)^2+(4.6175-6.5833)^2+(5.7969-6.5833)^2+(2.2587-6.5833)^2 \\
&\quad +(11.6939-6.5833)^2+(6.3866-6.5833)^2+(11.1042-6.5833)^2+(2.2587-6.5833)^2 \\
&= 0.1544+2.4724+2.4724+9.8922+2.4724+3.8643+0.6184+18.7021+26.1182+0.0387+20.4385+18.7021 \\
&= 105.9524
\end{aligned}$$

$$\begin{aligned}
SS_{res} = \sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2 &= (6-6.9763)^2+(9-8.1557)^2+(8-8.1557)^2+(3-3.4381)^2+(10-8.1557)^2+(4-4.6175)^2 \\
&\quad +(5-5.7969)^2+(2-2.2587)^2+(11-11.6939)^2+(9-6.3866)^2+(10-11.1042)^2+(2-2.2587)^2 \\
&= 14.9644
\end{aligned}$$

It can be seen that $SS_y = SS_{reg} + SS_{res}$. Furthermore,

$$r^2 = SS_{reg}/SS_y = 105.9524/120.9168 = 0.8762$$

Another equivalent test for examining the significance of the linear relationship between X and Y (significance of b) is the test for the significance of the coefficient of determination. The hypotheses in this case are:

$$H_0: R^2_{pop} = 0 \qquad H_1: R^2_{pop} > 0$$

The appropriate test statistic is the F statistic:

$$F = \frac{SS_{reg}}{SS_{res}/(n-2)}$$

which has an F distribution with 1 and n-2 degrees of freedom. The F test is a generalized form of the t test (see Chapter 17). If a random variable is t distributed with n degrees of freedom, then $t^2$ is F distributed with 1 and n degrees of freedom. Hence, the F test for testing the significance of the coefficient of determination is equivalent to testing the following hypotheses:

$$H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0$$

or

$$H_0: \rho = 0 \qquad H_1: \rho \neq 0$$

From Table 18.2, it can be seen that:

$$r^2 = 105.9522/(105.9522+14.9644) = 0.8762$$

which is the same as the value calculated earlier. The value of

the F statistic is:

$$F = 105.9522/(14.9644/10) = 70.8027$$

with 1 and 10 degrees of freedom. The calculated F statistic exceeds the critical value of 4.96 determined from Table 5 in the Statistical Appendix. Therefore, the relationship is significant, corroborating the results of the t test.

Table 18.2 Bivariate Regression

Check Prediction Accuracy

To estimate the accuracy of predicted values, $\hat{Y}$, it is useful to calculate the standard error of estimate, SEE:

$$SEE = \sqrt{\frac{\sum_{i=1}^{n}(Y_i-\hat{Y}_i)^2}{n-2}}$$

From Table 18.2, SEE = 1.2233.

Assumptions

- The error term is normally distributed. For each fixed value of X, the distribution of Y is normal.
- The means of all these normal distributions of Y, given X, lie on a straight line with slope b.
- The mean of the error term is 0.
- The variance of the error term is constant. This variance does not depend on the values assumed by X.
- The error terms are uncorrelated. In other words, the observations have been drawn independently.

Multiple Regression

The general form of the multiple regression model is as follows:

$$Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \dots + \beta_kX_k + e$$

which is estimated by the following equation:

$$\hat{Y} = a + b_1X_1 + b_2X_2 + b_3X_3 + \dots + b_kX_k$$

As before, the coefficient a represents the intercept, but the b's are now the partial regression coefficients.

Statistics Associated with Multiple Regression

Adjusted $R^2$. $R^2$, the coefficient of multiple determination, is adjusted for the number of independent variables and the sample size to account for diminishing returns. After the first few variables, the additional independent variables do not make much contribution.

Coefficient of multiple determination. The strength of association in multiple regression is measured by the square of the multiple correlation coefficient, $R^2$, which is also called the coefficient of multiple determination.

F test. The F test is used to test the null hypothesis that the coefficient of multiple determination in the population, $R^2_{pop}$, is zero. This is equivalent to testing the null hypothesis $H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_k = 0$. The test statistic has an F distribution with k and (n-k-1) degrees of freedom.

Partial F test. The significance of a partial regression coefficient, $\beta_i$, of $X_i$ may be tested using an incremental F statistic. The incremental F statistic is based on the increment in the explained sum of squares resulting from the addition of the independent variable $X_i$ to the regression equation after all the other independent variables have been included.

Partial regression coefficient. The partial regression coefficient, $b_1$, denotes the change in the predicted value, $\hat{Y}$, per unit change in $X_1$ when the other independent variables, $X_2$ to $X_k$, are held constant.

Partial Regression Coefficients

$$\hat{Y} = a + b_1X_1 + b_2X_2$$

First, note that the relative magnitude of the partial regression coefficient of an independent variable is, in general, different from that of its bivariate regression coefficient. The interpretation of the partial regression coefficient, $b_1$, is that it represents the expected change in Y when $X_1$ is changed by one unit but $X_2$ is held constant or otherwise controlled. Likewise, $b_2$ represents the expe
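The difference between a partial regression coefficient and the corresponding bivariate coefficient can be illustrated with a small Python sketch for two predictors. The data below are entirely hypothetical, invented for this sketch, and are not taken from the chapter's tables. The coefficients are obtained by solving the two normal equations on mean-centered data. The adjusted $R^2$ and F formulas used are the standard ones and are an assumption here, since the slides describe them only in words.

```python
def mean(v):
    return sum(v) / len(v)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))


# Hypothetical data, invented for illustration only.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [3, 4, 8, 7, 12, 11, 15, 16]
n, k = len(y), 2  # sample size and number of independent variables

s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y, syy = cross(x1, y), cross(x2, y), cross(y, y)

# Solve the two normal equations for the partial regression coefficients.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
a = mean(y) - b1 * mean(x1) - b2 * mean(x2)

# Bivariate slope of y on x1 alone, for comparison with the partial b1.
b1_bivariate = s1y / s11

# Explained variation, R^2, adjusted R^2, and the overall F statistic
# with k and n - k - 1 degrees of freedom (standard formulas, assumed).
ss_reg = b1 * s1y + b2 * s2y
r2 = ss_reg / syy
adj_r2 = r2 - k * (1 - r2) / (n - k - 1)
f_stat = (ss_reg / k) / ((syy - ss_reg) / (n - k - 1))

print(f"a = {a:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
print(f"bivariate slope of y on x1 = {b1_bivariate:.3f}")
print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}, F = {f_stat:.1f}")
```

Because the two hypothetical predictors are correlated with each other, the partial coefficient of `x1` comes out well below its bivariate slope. With uncorrelated predictors the two would coincide.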
