Master's Econometrics 101: Introduction to Econometrics (64 slides)
Econometrics I, Fall 2011

Instructor: 冯强
Office: 博学 1223
Phone: 6449-3318
The best way to contact me is by email:

Brief Overview of the Course

Economics suggests interesting relations, often with policy implications, but virtually never suggests quantitative magnitudes of causal effects. For example:
  What is the price elasticity of public transportation? Say, for a 1 yuan reduction in price, by how much can we expect the volume to change?
  What is the effect of reducing class size on student achievement?
  What is the effect on earnings of a year of education?
  What is the effect on GDP (or inflation) of a 1 percentage point increase in interest rates by the Central Bank?

Other Examples of Econometric Studies

  What might be the effect of a new regulation on housing prices in Beijing?
  What would be the impact of eliminating the residence requirement (户口, hukou) on the wage rate of college graduates (preferably UIBE students)?
  What is the short-term or long-term impact of imposing odd/even license-plate driving days on the demand for cars?
  What is the effect of GDP on electricity usage?
What kind of problem are you interested in using econometrics to study? Can you come up with questions of this kind?

The focus of this course is the use of statistical and econometric methods to quantify causal effects

Ideally, we would like an experiment: transit prices; class size; returns to education; Central Bank interest rates.
But almost always we must use observational (nonexperimental) data.
Observational data pose major challenges. Consider estimating the returns to education:
  Confounding effects (omitted factors, such as ability)
  Simultaneous causality: the higher the income, the more time one can afford to stay in school
  "Correlation does not imply causation": the high income of the Western world is correlated with the height of its people. Does that mean that the taller people are, the richer they are?

In this course you will:

  Learn methods for estimating causal effects using observational data;
  Learn some of the basic theory behind the methods of econometrics;
  Learn to produce (you do the analysis) and consume (evaluate the work of others) econometric applications; and
  Practice "producing" in your problem sets.

Causal Relations

Q: Which of the following has a causal relationship?
  Circumference and height of a tree
    No causal relationship, but one can nevertheless be used to predict the other
  Wage and output
    But a higher wage can lead to higher morale and higher output
  Weight and gas consumption of a truck
    But an energy-efficient engine uses less gas
  Cell phone fees and length of calls
    But long-distance calls cost more

Is there a relationship between wage level and mobility?

The difference between experimental data and observational data

A designed experiment can easily test causal relationships, for example:
  The effect of a kind of fertilizer on tomato crops
  The effect of a medication on patients' blood pressure
There is no simple designed experiment in social science:
  For each 1% increase in price, what is the percentage drop in transit volume?
Q: How should we conduct such an experiment on the
price elasticity of public transportation? Can each person's bus fare in Beijing be determined randomly, so that we can see how the change in price affects that person's transit decision?

Types of data:

  Cross-sectional data, e.g. a recording of every student's weight today
  Time series, e.g. the weight record of one person over a year
  Panel data, the combination of cross-section and time series, e.g. the weight records of all the students here for each day over a year

The identification of data type:

Q: The data in the published 2010 Statistical Abstract of China are typically of what kind?
A: Cross-sectional, because they are data on different entities for the same time period.
Q: What kind of data are published stock market activities?
A: Time series, for they are the realization of a variable's value over time.

What kind of data is this?

Region            Year-end persons (10,000)   Average labor compensation
Beijing           514                         39,684
Tianjin           195                         27,628
Hebei             501                         16,456
Shanxi            366                         18,106
Inner Mongolia    243                         18,382
Liaoning          498                         19,365
Jilin             266                         16,393
Heilongjiang      497                         15,894
Shanghai          333                         37,585
Jiangsu           679                         23,657
Zhejiang          611                         27,570
Anhui             338                         17,610
Fujian            427                         19,424
Jiangxi           283                         15,370
Shandong          898                         19,135

A) Cross Section   B) Time Series   C) Panel   D) Not Sure

What kind of data is this?

Year    GDP        Farming    Manufacturing    Other
1978    3645.2     1018.4     1745.2           881.6
1980    4545.6     1359.4     2192.0           994.2
1985    9016.0     2541.6     3866.6           2607.8
1986    10275.2    2763.9     4492.7           3018.6
1987    12058.6    3204.3     5251.6           3602.7
1988    15042.8    3831.0     6587.2           4624.6
1989    16992.3    4228.0     7278.0           5486.3
1990    18667.8    5017.0     7717.4           5933.4

A) Cross Section   B) Time Series   C) Panel   D) Not Sure

A) Cross Section   B) Time Series   C) Panel   D) Not Sure
What kind of data
is this?

Review of Probability and Statistics

Empirical problem: Class size and educational outcome
Policy question: What is the effect of reducing class size by one student per class? By 8 students per class?
What is the right outcome measure ("dependent variable")?
  parent satisfaction
  student personal development
  future adult welfare and/or earnings
  performance on standardized tests

What do data say about the class size/test score relation?

The California Test Score Data Set
All K-6 and K-8 California school districts (n = 420)
Variables:
  5th grade test scores (Stanford-9 achievement test, combined math and reading), district average
  Student-teacher ratio (STR) = number of students in the district divided by number of full-time equivalent teachers

Initial look at the data:

(You should already know how to interpret this table)
This table doesn't tell us anything about the relationship between test scores and the STR.

Scatterplot of test score v. student-teacher ratio

Do districts with smaller classes have higher test scores?
What does this figure show?

How can we get some numerical evidence on whether districts with low STRs have higher test scores?

There are 3 related numerical measurements:
  Compare average test scores in districts with low STRs to those with high STRs ("estimation")
  Test the hypothesis that the mean test scores in the two types of districts are the same, against the alternative hypothesis that they differ ("hypothesis testing")
  Estimate an interval for the difference in the mean test scores, high
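v. low STR districts ("confidence interval")

Initial data analysis: Compare districts with "small" (STR < 20) and "large" (STR ≥ 20) class sizes:

  Estimation of $\Delta$ (= difference between group means)
  Test the hypothesis that $\Delta = 0$
  Construct a confidence interval for $\Delta$

1. Estimation

$\bar{Y}_s - \bar{Y}_l = 657.4 - 650.0 = 7.4$
where $\bar{Y}_s = \frac{1}{n_s}\sum_{i=1}^{n_s} Y_i$ and $\bar{Y}_l = \frac{1}{n_l}\sum_{i=1}^{n_l} Y_i$ are the average test scores in the small-STR and large-STR districts.
Is this a large difference in a real-world sense?
  Standard deviation across districts = 19.1
  Difference between the 60th and 75th percentiles of the test score distribution is 667.6 - 659.4 = 8.2
  Is this a big enough difference to be important for school reform discussions, for parents, or for a school committee?

2. Hypothesis testing

Difference-in-means test: compute the t-statistic,
$t = \frac{\bar{Y}_s - \bar{Y}_l}{\sqrt{\frac{s_s^2}{n_s} + \frac{s_l^2}{n_l}}} = \frac{\bar{Y}_s - \bar{Y}_l}{SE(\bar{Y}_s - \bar{Y}_l)}$   (remember this?)
where $SE(\bar{Y}_s - \bar{Y}_l)$ is the "standard error" of $\bar{Y}_s - \bar{Y}_l$; the subscripts s and l refer to "small" and "large" STR districts; and $s_s^2 = \frac{1}{n_s - 1}\sum_{i=1}^{n_s}(Y_i - \bar{Y}_s)^2$ (etc.)

Compute the difference-of-means t-statistic:

$t = \frac{7.4}{1.83} \approx 4.04$ (using the rounded values 7.4 and 1.83)
Q: Can we reject the null hypothesis that $\Delta = 0$?
A: Yes, since $|t| > 1.96$, we can reject (at the 5% significance level) the null hypothesis that the two means are the same.

3. Confidence interval

A 95% confidence interval for the difference between the means is
$(\bar{Y}_s - \bar{Y}_l) \pm 1.96 \times SE(\bar{Y}_s - \bar{Y}_l) = 7.4 \pm 1.96 \times 1.83 = (3.8, 11.0)$
Q: Are the following two statements equivalent?
  The 95% confidence interval for $\Delta$ doesn't include 0;
  The hypothesis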
that $\Delta = 0$ is rejected at the 5% level.
A: Yes, they are.

This should all be familiar. But:

What is the underlying framework that justifies all this?
  Estimation: Why estimate $\Delta$ by $\bar{Y}_s - \bar{Y}_l$?
  Testing: What is the standard error of $\bar{Y}_s - \bar{Y}_l$, really? Why reject $\Delta = 0$ if $|t| > 1.96$?
  Confidence intervals (interval estimation): What is a confidence interval, really?

Review of Statistical Concepts

We will review the following in turn:
  The probability framework for statistical inference
  Estimation
  Hypothesis testing
  Confidence intervals

1. The probability framework for statistical inference

Here are some key concepts:
  Population
  Random variable Y
  Population
distribution of Y
  "Moments" of the population distribution
  Conditional distributions
  Simple random sampling

Population
  The group or collection of entities of interest
  Here, "all possible" school districts
  "All possible" means "all possible" circumstances that lead to specific values of STR, test scores
  We will think of populations as infinitely large; the task is to make inferences about a large population based on a sample from the population

Random variable Y
  Numerical summary of a random outcome
  Here, the numerical value of district average test scores (or district STR), once we choose a year/district to sample.

Population distribution of Y
  The probabilities of different values of Y that occur in the population, for ex. $\Pr[Y = 650]$ (when Y is discrete)
  or: the probabilities of sets of these values, for ex. $\Pr[Y \le 650]$ (when Y is continuous).

Example of a population distribution: the (normal) distribution of adult male and female heights in the United States

[Figure: two normal curves, labeled men and women; horizontal axis: height (inches); annotation: $\mu = 175$ cm, $\sigma = 7.1$ cm]
Q: Of these two curves, which is the men's and which is the women's?
Q: Why is the women's curve taller than the men's?

Normal Distribution Example

The height of the curve at x is determined by the function:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
If x is distributed as a normal variable, then it is designated as: $x \sim N(\mu, \sigma^2)$
There are an infinite number of normal curves

"Moments" of the population distribution

  mean = expected value = $E(Y) = \mu_Y$ = long-run average value of Y over repeated realizations of Y
  variance = $var(Y) = E[(Y - \mu_Y)^2] = \sigma_Y^2$ = measure of the squared spread of the distribution
  standard deviation = $\sqrt{var(Y)} = \sigma_Y$

Conditional distributions
  The distribution of Y, given value(s) of some other random variable, X
  Ex: the distribution of test scores, given that STR < 20
Moments of conditional distributions
  conditional mean = mean of conditional distribution = $E(Y | X = x)$ (important notation)
  Example: E(Test scores | STR < 20), the mean of test scores for districts with small class sizes
  conditional variance = variance of conditional distribution = $var(Y | X = x)$

The difference in means is the difference between the means of two conditional distributions:
$\Delta$ = E(Test scores | STR < 20) - E(Test scores | STR ≥ 20)
Other examples of conditional means:
  Wages of all female workers (Y = wages, X = gender)
  One-year mortality rate of those given an experimental treatment (Y = live/die; X = treated/not treated)
The conditional mean is a new term for the familiar idea of the group average

Inference about means, conditional means, and differences in conditional means

We would like to know $\Delta$ (test score gap; gender wage gap; effect of experimental treatment), but we don't know it.
Therefore we must collect and use data that permit making statistical
inferences about $\Delta$ from either
  Experimental data, or, if not possible, from
  Observational data

Simple random sampling
  Choose an individual (district, entity) at random from the population
Randomness and data
  Prior to sample selection, the value of Y is random because the individual selected is random
  Once the individual is selected and the value of Y is observed, then Y is just a number, not random
  The data set is $(Y_1, Y_2, \ldots, Y_n)$, where $Y_i$ = value of Y for the i-th individual (district, entity) sampled

Implications of simple random sampling

Because individuals #1 and #2 are selected at random, the value of $Y_1$ has no information content for $Y_2$. Thus:
  $Y_1$, $Y_2$ are independently distributed
  $Y_1$ and $Y_2$ come from the same distribution, that is, $Y_1$, $Y_2$ are identically distributed
  That is, a consequence of simple random sampling is that $Y_1$ and $Y_2$ are independently and identically distributed (i.i.d.).
  More generally, under simple random sampling, the sampled values $Y_i$, $i = 1, \ldots, n$, are i.i.d.

2. Estimation

The sample average $\bar{Y}$ is the natural estimator of the population mean. But:
  What are the properties of this estimator?
  Why should we use $\bar{Y}$ rather than some other estimator? For example:
    $Y_1$ (simply take the first observation)
    maybe unequal weights, not a simple average
    median$(Y_1, \ldots, Y_n)$ from a sample of size n

To answer these questions we need to characterize the sampling distribution of $\bar{Y}$
  The individuals in the sample are drawn at random.
  Thus the values of $(Y_1, \ldots, Y_n)$ are random
  Thus functions of $(Y_1, \ldots, Y_n)$, such as $\bar{Y}$, are random: had a different sample been drawn,
they would have taken on a different value
  The distribution of $\bar{Y}$ over different possible samples of size n is called the sampling distribution of $\bar{Y}$.
  The mean and variance of $\bar{Y}$ are the mean and variance of its sampling distribution, $E(\bar{Y})$ and $var(\bar{Y})$.
  Related to $var(\bar{Y})$ is the idea of the covariance

The covariance between r.v.s X and Z is
$cov(X, Z) = E[(X - \mu_X)(Z - \mu_Z)] = \sigma_{XZ}$
  The covariance is a measure of the linear association between X and Z; its units are units of X times units of Z
  $cov(X, Z) > 0$ ($< 0$) means a positive (negative) relation between X and Z
  If X and Z are independently distributed, then $cov(X, Z) = 0$ (but not vice versa! Why not?)
  The covariance of a r.v. with itself is its variance: $cov(X, X) = E[(X - \mu_X)(X - \mu_X)] = E[(X - \mu_X)^2] = \sigma_X^2$

The correlation coefficient is defined in terms of the covariance:
$corr(X, Z) = \frac{cov(X, Z)}{\sqrt{var(X)\,var(Z)}} = \frac{\sigma_{XZ}}{\sigma_X \sigma_Z} = r_{XZ}$
Some notes on the correlation coefficient:
  $-1 \le corr(X, Z) \le 1$
  $corr(X, Z) = 1$ ($-1$) means perfect positive (negative) linear association
  $corr(X, Z) = 0$ means no linear association
  If $E(X | Z)$ = const (not a function of Z), then $corr(X, Z) = 0$ (not necessarily vice versa)

The correlation coefficient measures linear association

Q: corr(x, y) = 0, but are x and y independent?
A: No!

The sampling distribution of $\bar{Y}$

  The individuals in the sample are drawn at random.
  Thus the values of $(Y_1, \ldots, Y_n)$ are random
  Thus functions of $(Y_1, \ldots, Y_n)$, such as $\bar{Y}$, are random: had a different sample been drawn, they would have taken on a different value
  Since each sample mean is different, there must be a distribution of the sample mean, or sampling distribution.
The sampling distribution of $\bar{Y}$ is the distribution of $\bar{Y}$ over different possible samples of size n (the distribution of the sample mean).

[Figure: Sampling distribution of the sample mean from a survey of siblings]

The sampling distribution of $\bar{Y}$

Demonstrations of the Sampling Distribution on the Web: http:/ (try both binomial and uniform distributions)
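The idea of a sampling distribution can also be checked by simulation. The following is a minimal Python sketch, not part of the original slides: it repeatedly draws i.i.d. samples of size n from a Bernoulli population (a binomial with a single trial) and from a uniform population, and records the sample mean of each sample. The function names, the success probability 0.3, the sample sizes, and the number of replications are all illustrative assumptions.

```python
import random
import statistics


def sample_means(draw, n, reps, seed=0):
    """Return the sample means of `reps` independent samples of size `n`.

    `draw(rng)` must return one observation Y_i from the population.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]


def bernoulli(rng, p=0.3):
    """One draw from a Bernoulli(p) population (p = 0.3 is an assumption)."""
    return 1.0 if rng.random() < p else 0.0


def uniform(rng):
    """One draw from a Uniform(0, 1) population."""
    return rng.random()


def summarize(draw, n, reps=5000):
    """Mean and variance of the simulated sampling distribution of the sample mean."""
    means = sample_means(draw, n, reps)
    return statistics.fmean(means), statistics.pvariance(means)


if __name__ == "__main__":
    for name, draw in [("Bernoulli(0.3)", bernoulli), ("Uniform(0,1)", uniform)]:
        for n in (5, 25, 100):
            mean, var = summarize(draw, n)
            print(f"{name:15s} n={n:3d}  mean={mean:.4f}  variance={var:.6f}")
```

Running the script prints, for each population and each n, the mean and variance of the simulated sample means. Under these assumptions the mean of the sample means should stay close to the population mean, while their variance shrinks as n grows.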
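The mean and variance of $\bar{Y}$ are the mean and variance of its sampling distribution, $E(\bar{Y})$ and $var(\bar{Y})$. To compute $var(\bar{Y})$, we need the covariance.

The mean and variance of the sampling distribution of $\bar{Y}$

mean: $E(\bar{Y}) = E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \mu_Y$
Variance: $var(\bar{Y}) = E[(\bar{Y} - \mu_Y)^2] = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} cov(Y_i, Y_j)$
So $var(\bar{Y}) = \frac{1}{n^2}\sum_{i=1}^{n} var(Y_i) = \frac{\sigma_Y^2}{n}$
Why? $Y_i$ and $Y_j$ are independent for $i \ne j$, so $cov(Y_i, Y_j) = 0$.

Summary: $E(\bar{Y}) = \mu_Y$ and $var(\bar{Y}) = \frac{\sigma_Y^2}{n}$.
Implications:
  $\bar{Y}$ is an unbiased estimator of $\mu_Y$ (that is, $E(\bar{Y}) = \mu_Y$)
  $var(\bar{Y})$ is inversely proportional to n
  the spread of the sampling distribution is proportional to $1/\sqrt{n}$
  in this sense, the sampling uncertainty arising from using $\bar{Y}$ to make inferences about $\mu_Y$ is proportional to $1/\sqrt{n}$

For small sample sizes, the distribution of $\bar{Y}$ is complicated. BUT: when n is large, it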
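is not!

(1) As n increases, the distribution of $\bar{Y}$ becomes more tightly centered around $\mu_Y$: the sampling uncertainty decreases as n increases (recall that $var(\bar{Y}) = \sigma_Y^2 / n$)
An estimator is consistent if the probability that it falls within an interval of the true population value tends to one as the sample size increases.

The Law of Large Numbers:

If $(Y_1, \ldots, Y_n)$ are i.i.d. and $\sigma_Y^2 < \infty$, then $\bar{Y}$ is a consistent estimator of $\mu_Y$, that is,
$\Pr[|\bar{Y} - \mu_Y| < \varepsilon] \to 1$ as $n \to \infty$
which can be written as: $\bar{Y} \xrightarrow{p} \mu_Y$ ("$\bar{Y}$ converges in probability to $\mu_Y$")
(Proof: as $n \to \infty$, $var(\bar{Y}) = \sigma_Y^2 / n \to 0$, which implies that $\Pr[|\bar{Y} - \mu_Y| < \varepsilon] \to 1$.)

(2) Central limit theorem (CLT)

If $(Y_1, \ldots, Y_n)$ are i.i.d. and $0 < \sigma_Y^2 < \infty$, then when n is large the di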