S2.4a Sampling and hypothesis tests.ppt

Uploaded by: laozhun | Document ID: 2878408 | Uploaded: 2023-02-28 | Format: PPT | Pages: 58 | Size: 1.86 MB

A2-Level Maths: Statistics 2 for OCR
S2.4a Sampling and hypothesis tests
Boardworks Ltd 2006

Contents

Introduction to sampling
Sampling from a normal distribution
Calculating from samples
Unbiased estimates
Hypothesis testing on binomial data
Chocolate tasting practical
One-sided hypothesis tests on binomial data
One-sided versus two-sided tests
Critical regions

Introduction to sampling

National census

The British government carries out a census of the entire population of the United Kingdom every 10 years (most recently in April 2001). The first census in England was carried out in 1086 with the compilation of the Domesday Book. However, censuses have only been conducted on a regular basis since 1801.

The census provides the government with a detailed picture of the population living in each part of the country (town, city or countryside). The results are used to help plan public services (health, housing, transport and education) for the future.

Populations, censuses and sampling frames

In statistics we often want to obtain information from a group of individuals or about a group of objects.

The population is the set of all individuals or objects that we wish to study.

A census is an investigation in which information is obtained from every member of the population.

A sampling frame is a list of all members of the population.

Examples:

1. A head teacher is interested in finding out how long her sixth form students spend in part-time employment each week. The population is the set of all sixth form students in her school. A possible sampling frame would be the registers of sixth form tutor groups.

2. A newspaper is interested in obtaining the views of residents living close to the site of a proposed new airport. The population might be all adults living within a 10 mile radius of the site. A possible sampling frame could be the local electoral roll.

3. A car company has discovered a fault that affects one of their models of car. The company may wish to know how widespread the problem might be. The population would be all cars produced of this particular model. A possible sampling frame would be a list of all registered cars of this model provided by the DVLA.

Why sample?

Carrying out a census of the entire population is usually not feasible or sensible. A census is usually costly in terms of money, time and resources.

In addition, some investigations could result in the destruction of the entire population! For example, if a light bulb manufacturer wished to investigate the lifetime of its bulbs, a census would result in the destruction of all the bulbs it produced.

Instead of surveying the whole population, information can be obtained from a sample. The sampling process should be undertaken carefully to ensure that the sample is representative of the entire population. Bias can occur if one section of the population is over- or under-represented.

Question: A local council wishes to know the views of local people on public transport. Criticize each of the following sampling regimes:

1. Ask the people waiting at the town centre bus stop.
2. Leave questionnaires in local libraries for people to fill in.
3. Ask people at the shopping centre on a Thursday morning.

Sampling methods

One way to obtain a fair sample is to use random sampling. This method gives every member of the population an equal chance of being chosen for the sample. A more formal definition of a random sample is as follows:

A sample of size n is called a random sample if every possible selection of size n has the same probability of being chosen.

There are a number of ways in which a random sample can be chosen. One commonly used technique is to use random number tables.

Random number tables

The table below gives a list of random digits:

259 976 452 401 234 393 053 225 197 549 628 444 212 885 355 169
905 193 439 102 356 206 753 335 713 416 584 438 085 966 235 418
626 411 469 807 561 925 290 692 923 229 288 631 523 040 940 642
775 838 281 475

Here is how to use random digits to obtain a sample.

Example: A sample of size 15 is required from a population of size 300.

One possible approach would be to obtain a sampling frame for the population and number every member from 001 to 300. You could then obtain chains of 3 random digits from tables. If the chain corresponds to a number between 001 and 300 you could select that member of the population; otherwise you could discard that chain and choose another.

Example (continued): This method is wasteful of random digits, since most chains of 3 digits will be discarded. A more efficient strategy would be to assign each member of the population to several chains of random digits: member k is assigned the chains k, k + 300 and k + 600, so that member 001 corresponds to 001, 301 and 601, and member 300 corresponds to 300, 600 and 900.

With this approach the only chains discarded are 901 to 999 and 000.

Example (continued): Suppose that we use the 2nd line of random digits in the above table. Then the sample chosen would be (chain → member):

834 → 234
193 → 193
439 → 139
102 → 102
356 → 56
206 → 206
753 → 153
335 → 35
713 → 113
416 → 116
584 → 284
438 → 138
085 → 85
966 (cannot be used)
235 → 235
418 → 118
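The chain-to-member rule in the worked example can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `chain_to_member` and `draw_sample` are hypothetical, the chains are the sixteen listed in the worked example, and skipping a member that has already been chosen (sampling without replacement) is an assumption that the slides do not state.

```python
# Chains of three random digits, as listed in the worked example.
CHAINS = ["834", "193", "439", "102", "356", "206", "753", "335",
          "713", "416", "584", "438", "085", "966", "235", "418"]


def chain_to_member(chain, population_size=300):
    """Map a 3-digit chain to a member number, or None if the chain is discarded."""
    value = int(chain)
    # Largest usable chain: 900 when the population size is 300.
    usable = (1000 // population_size) * population_size
    if value == 0 or value > usable:
        return None  # 000 and 901-999 are discarded
    # Chains k, k + 300 and k + 600 all map to member k.
    return (value - 1) % population_size + 1


def draw_sample(chains, sample_size, population_size=300):
    """Work through the chains in order until sample_size members are chosen."""
    chosen = []
    for chain in chains:
        member = chain_to_member(chain, population_size)
        # Assumption: a repeated member is skipped (sampling without replacement).
        if member is None or member in chosen:
            continue
        chosen.append(member)
        if len(chosen) == sample_size:
            break
    return chosen


sample = draw_sample(CHAINS, 15)
print(sample)
```

Using the remainder after division by 300 means that every member corresponds to exactly three chains, so each member remains equally likely to be chosen.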

Sampling from a normal distribution

Suppose that a sample of size n is taken from a $N(\mu, \sigma^2)$ distribution and that the sample mean is $\bar{x}$.

If the sampling process were to be repeated, a different sample would be extracted and a slightly different value for the sample mean would be obtained. The value of the sample mean is therefore subject to sampling variability, and so the sample mean has a distribution of its own, known as its sampling distribution.

It is possible to show that, when a sample of size n is drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the sample mean is:

$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

Example: If a sample of size 40 is taken from a $N(15, 24)$ distribution, then the sampling distribution of the sample mean is:

$\bar{X} \sim N\left(15, \frac{24}{40}\right) = N(15, 0.6)$

Notice that the variance of $\bar{X}$ is $\frac{\sigma^2}{n}$. This shows that the sampling variability can be decreased by taking larger samples (i.e. increasing the value of n).

The standard deviation of $\bar{X}$ is $\frac{\sigma}{\sqrt{n}}$. This is usually referred to as the standard error of the sample mean.
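The result can be checked informally by simulation. The sketch below is not part of the original slides: it repeatedly draws samples of size 40 from the N(15, 24) distribution in the example and looks at how the sample means behave. The number of repeats and the random seed are arbitrary choices made for illustration.

```python
import math
import random
import statistics

MU, VARIANCE, N = 15.0, 24.0, 40  # the N(15, 24) example with samples of size 40
REPEATS = 5000                    # number of simulated samples (arbitrary choice)

rng = random.Random(1)            # fixed seed so that the run is repeatable
sigma = math.sqrt(VARIANCE)       # random.gauss expects a standard deviation

sample_means = []
for _ in range(REPEATS):
    sample = [rng.gauss(MU, sigma) for _ in range(N)]
    sample_means.append(sum(sample) / N)

mean_of_means = statistics.mean(sample_means)
variance_of_means = statistics.pvariance(sample_means)
standard_error = sigma / math.sqrt(N)  # theoretical value, sigma / sqrt(n)

print("mean of the sample means:    ", round(mean_of_means, 3))
print("variance of the sample means:", round(variance_of_means, 3))
print("theoretical variance:        ", VARIANCE / N)
print("theoretical standard error:  ", round(standard_error, 3))
```

The simulated mean and variance should be close to the theoretical values of 15 and 0.6, although they will not match exactly because the simulation is itself subject to sampling variability.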

Calculating from samples

Sample standard deviation

Recall the formula we met earlier for finding the variance of a set of data:

variance $= \frac{\sum (x - \bar{x})^2}{n}$

The variance is sometimes called the mean squared deviation (msd).

The standard deviation is the square root of this, and is sometimes called the root mean squared deviation (rmsd):

rmsd $= \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

These formulae are normally only used when we wish to calculate the variance or standard deviation using data from the entire population.

When a large population is being studied, data will only be collected for a sample. The sample data is then used to make inferences about the population. In particular, sample data may be used to estimate the mean and variance of the whole population.

It can be shown that the sample mean, $\bar{x}$, gives the best estimate of the population mean. The best estimate of the population variance, however, is provided by a formula that divides by n - 1 rather than n:

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

This is referred to as the sample variance, with the square root being the sample standard deviation, s.

Example: A crisp manufacturer carries out regular monitoring of its packing machines by taking samples of 20 packets of crisps. The masses (x g) obtained in one such sample were as follows:

[Table of the 20 masses not reproduced]

Find the mean and the standard deviation of the masses in this sample of crisp packets.

Note: the question clearly mentions that the data is from a sample. We will therefore use the formula for the sample standard deviation.

The sample mean is given by:

$\bar{x} = \frac{\sum x}{20}$

The sample standard deviation (s) is found as follows:

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{19}}$
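As a rough illustration of the difference between the two divisors, the following Python sketch computes both versions for the same data. The eight data values are made up purely for illustration (they are not the crisp packet masses), and the function names are hypothetical. Python's standard `statistics` module provides `pvariance` and `variance` for the divisor n and divisor n - 1 forms respectively.

```python
import math
import statistics

# Made-up data, used only to illustrate the two formulae.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def mean_squared_deviation(values):
    """Variance with divisor n (for data from an entire population)."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Variance with divisor n - 1 (for data from a sample)."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


msd = mean_squared_deviation(data)
rmsd = math.sqrt(msd)
s_squared = sample_variance(data)
s = math.sqrt(s_squared)

print("msd  =", msd, " rmsd =", rmsd)
print("s^2  =", round(s_squared, 4), " s =", round(s, 4))

# The standard library gives the same answers.
print(statistics.pvariance(data), statistics.variance(data))
```

The sample variance is always slightly larger than the mean squared deviation for the same data, and the gap shrinks as n grows.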

Unbiased estimates

Introduction to estimation

A statistic is a quantity that is calculated from a sample of data. Examples include:

the sample mean, $\bar{x}$;
the sample variance, $s^2$;
the quartiles;
the highest value.

Note that the sample variance uses n - 1 instead of just n.

We are particularly interested in finding estimates of the population mean and standard deviation.

It can be shown that the sample mean provides an unbiased estimate of the population mean, i.e. if the sampling process were carried out over and over again, the sample mean would on average equal the population mean.

Likewise the sample variance, $s^2$, is an unbiased estimate of the population variance. Its square root, the sample standard deviation s, is used as the estimate of the population standard deviation.

Note that the formula $\frac{\sum (x - \bar{x})^2}{n}$ gives a biased estimate of the population variance.

Example: An examiner takes a random sample of 12 of the students sitting a particular A-level examination. Their percentage marks were:

55%, 64%, 76%, 48%, 73%, 51%, 67%, 31%, 55%, 85%, 60%, 62%.

Calculate an unbiased estimate of the mean and, using the unbiased estimate of the variance, an estimate of the standard deviation of the marks for all students sitting the exam.

For these data, $\sum x = 727$ and $\sum x^2 = 46275$.

The sample mean gives an unbiased estimate of the population mean:

$\bar{x} = \frac{727}{12} = 60.6\%$ (to 3 sf)

The sample variance gives an unbiased estimate of the population variance:

$s^2 = \frac{1}{11}\left(46275 - \frac{727^2}{12}\right) = 202.8$ (to 4 sf)

So the sample standard deviation, s = 14.2% (to 3 sf), is the estimate of the population standard deviation.
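The arithmetic for the examiner's sample can be reproduced with a short Python sketch. The marks are those given in the example; the variable names are illustrative, and the calculation uses the usual shortcut that the sum of squared deviations equals the sum of the squares minus (sum of x) squared over n.

```python
import math

# Percentage marks from the examiner's random sample of 12 students.
marks = [55, 64, 76, 48, 73, 51, 67, 31, 55, 85, 60, 62]

n = len(marks)
sum_x = sum(marks)
sum_x2 = sum(x * x for x in marks)

mean = sum_x / n                           # unbiased estimate of the population mean
sum_sq_dev = sum_x2 - sum_x ** 2 / n       # equals the sum of (x - mean)^2
variance_estimate = sum_sq_dev / (n - 1)   # divisor n - 1: unbiased for the variance
biased_variance = sum_sq_dev / n           # divisor n: biased (too small on average)
s = math.sqrt(variance_estimate)

print("sum of x =", sum_x, " sum of x^2 =", sum_x2)
print("mean =", round(mean, 1))
print("s^2  =", round(variance_estimate, 1))
print("s    =", round(s, 1))
print("divisor-n variance =", round(biased_variance, 1))
```

The value of s agrees with the 14.2% quoted in the example, while the divisor n version gives a noticeably smaller variance for a sample of only 12.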

Hypothesis testing on binomial data

A simple introductory example

Consider the following simple situation. You suspect that a die is biased towards the number six. In order to test this suspicion, you could perform an experiment in which the die is thrown 20 times.

If the die were fair, you would expect about 3 sixes. If you obtained a lot more than 3 sixes then you might decide that there is evidence to support your suspicions. But how do you decide on what a suspicious number of sixes is?

Consider throwing a fair die 20 times. The probability of obtaining different numbers of sixes is shown in the graph:

[Graph: probability of each number of sixes in 20 throws of a fair die]

From the graph, with 20 throws of a fair die, the probability of getting 7 or more sixes is about 0.0371. This means that if a fair die were thrown 20 times over and over again, then you would obtain 7 or more sixes less than once in every 20 experiments.

The figure of 1 in 20 (or 5%) is often taken as a cut-off point: results with probabilities below this level are sometimes regarded as being unlikely to have occurred by chance. However, in situations where more evidence is required, cut-off values of 1% or 0.1% are typically used.

A formal introduction to hypothesis tests

In hypothesis testing we are essentially presented with two rival hypotheses. Examples might include:

"The coin is fair" or "the coin is biased";
"The proportion of local people in favour of a by-pass is 80%" or "the proportion is smaller than 80%";
"The drug has the same effectiveness as an existing treatment" or "the drug is more effective".

These rival hypotheses are referred to as the null and the alternative hypotheses.

The null hypothesis (H0) is often thought of as the cautious hypothesis: it represents the usual state of affairs. The alternative hypothesis (H1) is usually the one that we suspect or hope to be true.

Hypothesis testing is concerned with examining the data collected in experiments, and deciding how likely a result at least as extreme as the one observed would be if the null hypothesis were true.

The significance level of the test is the chosen cut-off value between the results that might plausibly have been obtained by chance if H0 is true, and the results that are unlikely to have occurred.

Significance levels that are typically used are 10%, 5%, 1% and 0.1%. These significance levels correspond to different rigours of test: the lower the significance level, the stronger the evidence needed before H0 is rejected.

Note: It is important to appreciate that it is not possible to prove that a hypothesis is definitely true in statistics. Hypothesis tests can only provide different degrees of evidence in support of a hypothesis. A 10% significance level can only provide weak evidence in support of a hypothesis. A 0.1% test is much more stringent and can provide very strong evidence.
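The probability quoted in the die example can be reproduced directly from the binomial distribution B(20, 1/6). The Python sketch below is an illustration rather than part of the original slides; the function names are hypothetical, and `critical_value` assumes a one-sided test that looks for too many sixes.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) when X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """P(X >= k) when X ~ B(n, p)."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


def critical_value(n, p, significance):
    """Smallest k with P(X >= k) <= significance, or None if there is none."""
    for k in range(n + 1):
        if upper_tail(k, n, p) <= significance:
            return k
    return None


N_THROWS, P_SIX = 20, 1 / 6

p_seven_or_more = upper_tail(7, N_THROWS, P_SIX)
print("P(7 or more sixes) =", round(p_seven_or_more, 4))
print("P(6 or more sixes) =", round(upper_tail(6, N_THROWS, P_SIX), 4))
print("5% cut-off:", critical_value(N_THROWS, P_SIX, 0.05), "or more sixes")
```

Six or more sixes would occur by chance in roughly 10% of experiments with a fair die, so at the 5% level it is only from 7 sixes upwards that the result starts to look suspicious.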

Chocolate tasting practical

Do you think you can taste the difference between branded chocolate and supermarket own-label chocolate? You are going to perform an experiment to find out.

There will be 2 pieces of chocolate to try: one will be a branded make of chocolate, the other will be a supermarket's own-brand.

Try to identify the branded make.

One-sided hypothesis tests on binomial data

Example: Mr Jones, a candidate in a local election, claims to have the support of 40% of the electorate. A rival candidate, Miss Smith, believes that Mr Jones is exaggerating his level of support. She asks a random
