Language Testing
Basic principles
2022/11/25

To understand the need for test theory, it is first necessary to understand something about the fundamental nature of measurement, constructs, and psychological tests.

What are they measuring?
Height & weight: physical attributes
?: psychological attributes

Unlike physical attributes, the psychological attributes of an individual cannot be measured directly as can height or weight. They are hypothetical concepts - products of the informed scientific imagination of social scientists who attempt to develop theories for explaining human behavior. The existence of such constructs can never be absolutely confirmed. The degree to which any psychological construct characterizes an individual can only be inferred from observations of his or her behavior.

What is Measurement
Measurement in the social sciences is the process of
quantifying the characteristics of persons according to explicit procedures and rules.
Quantification: assigning numbers
Characteristics: physical and mental characteristics of persons (abilities/constructs)
Procedures and rules: replicable by other observers, in other contexts, and with other individuals

Measurement Scales
A well-known classification of measurement scales is given by Stevens (1951). These measurement scales are:
1. The nominal scale
On the nominal scale, objects are classified according to a characteristic. Example: one can classify persons with respect to sex, hair color, etc.
2. The ordinal scale
An ordinal scale comprises the numbering of different levels of an attribute that are ordered with respect to each other. Example: individuals are ranked first, second, third, and so on.
3. The interval scale
An interval scale is a numbering of different levels in which the distances, or intervals, between the levels are equal.
4. The ratio scale
The ratio scale has a natural origin as well as equal intervals. Length in meters and weight in kilograms are defined on a ratio scale; so is temperature on the Kelvin scale. Ratio scales are relatively rare in psychology because of the difficulty of defining a zero point. What would a person with zero intelligence look like?

What is a Test?
Carroll (1968): a psychological or educational test is a procedure designed to elicit certain behavior from which one can make inferences about certain characteristics of an individual.
Bachman (1990): a test is a measurement instrument designed to elicit a specific sample of an individual's behavior.

Characteristics that limit measurement (1/2)
Limitations in specification
Two levels of specification:
Theoretical level
Operational level

Characteristics that limit measurement (2/2)
Limitations in observation and quantification
Indirectness
Incompleteness
Imprecision (ordinal instead of interval)
Subjectivity (test developers' choice of items and subjective scoring)
Relativeness (no perfect norm of language use)
Thus, a major concern of language test development is to minimize the effects
of these limitations.

Classifying different types of language test (1/3)
Intended use
Selection
Entrance
Diagnostic
Placement
Achievement
Research

Classifying different types of language test (2/3)
Content
Proficiency test (theory-based)
Achievement test (syllabus-based)
Language aptitude test

Classifying different types of language test (3/3)
Frame of reference
Norm-referenced test: test results are interpreted with reference to the performance of a given group. Typical norms: mean (average score), standard deviation.
Criterion-referenced test (syllabus-based)
Difference: discriminativeness

Standard deviation and normal distribution
Measure of variability

Test qualities
Reliability: consistency of measurement; minimize error variance
(Brown, 1996: 189)
[Figure]

Test qualities
Construct validity: the extent to which we can interpret a given test score as an indicator of the construct (ability)
Reliability is a necessary but not sufficient condition for construct validity.

Five types of validity
Construct validity
An indication of how representative a test is of an underlying theory of language learning. Construct validation involves an investigation of the qualities that a test measures, thus providing a basis for
the rationale of a test.
Content validity
Describes how well the content of the test samples the subject matter that the course of instruction aimed to teach.
Predictive validity
The degree to which predictions made from the test are confirmed by evidence gathered at some later time.
Concurrent validity
Concerned with the relationship between what is measured by a test (usually a newly developed test) and another existing criterion measure, which may be a well-established standardized test. If the two measures function similarly (i.e., they rank candidates in the same way), they are considered to have concurrent validity.
Face validity
The degree to which a test appears to measure the knowledge or abilities it claims to measure, as judged by an untrained observer.

Relationship between reliability and validity
In order for a test to be valid, it first needs to be reliable.
Investigation of reliability and validity can be viewed as complementary aspects of identifying, estimating, and interpreting different sources of variance in test scores.
Reliability is concerned with determining how much of the variance in test scores is reliable variance, while validity is concerned with determining what abilities contribute to this reliable variance.
Although it is essential to consider both reliability and validity in the development and use of language tests, the distinction between them may not always be clear.
[Figure]

Test qualities
Authenticity: the correspondence of the characteristics of a given language test task to the features of a target language use (TLU) task.
Interactiveness: the extent and type of involvement of the test taker's individual characteristics in accomplishing a test task.
Test taker's individual characteristics: language ability, topical knowledge, and affective schemata.
Both are relative.
(Li, 1997: 175)
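
The concurrent validity slide above describes two measures that "rank candidates in the same way". One common way to check that, offered here only as a sketch and not as the method used in the deck, is a Spearman rank correlation between the new test and the criterion measure. The score lists and all function names below are hypothetical; the code uses only the Python standard library.

```python
from statistics import mean


def average_ranks(scores):
    """Rank scores from 1 (lowest) upward; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of candidates tied with the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(new_test, criterion):
    """Spearman's rho: the Pearson correlation of the two sets of ranks."""
    return pearson(average_ranks(new_test), average_ranks(criterion))


# Hypothetical scores for the same five candidates on both measures.
new_test_scores = [52, 67, 71, 80, 93]
criterion_scores = [48, 75, 60, 78, 90]
rho = spearman_rho(new_test_scores, criterion_scores)
print(f"Spearman rho = {rho:.2f}")
```

A value near 1 means the two measures order the candidates almost identically. How high the coefficient must be before one speaks of concurrent validity is a judgment the source does not specify.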
Test qualities
Impact: impact on society, educational systems, and the individuals within those systems.
Micro level: individuals who are affected by a particular test.
Macro level: society and the educational system.
Practicality: the ways in which the test will be implemented. Human resources, material resources, time

Quality Control
Six qualities of usefulness
Reliability
Construct validity
Authenticity
Interactiveness
Impact
Practicality
Goal: To achieve an appropriate balance among the qualities so that the overall usefulness of the test is maximized.

Describing language ability: language use in language tests
Communicative language ability model
Characteristics of individual language users
Communicative Language Ability

Defining construct
Communicative language ability
Language knowledge
Strategic competence (metacognitive strategies)
Bachman (1990)
Comment
Provides an overall and updated picture of the components of language ability
Lacks acknowledgement of the interactions between components

Strategic competence
Goal setting: deciding what one is going to do
Assessment: taking stock of what is needed, what one has to work with, and how well one has done
Planning: deciding how to use what one
has
Comment: lacks operational instructions, so the actual behavior is hard to pin down.

Other individual characteristics
Personal characteristics: age, sex, native language, education, etc.
Topical knowledge: knowledge structures in long-term memory.
Affective schemata: the affective or emotional correlates of topical knowledge.

Features of communicative language tests
Reflecting components of language use competence in particular contexts
As representative as possible
Clear purpose of testing
The role of context being stressed
Authenticity
Direct and integrated testing
Criterion-referenced
Holistic and qualitative assessment of productive skills
Establishing the theoretical and empirical validity of measures
Enhanced match between teaching, testing, and reality
Cyril J. Weir (1990)

Overview of test development
Development of language test
Define constructs theoretically
There must be agreement on the theoretical definition upon which the test is based.
Define constructs operationally
Quantifying observations: establish a scale
(Bachman & Palmer, 1996: 87)
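
As a small illustration of "quantifying observations", the norm statistics named earlier in the deck (mean and standard deviation) can be computed for a group of scores, and each score can then be read relative to that group. This is only a sketch under stated assumptions: the scores are hypothetical, the population standard deviation is used (a sample formula would give a slightly larger value), and expressing scores in standard deviation units is one common convention, not something the slides prescribe.

```python
from statistics import mean, pstdev


def describe_norm_group(scores):
    """Return (mean, population standard deviation) of a norm group's scores."""
    return mean(scores), pstdev(scores)


def standardize(score, group_mean, group_sd):
    """Express a raw score as its distance from the group mean in SD units."""
    return (score - group_mean) / group_sd


# Hypothetical raw scores for a norm group of five test takers.
norm_group = [60, 70, 80, 90, 100]
group_mean, group_sd = describe_norm_group(norm_group)
relative = [standardize(s, group_mean, group_sd) for s in norm_group]

for raw, rel in zip(norm_group, relative):
    print(f"raw {raw:3d} -> {rel:+.2f} SD from the group mean")
```

Scores below the group mean come out negative and scores above it positive, which is the norm-referenced reading: a result is interpreted against the performance of the group, not against a fixed criterion.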