Keyword Analysis Methods in Corpus Research and Their Extension (语料库研究中主题词分析方法及其扩展.ppt)

Uploaded by: 小飞机 | Document ID: 5841006 | Uploaded: 2023-08-25 | Format: PPT | Pages: 21 | Size: 616 KB


Keyword Analysis Methods in Corpus Research and Their Extension
An extension to the keyword approach in corpus analysis
梁茂成 (Liang Maocheng), 中国外语教育研究中心 (National Research Centre for Foreign Language Education)

Outline
  Keywords
  Applications of corpus comparison
  Limitations to the keyword approach
  Keywords+
  Demo

Keywords

Keywords are words whose frequency is unusually high (or low) in comparison with some norm (Scott, 2003).

Positive keywords: words which occur more often than would be expected by chance in comparison with the reference corpus.

Negative keywords: words which occur less often than would be expected by chance in comparison with the reference corpus.

Positive and negative keywords: in a corpus of business English, words such as "business", "profit" and "companies" are likely to be positive keywords if the corpus is compared with a general corpus.

Positive and negative keywords: in a corpus of academic English, words such as "morning", "afternoon" and "evening" are likely to be negative keywords if the corpus is compared with a general corpus.

Calculating keyness (Rayson et al. 2004; Oakes 1998)
  Chi-square
  Chi-square with Yates correction
  Log-likelihood

Previous research has revealed that log-likelihood is a better measure than chi-square when comparing word frequencies in corpora.

Ways to find keywords
  Top-down: corpus-based
  Bottom-up: corpus-driven

Applications of corpus comparison

  Comparison across users
  Comparison across genres
  Comparison across time
  Comparison across (varieties of) languages

  Compiling a specialized dictionary
  Detecting the topic
  Genre analysis
  Contrastive Interlanguage Analysis

Limitations to the keyword approach

  Do keywords have to be single words? Phraseology seems more interesting!
  Do keywords have to be lexical words? POS tag sequences may also be interesting.
  Can we bring together the bottom-up approach and the top-down approach?

  Top-down: the problem is that I do not yet know what may be interesting.

  Bottom-up: the problem is that I have been given a long list of keywords, only some of which are interesting, buried among many others which do not seem interesting at all.

Keywords+

  Support multiword sequences
  Support online search
  Support POS tag sequences
  Support regex search

Demo

Thank you.
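The keyness measures named in the slides (chi-square, chi-square with Yates correction, log-likelihood) are listed without their formulas in the extracted text. The following is a minimal sketch, assuming the commonly used two-term log-likelihood formulation and the standard Pearson chi-square for a 2x2 contingency table; it is not taken from the presentation or from the Keywords+ tool. All function names (`log_likelihood`, `chi_square`, `ngram_counts`, `keyness_list`) are hypothetical. Because the items being counted can be n-grams as well as single words, the same calculation also covers the multiword sequences mentioned under Keywords+.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one item (assumed two-term form).

    a, b: frequency of the item in the study / reference corpus
    c, d: total number of tokens in the study / reference corpus
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:  # a zero count contributes nothing to the sum
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def chi_square(a, b, c, d, yates=False):
    """Pearson chi-square for the 2x2 table [[a, c - a], [b, d - b]]."""
    n = c + d
    diff = abs(a * (d - b) - b * (c - a))
    if yates:  # Yates continuity correction, floored at zero
        diff = max(0.0, diff - n / 2)
    denom = c * d * (a + b) * (n - a - b)
    return n * diff ** 2 / denom if denom else 0.0


def ngram_counts(tokens, n=1):
    """Count n-grams so that multiword sequences can be scored like words."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def keyness_list(study_counts, ref_counts):
    """Return (item, signed log-likelihood) pairs, most positive first.

    A positive score marks a positive keyword (relatively more frequent in
    the study corpus); a negative score marks a negative keyword.
    """
    c = sum(study_counts.values())
    d = sum(ref_counts.values())
    rows = []
    for item in set(study_counts) | set(ref_counts):
        a = study_counts.get(item, 0)
        b = ref_counts.get(item, 0)
        sign = 1 if a / c >= b / d else -1
        rows.append((item, sign * log_likelihood(a, b, c, d)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


if __name__ == "__main__":
    # Toy frequency lists, invented for illustration only.
    study = {"business": 20, "the": 80}
    reference = {"the": 90, "morning": 10}
    for item, score in keyness_list(study, reference):
        print(f"{item}\t{score:.2f}")
```

In this sketch the sign is attached after the score is computed, so a single sorted list carries positive keywords at the top and negative keywords at the bottom. Both corpora are assumed to be non-empty.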
