Adaptive Character Extraction from Continuous Handwriting Chinese Text Based on Multilevel Constraints

Zhang Xiwen, Gao Xiujuan, Dai Guozhong
Laboratory of Human-Computer Interaction and Intelligent Information Processing, Institute of Software, the Chinese Academy of Sciences, Beijing 100080

Abstract: Character extraction is a prerequisite for recognizing continuous handwritten Chinese text. This paper proposes an adaptive character extraction method based on multilevel information. The satisfaction of a segmentation is defined as the ratio of the number of candidate characters to the variance of the character widths, so the method aims to extract more characters with a smaller spread in width. Strokes are processed one line at a time. A multilevel tree is first built from the horizontal gaps between the minimum bounding rectangles of the candidate characters. Each candidate character in the most satisfactory level is then analyzed across levels and processed adaptively. If a candidate is wider than the larger character-width value, its child nodes in the lower levels are traversed to split it; if it is narrower than the smaller character-width value, its parent nodes in the upper levels are traversed to merge it. Candidates that increase the satisfaction are selected, and the character-width thresholds are updated at the same time. Test results show that the method segments continuous handwritten Chinese text well.

Keywords: continuous handwriting Chinese text, character extraction, tree representation

CLC number: TP391

About the authors:
Zhang Xiwen, male, born in 1971 in Dalian, Liaoning; associate researcher. His main research interests are continuous handwritten Chinese processing, multimodal fusion, and pattern recognition. Address: Room 305, Building 4, Laboratory of Human-Computer Interaction and Intelligent Information Processing, Institute of Software, Chinese Academy of Sciences, No. 4 South Fourth Street, Zhongguancun, Haidian District, Beijing 100080. Tel: 010-62540434. E-mail: zxwiel_
Gao Xiujuan, female, born in 1977 in Zunhua, Hebei; research assistant. Her main research interests are pen-based interaction, pattern recognition, and artificial intelligence.
Dai Guozhong, male, born in 1944 in Wuxi, Jiangsu; researcher and doctoral supervisor. His main research areas are human-computer interaction and computer graphics.

1 Introduction

For entering text into a computer, handwriting matches people's pen-and-paper writing habits better than a keyboard does and better preserves a natural, fluent, continuous way of writing. Handwriting devices such as electronic pens are maturing [1], and a large amount of handwritten text awaiting recognition has accumulated. Character extraction is an unavoidable prerequisite for recognizing continuous handwritten Chinese: a wrongly extracted character cannot yield a correct recognition result. Recognition errors on single characters can be corrected automatically by contextual post-processing of the recognition results [2,3], but such post-processing cannot repair character extraction errors. Therefore, to recognize continuous handwritten Chinese well, character extraction must be highly accurate.
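A Chinese character can be decomposed into radicals, and radicals can in turn be decomposed into strokes. Within a radical, strokes may be isolated, crossing, intersecting, or connected. Within a character, radicals may be arranged top-bottom, top-middle-bottom, left-right, left-middle-right, half-enclosed, or fully enclosed. In handwriting, strokes and radicals are placed somewhat arbitrarily, and both character width and character gap vary: the strokes and radicals of one character may lie far apart, while adjacent characters may lie close together. Chinese text also contains punctuation, symbols, digits, letters, and words in addition to complex Chinese characters. All of this makes handwritten Chinese segmentation very difficult.

Existing character extraction methods make far too little use of the multilevel information carried by strokes, so their results are unsatisfactory. Within one line of strokes, character widths are fairly consistent, and so are character gaps. This paper therefore takes the line of strokes as its processing unit. A multilevel tree can be built for a line of strokes from the gaps between candidate characters, and the extraction of a character depends both on the adjacent candidates in the same level and on the related candidates in the levels above and below. The stroke tree thus supplies multilevel information for character extraction. On this basis, this paper proposes an adaptive character extraction method based on multilevel information for segmenting continuous handwritten Chinese.

2 Related Work

Continuous handwritten Chinese consists of handwritten strokes. A handwritten stroke is the sequence of point coordinates and other information recorded between pen-down and pen-up, and it may contain several strokes of a Chinese character. Compared with Chinese characters, punctuation marks, symbols, digits, and letters contain very few strokes and have simple structures. Japanese and Korean characters resemble Chinese characters in many ways and are also multi-stroke, but they are fewer in number and simpler in structure.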
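According to the information they use, existing character extraction methods (for Chinese characters, Japanese, Korean, words, letters, digits, and so on) fall into three kinds.

(1) Methods based on the gaps between candidate characters

C. Hong et al. [4] first segment continuous handwritten Chinese with several character-gap thresholds to obtain multiple segmentation results, then select the best two according to the variance of the character gaps. Without increasing that variance, they merge neighboring candidates and split wide candidates, and finally use recognition results to extract the characters. The gap between candidates is the horizontal distance between their minimum bounding rectangles. Lin Yu Tseng et al. [5] also compute character gaps from minimum bounding rectangles; they first merge strokes using knowledge of Chinese character structure and then merge candidate characters further with dynamic programming. Their method handles overlapping and touching characters in most cases, but it sometimes fails on characters whose radicals lie far apart and on adjacent characters that lie close together. Zhao Yuming et al. [6] likewise compute character gaps from minimum bounding rectangles and merge strokes step by step using structural knowledge of Chinese character strokes to extract individual characters. Their method can also partly solve the extraction of touching characters. The latter two methods rely on many empirical thresholds, for example a character-width threshold and a threshold on the ratio of the overlap of two minimum bounding rectangles to the area of the smaller one, so their adaptivity is low.

(2) Methods based on fused temporal and spatial distance between candidate characters

To build a multilevel tree representation of several lines of strokes, Patrick Chiu et al. [7] define a stroke distance that fuses the time interval between strokes with their spatial distance (in both the x and y directions). The method repeatedly merges the closest candidates to form the levels of the tree. That work deals with Japanese and digits and only gives the tree representation of the strokes; it does not address how to extract characters (digits, Japanese) from the tree automatically.

(3) Methods based on recognition results

C. Hong et al. [4] first extract characters from candidate gaps, then add the recognition results of the candidates to build a candidate character lattice, and finally search the lattice for the best path using recognition scores and language model scores to obtain the extraction result. That work gives neither the computation of the language model score nor the search method.

The third kind of method brings candidate recognition results into character extraction, using recognition scores and language model scores. This demands a high-performance recognizer and language model, because both recognition errors and sentence-level misinterpretation cause extraction errors. Although the method uses information from several levels, it does so insufficiently: it builds only five levels and pays little attention to adaptivity. The remaining methods use information from a single level only. Because Chinese characters are structurally complex and handwriting is arbitrary, single-level information is not enough to judge whether an extraction result is right or wrong; multilevel information must be combined. This paper therefore proposes an adaptive character extraction method based on multilevel information. A line of strokes is built into a multilevel tree, and the extraction of a character depends both on adjacent candidates in the same level and on related candidates in the levels above and below, which considerably improves extraction accuracy.

3 Building a Multilevel Tree Representation of a Line of Strokes from Candidate Character Gaps

Strokes that are close in time are also close in space, whereas strokes that are close in space are not necessarily close in time. A character requires its strokes to be close in space, not in time, and spatial proximity already covers the strokes that are close in time. This paper therefore uses only the spatial gaps between candidate characters for character extraction. If the horizontal gap between a stroke and the next stroke is very large, approaching the width spanned by the strokes written so far, the stroke is taken as the last stroke of the current line, and the line of strokes can then be extracted.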
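The tree representation of a line of strokes is built from the gaps between candidate characters. Depending on how a character is represented in space, the gap between characters (or strokes) can be computed in four ways [8]: (1) the horizontal distance between the minimum bounding rectangles of the characters, (2) the distance between the convex hulls of the characters, (3) the horizontal distance between the strokes of the characters, and (4) the distance between the strokes of the characters. This paper builds the tree from the horizontal gap between the minimum bounding rectangles of candidate characters, which gives good extraction results; Section 3.2 gives the basis for this choice.

3.1 Building the Tree Representation of a Line of Strokes

The initial level of the stroke tree consists of the strokes themselves, which are the leaf nodes. The tree is built bottom-up. Each new level is built from the minimum character gap of the current top level: adjacent candidates whose gap does not exceed this minimum gap are merged into new tree nodes, which form the new level. The process repeats until the top level contains a single candidate character. The steps of the algorithm are as follows.

Step 1. Treat each stroke as a candidate character and build the initial level of the stroke tree.
Step 2. If the top level of the stroke tree contains only one candidate character, go to Step 7.
Step 3. Compute the minimum character gap of the top level of the stroke tree.
Step 4. Take candidate character i from the top level and create a new tree node from it; the node's level index is the current total number of levels in the tree.
Step 5. While the gap between the node and candidate character i+1 does not exceed the minimum gap: merge candidate i+1 into the node, add it to the node's child indices, set the child's parent index to the node, and set i = i+1. Repeat Steps 4 and 5 until every candidate of that level belongs to a node.
Step 6. Return to Step 2.
Step 7. The tree representation of the line of strokes is complete.

Figure 1a shows a line of continuous handwritten Chinese containing Chinese characters and punctuation. Figure 1b shows the multilevel tree representation of that line of strokes.

Figure 1. Character extraction based on the stroke tree: (a) a line of continuous handwritten Chinese; (b) the multilevel tree representation of the line of strokes; (c) the character extraction process; (d) child nodes to be split and their regrouping results; (e) the character extraction result.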
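The construction steps of Section 3.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each stroke is reduced to the horizontal extent `(left, right)` of its minimum bounding rectangle, and the names `Node`, `gap`, and `build_stroke_tree` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float]  # (left, right) extent of a bounding rectangle on the x axis


@dataclass
class Node:
    left: float
    right: float
    level: int
    children: List["Node"] = field(default_factory=list)

    @property
    def width(self) -> float:
        return self.right - self.left


def gap(a: Node, b: Node) -> float:
    """Horizontal gap between two boxes; overlapping boxes give a value <= 0."""
    return max(a.left, b.left) - min(a.right, b.right)


def build_stroke_tree(strokes: List[Box]) -> List[List[Node]]:
    """Return the levels of the tree, from the strokes (level 0) up to the root."""
    # Step 1: one candidate per stroke, ordered left to right (an assumption).
    level = [Node(l, r, 0) for l, r in sorted(strokes)]
    levels = [level]
    # Step 2: stop when the top level holds a single candidate.
    while len(level) > 1:
        # Step 3: minimum gap between adjacent candidates of the top level.
        min_gap = min(gap(a, b) for a, b in zip(level, level[1:]))
        # Steps 4-5: grow a node while the next candidate is within min_gap.
        new_level: List[Node] = []
        for cand in level:
            if new_level and gap(new_level[-1], cand) <= min_gap:
                parent = new_level[-1]
                parent.children.append(cand)
                parent.left = min(parent.left, cand.left)
                parent.right = max(parent.right, cand.right)
            else:
                new_level.append(Node(cand.left, cand.right, len(levels), [cand]))
        level = new_level
        levels.append(level)  # Step 6: repeat with the new top level.
    return levels
```

Because the pair of neighbors that attains the minimum gap is always merged, every pass produces a strictly shorter level, so the loop ends with a single root. For example, `build_stroke_tree([(0, 2), (3, 5), (9, 11), (12, 14)])` yields three levels: four strokes, two candidates, and one root.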
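The stroke tree of a line contains the candidate extraction results for different character gaps as well as the links between candidates in adjacent levels. Following the tree bottom-up merges strokes and radicals into candidate characters; following it top-down splits candidate characters into radicals and strokes.

3.2 Choosing the Character Gap Measure

The way the character gap is computed directly affects the quality and speed of character extraction. If the stroke tree does not contain the correct character, tree traversal alone cannot extract it. The gap measure determines both the total number of candidate characters in the stroke tree and the number of correct characters among them, where identical correct characters are counted once. A tree with fewer candidates and more correct characters indicates a better gap measure. The priority of a gap measure is therefore defined from these two counts; the priority values in Table 1 equal the number of correct characters divided by the total number of candidate characters. Extensive experiments show that the horizontal distance between minimum bounding rectangles gives the best tree representation for the extraction method proposed here. Table 1 compares the stroke trees built for Figure 1a with the four gap measures described above.

Table 1. Performance of the stroke trees built with the four character gap measures

No.   Candidate characters   Correct characters   Priority
1     112                    9                    0.080
2     93                     7                    0.075
3     142                    9                    0.063
4     140                    8                    0.057

4 Adaptive Character Extraction Based on the Stroke Tree

In the stroke tree, candidate characters in the same level and in adjacent levels are linked to one another, which provides a good basis for character extraction based on multilevel information. If a level of the tree contains many candidate characters and the variance of their widths is small, that level is taken as the initial extraction result. The lowest level uses the raw strokes as candidates and therefore has the most candidates, while the highest level has a single candidate and hence no width variance. Neither of these two levels can be the best level of the tree, so they are not considered.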
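The satisfaction of a character extraction result is defined as the ratio of the number of candidate characters to the variance of the character widths. The level of the stroke tree with the highest satisfaction is taken as the best level for candidate extraction, and each candidate character in that level undergoes multilevel analysis and adaptive processing.

Character widths are divided into three classes, small, normal, and large, determined from the median width. The widths of the extraction result are sorted in ascending order. The median of the widths below the overall median is the smaller character-width value, and the median of the widths above the overall median is the larger character-width value. Widths between the smaller and the larger value are normal, and candidates with normal widths are regarded as correct characters. For a candidate wider than the larger value, its child nodes in the lower levels of the tree are traversed to split it. For a candidate narrower than the smaller value, its parent nodes in the upper levels are traversed to merge it, but it is never merged with a character already marked as correct. During splitting or merging, candidates that increase the satisfaction are selected and the character-width thresholds are updated. The final result is the extraction with the highest satisfaction.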
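The satisfaction measure and the two width thresholds described above can be sketched as follows. This is an illustration under assumptions rather than the paper's code: each tree level is represented simply by the list of its candidate widths, the population variance is used because the paper does not say which variance it means, and the function names and the handling of zero variance or empty halves are hypothetical choices.

```python
from statistics import median, pvariance
from typing import List, Tuple


def satisfaction(widths: List[float]) -> float:
    """Number of candidates divided by the variance of their widths."""
    var = pvariance(widths)
    # Assumption: identical widths (zero variance) count as maximally satisfying.
    return float("inf") if var == 0 else len(widths) / var


def best_level(levels: List[List[float]]) -> int:
    """Index of the most satisfying level, skipping the lowest and highest levels.

    Assumes the tree has at least three levels.
    """
    inner = range(1, len(levels) - 1)
    return max(inner, key=lambda i: satisfaction(levels[i]))


def width_thresholds(widths: List[float]) -> Tuple[float, float]:
    """Smaller and larger width values: medians of the widths below and above the median."""
    mid = median(widths)
    below = [w for w in widths if w < mid]
    above = [w for w in widths if w > mid]
    # Assumption: fall back to the overall median when one side is empty.
    small = median(below) if below else mid
    large = median(above) if above else mid
    return small, large


def classify(width: float, small: float, large: float) -> str:
    """Decide how a candidate is treated: merged upward, split downward, or accepted."""
    if width < small:
        return "merge"
    if width > large:
        return "split"
    return "normal"
```

With widths `[2, 4, 10, 11, 12, 20, 30]` the overall median is 11, so the smaller value is 4 (median of 2, 4, 10) and the larger value is 20 (median of 12, 20, 30); a candidate of width 10 is then normal, one of width 2 is a merge candidate, and one of width 30 is a split candidate.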
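The steps of the adaptive character extraction algorithm based on the stroke tree are as follows.

Step 1. Compute the larger character-width value, the smaller character-width value, and the satisfaction of the candidates in the best level of the stroke tree.
Step 2. Take tree node i from the best level of the stroke tree; if no node remains, go to Step 6.
Step 3. If the width of the node is normal, the candidate is a correct character; set i = i+1 and return to Step 2.
Step 4. If the width of the node is less than the smaller value, take its parent nodes in the upper levels (those that do not absorb a correct character) one after another until the satisfaction no longer increases, and take the last such parent as the extraction result. Update the larger and smaller width values, set i = i+1, and return to Step 2.
Step 5. If the width of the node is greater than the larger value, take the regrouping results of its child nodes in the lower levels one after another until the satisfaction no longer increases, and take the last regrouping result as the extraction result. Update the larger and smaller width values, set i = i+1, and return to Step 2.
Step 6. Character extraction ends with the extraction result of highest satisfaction.

Child nodes are regrouped from left to right. Figure 1d shows the child nodes of the 2nd, 5th, and 9th candidates of level 3 together with their regrouping results. For each candidate to be split, the regrouping result with the highest satisfaction is chosen. The 2nd and 5th candidates are not split. The 9th candidate is split into two new candidates according to its third regrouping result: the first character is the 1st child node, and the second character is the combination of the 2nd and 3rd child nodes.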
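The left-to-right regrouping of child nodes can be sketched as below. This is a simplified illustration, not the paper's procedure: instead of stopping as soon as the satisfaction no longer increases, it enumerates every grouping of the children into contiguous runs and keeps the best one. The names `regroupings` and `best_split`, the `(left, right)` box representation, and the use of the population variance are assumptions.

```python
from itertools import combinations
from statistics import pvariance
from typing import Iterator, List, Sequence, Tuple

Box = Tuple[float, float]  # (left, right) extent of a bounding rectangle


def satisfaction(widths: List[float]) -> float:
    """Number of candidates divided by the (population) variance of their widths."""
    var = pvariance(widths)
    return float("inf") if var == 0 else len(widths) / var


def regroupings(children: Sequence[Box]) -> Iterator[List[Sequence[Box]]]:
    """Yield every grouping of the children into contiguous left-to-right runs."""
    n = len(children)
    for k in range(n):  # k = number of cut points; k == 0 keeps the node unsplit
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [children[a:b] for a, b in zip(bounds, bounds[1:])]


def best_split(others: List[float], children: Sequence[Box]) -> List[Box]:
    """Return the merged boxes of the grouping that maximizes the line's satisfaction.

    `others` holds the widths of the remaining candidates on the line.
    """
    def boxes(grouping: List[Sequence[Box]]) -> List[Box]:
        return [(min(l for l, _ in g), max(r for _, r in g)) for g in grouping]

    def score(grouping: List[Sequence[Box]]) -> float:
        return satisfaction(others + [r - l for l, r in boxes(grouping)])

    return boxes(max(regroupings(children), key=score))
```

For a wide candidate with three children `[(0, 9), (11, 14), (15, 21)]` on a line whose other characters are all 10 units wide, `best_split([10, 10, 10], ...)` returns `[(0, 9), (11, 21)]`: the first child stands alone and the second and third are combined, the same pattern as the 9th candidate in Figure 1d. Exhaustive enumeration grows as 2^(n-1) in the number of children, which is acceptable only because a candidate has few children.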
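Figure 1c shows the extraction process for the stroke tree of Figure 1b. Level 3 of the tree is the best level. The 1st, 7th, and 8th candidates of the best level have normal widths and are drawn with dashed minimum bounding rectangles. The 3rd, 4th, 6th, and 10th candidates are narrower than the smaller width value. Of these, the 3rd and 4th candidates are merged into the 3rd character of level 4, while the 6th and 10th candidates are not merged and are confirmed as the 3rd character of level 7 and the 4th character of level 8, respectively. The 2nd, 5th, and 9th candidates are wider than the larger width value and undergo child-node regrouping: the 2nd and 5th candidates are not split, and the child nodes of the 9th candidate are regrouped into two new characters. Figure 1e shows the extraction result for Figure 1a; all 10 characters are extracted correctly.

5 Performance Evaluation

Based on the proposed method, the authors developed a software prototype in VC++. The prototype runs on a PC with Windows 2000. The performance of the method is evaluated below from the segmentation results of a large amount of continuous handwritten Chinese and their quantitative analysis.

5.1 Experimental Results

The continuous handwritten Chinese was entered with the 声位笔 pen from 北京中文之星数码科技有限公司 [9]. The pen has a spatial resolution of 100 dpi and a sampling rate of 60 points per second. Figure 2 shows several lines of continuous handwritten Chinese in which the extracted characters are marked with bounding rectangles; the extraction accuracy is 100%.

Figure 2. Character extraction results for continuous handwritten Chinese

Character extraction was tested on the prototype with a large amount of continuous handwritten Chinese. Table 2 gives part of the results, including the extraction accuracy, the under-merge rate, the over-merge rate, and the processing speed. The processing speed was measured on a PC with a 1.4 GHz CPU and 192 MB of RAM.

Table 2. Performance of the adaptive character extraction method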
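No.          Characters   Accuracy   Speed (chars/s)   Under-merge rate   Over-merge rate
1 (Fig. 1)   10           100%       100               0%                 0%
2 (Fig. 2)   20           100%       102               0%                 0%
3            24           94%        98                4%                 2%
4            27           95%        105               3%                 2%
5            33           92%        97                6%                 2%

5.2 Discussion of the Experimental Results

Higher accuracy, a lower under-merge rate, and a lower over-merge rate indicate better extraction quality, and a higher processing speed indicates better extraction efficiency. In Table 2 the lowest accuracy is 92%, the highest under-merge rate is 6%, and the highest over-merge rate is 2%, which shows that the proposed method extracts characters well. The extraction speed is about 100 characters per second; an A4 page usually holds about 1000 characters, so a page can be processed in about 10 seconds.

According to the experimental results, the proposed method achieves its extraction accuracy mainly through three strategies:
(1) It uses a multilevel tree representation of the line of strokes, which supplies ample candidate characters for correct extraction.
(2) When extracting a character, it uses information not only from adjacent candidates in the same level but also from related candidates in the levels above and below, which makes it highly adaptive.
(3) It does not need character recognition results, which lowers the computational complexity and avoids the adverse effects of recognition errors and sentence-level misinterpretation.

6 Conclusion

This paper presents an adaptive segmentation method for continuous handwritten Chinese based on multilevel information. The method takes the line of strokes as its processing unit, so the local consistency of character gap and character width is better assured, which makes the method adaptive and robust. The levels of the stroke tree are built step by step from the character gaps, so the tree covers more of the correct characters. Characters are extracted by traversing the stroke tree, which uses information from several levels and improves extraction accuracy. The analysis of the test results shows that the proposed method is effective and robust and segments continuous handwritten Chinese well. The method should be improved further by reducing feedback computation and raising the quality and speed of extraction. Revising the extraction results with recognition results, on top of the output of this method, should give better results; this work is in progress.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (60033020), the 863 Program (2001AA114170), and the 973 Program (2002CB312103).

References

[1] L. Schomaker. From handwriting analysis to pen-computer applications [J]. Electronics & Communication Engineering Journal, 1998, 6: 94-102.
[2] Xu Zhiming, Wang Xiaolong, Zhang Kai, Guan Yi. Research on post-processing techniques for online handwritten Chinese character recognition [J]. 计算机研究与发展, 1999, 36(5): 608-612. (in Chinese)
[3] Li Yuanxiang, Ding Xiaoqing, Wu Youshou. A new contextual processing method for Chinese character recognition based on combining characters and words [J]. 计算机研究与发展, 2002, 39(7): 838-842. (in Chinese)
[4] C. Hong, G. Loudon, Y. Wu, R. Zitserman. Segmentation and recognition of continuous handwriting Chinese text [A]. In International Conference on Computer Processing of Oriental Languages, 1997: 630-633.
[5] Lin Yu Tseng, Rung Ching Chen. Segmenting handwritten Chinese characters based on heuristic merging of stroke bounding boxes and dynamic programming [J]. Pattern Recognition Letters, 1998, 19(8): 963-973.
[6] Zhao Yuming, Jiang Xingzhi, Shi Pengfei. An offline handwritten Chinese character segmentation algorithm based on stroke extraction and merging [J]. 红外与激光工程, 2002, 31(1): 23-27. (in Chinese)
[7] Patrick Chiu, Lynn Wilcox. A dynamic grouping technique for ink and audio notes [A]. In Proceedings of the 11th Annual ACM Symposium on User Interface Software and Technology, San Francisco, CA, Nov. 1-4, E. Mynatt and R. J. K. Jacob, Eds. ACM Press, New York, 1998: 195-202.
[8] Soo H. Kim, S. Jeong, Guee-Sang Lee, Ching Y. Suen. Word segmentation in handwritten Korean text lines based on gap clustering techniques [A]. Proc. 6th International Conference on Document Analysis and Recognition, Seattle, USA, Sep. 2001: 189-19.
[9] 北京中文之星数码科技有限公司,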