A Chinese Automatic Summarization System Based on Rules and Statistics

FU Jian-lian, CHEN Qun-xiu

Journal of Chinese Information Processing ›› 2006, Vol. 20 ›› Issue (5): 12-18.

Research on Automatic Summarization Based on Rules and Statistics for Chinese Texts

  • FU Jian-lian, CHEN Qun-xiu

Abstract

Automatic summarization is an important research topic in natural language processing. Building on traditional methods, this paper presents an approach to automatic summarization of Chinese texts. For text structure analysis, we propose a topic segmentation algorithm based on the similarity of consecutive paragraphs, which gives the summary of a multi-topic article more comprehensive content and a more balanced structure. In addition, a set of rules is applied to post-process the draft summary so that the final summary is more readable. Finally, a new evaluation method (F-new-measure) is proposed and used to test the system. System tests show that the evaluation score remains fairly stable across different summary compression ratios.
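The abstract names the segmentation idea but gives no formulas, so the following is only a minimal sketch of how topic boundaries might be found from the similarity of consecutive paragraphs under a vector space model. The term-frequency weighting, the cosine measure, the fixed `threshold` value, and the function names `cosine` and `segment_topics` are all assumptions made for illustration and are not taken from the paper. Paragraphs are assumed to be already word-segmented into token lists.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def segment_topics(paragraphs, threshold=0.2):
    """Group paragraph indices into topic segments.

    paragraphs: list of token lists (one list per paragraph).
    threshold:  hypothetical cut-off; a new segment starts whenever the
                similarity between a paragraph and its predecessor
                falls below this value.
    """
    if not paragraphs:
        return []
    vectors = [Counter(p) for p in paragraphs]
    segments = [[0]]
    for i in range(1, len(vectors)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            segments.append([i])      # low similarity: topic boundary
        else:
            segments[-1].append(i)    # same topic continues
    return segments


# Toy input: two paragraphs about summarization, two about markets.
docs = [
    ["summary", "text", "sentence"],
    ["summary", "sentence", "weight"],
    ["stock", "market", "price"],
    ["market", "price", "index"],
]
print(segment_topics(docs))
```

Once segments are known, a summarizer could draw sentences from every segment rather than from the highest-scoring paragraphs alone, which is one plausible way to obtain the content coverage and structural balance the abstract describes.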

Key words

computer application / Chinese information processing / automatic summarization / vector space model / topic segmentation / readability / evaluation

Cite this article

FU Jian-lian, CHEN Qun-xiu. Research on Automatic Summarization Based on Rules and Statistics for Chinese Texts. Journal of Chinese Information Processing. 2006, 20(5): 12-18
