ZHANG Longkai, WANG Houfeng. Research on Sentence Extraction in Text Summarization[J]. Journal of Chinese Information Processing, 2012, 26(2): 97-102.
Research on Sentence Extraction in Text Summarization
ZHANG Longkai, WANG Houfeng
Key Laboratory of Computational Linguistics (Ministry of Education), Peking University, Beijing 100871, China
Abstract: Extractive summarization selects important sentences from the original text and reorganizes them into a summary. In this paper we propose a method to automatically identify significant sentences. The basic idea is to label each sentence with one of two tags using a Conditional Random Field (CRF) sequence labeling model. Because a summary contains far fewer sentences than the original text, the model tends to reject many sentences; we therefore introduce a correction factor to correct this label bias. Experimental results show that the proposed method achieves good performance.
Key words: text summarization; sentence extraction; CRF
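The abstract does not say how the correction factor enters the model, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the factor is an additive bonus on the "summary" tag at decoding time, and that per-sentence tag scores and tag-transition scores are already available from some trained linear-chain model. The function name `viterbi_with_correction`, the parameter `correction`, and all numeric scores are hypothetical.

```python
from typing import List, Sequence


def viterbi_with_correction(
    emissions: Sequence[Sequence[float]],    # emissions[t][y]: score of tag y for sentence t
    transitions: Sequence[Sequence[float]],  # transitions[a][b]: score of tag a followed by tag b
    correction: float = 0.0,                 # hypothetical bonus for the "summary" tag (assumption)
) -> List[int]:
    """Return the highest-scoring tag sequence (0 = skip, 1 = summary)."""
    if not emissions:
        return []
    n_tags = 2
    # Assumption: the correction is simply added to the score of tag 1 everywhere.
    adjusted = [[e[0], e[1] + correction] for e in emissions]

    score = list(adjusted[0])    # best path score ending in each tag so far
    back: List[List[int]] = []   # back[t-1][y]: best previous tag when sentence t gets tag y
    for t in range(1, len(adjusted)):
        new_score, pointers = [], []
        for y in range(n_tags):
            best_prev = max(range(n_tags), key=lambda p: score[p] + transitions[p][y])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][y] + adjusted[t][y])
        score = new_score
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    tag = max(range(n_tags), key=lambda y: score[y])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path


# Made-up scores for four sentences; the "skip" tag dominates without correction.
scores = [[1.0, 0.2], [0.6, 0.5], [1.0, 0.1], [0.4, 0.9]]
no_transitions = [[0.0, 0.0], [0.0, 0.0]]
uncorrected = viterbi_with_correction(scores, no_transitions)
corrected = viterbi_with_correction(scores, no_transitions, correction=0.3)
```

With these made-up scores, the uncorrected decoder keeps only the last sentence, while a correction of 0.3 also admits the second one. This illustrates the kind of imbalance the abstract describes: when most training sentences carry the "skip" tag, borderline sentences are rejected unless the decision is shifted toward the minority tag.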