Chinese Semantic Role Labeling Based on Semantic Chunking

DING Weiwei, CHANG Baobao

Journal of Chinese Information Processing, 2009, Vol. 23, Issue (5): 53-62.
Survey

Abstract

In recent years, Chinese semantic role labeling (SRL) has attracted considerable attention. Most existing systems are built on parse trees: they identify which nodes (constituents) of the tree are arguments and then classify them. This paper proposes a different strategy, which we call SRL based on semantic chunking. It replaces the traditional "parsing → semantic role identification → semantic role classification" process with a simpler "semantic chunk identification → semantic chunk classification" pipeline. Semantic chunking, named by analogy with syntactic chunking, identifies the semantic chunks of a sentence, namely the arguments of its verbs. This turns Chinese SRL from a node classification problem into a sequence labeling problem, to which we apply conditional random fields and obtain good results. Because the parsing stage is removed, SRL no longer depends on a parser, which is usually the bottleneck in both speed and accuracy. Experiments show that our approach outperforms the previously best-reported methods on Chinese SRL while greatly reducing processing time. A comparison also shows that results on automatic word segmentation and POS tagging still lag well behind those on gold-standard segmentation and POS tagging.
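To make the reformulation concrete, the sketch below shows one way argument spans can be written as one label per word, which is what lets a sequence labeler replace node classification on a parse tree. It is only an illustration under stated assumptions: the IOB2 tag scheme, the function names `spans_to_iob` and `iob_to_spans`, and the example sentence are all hypothetical and are not taken from the paper, whose actual chunk representation and features are not given in this abstract.

```python
from typing import List, Tuple

# Hypothetical argument span: (start, end_exclusive, role_label).
Span = Tuple[int, int, str]


def spans_to_iob(n_words: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping argument spans as one IOB2 tag per word."""
    tags = ["O"] * n_words
    for start, end, role in spans:
        if not (0 <= start < end <= n_words):
            raise ValueError(f"span out of range: {(start, end, role)}")
        if any(t != "O" for t in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, role)}")
        tags[start] = f"B-{role}"          # first word of the chunk
        for i in range(start + 1, end):
            tags[i] = f"I-{role}"          # remaining words of the chunk
    return tags


def iob_to_spans(tags: List[str]) -> List[Span]:
    """Decode IOB2 tags back into (start, end_exclusive, role) spans."""
    spans: List[Span] = []
    start, role = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes the last chunk
        continues = tag.startswith("I-") and role == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, role))
            start, role = None, None
        # Open a chunk on B-, or leniently on an I- that has nothing to continue.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, role = i, tag[2:]
    return spans


# Illustrative sentence (assumed, not from the paper); the predicate is "调查".
words = ["警方", "调查", "事故", "原因"]
tags = spans_to_iob(len(words), [(0, 1, "ARG0"), (2, 4, "ARG1")])
print(tags)  # ['B-ARG0', 'O', 'B-ARG1', 'I-ARG1']
```

With labels in this form, any sequence model that predicts one tag per word (the paper uses conditional random fields) can be trained on segmented, POS-tagged text alone, and `iob_to_spans` recovers the labeled arguments from its output.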

Key words

computer application / Chinese information processing / semantic role labeling / semantic chunking / conditional random fields / sequence labeling

Cite this article

DING Weiwei, CHANG Baobao. Chinese Semantic Role Labeling Based on Semantic Chunking. Journal of Chinese Information Processing. 2009, 23(5): 53-62

Funding

National Natural Science Foundation of China (60303003); National Social Science Foundation of China (06BYY048)