PENG Weiming, SONG Jihua, YU Shiwen. Lexical Issues in Chinese Information Processing: in the Background of Sentence-based Diagram Treebank Construction[J]. Journal of Chinese Information Processing, 2014, 28(2): 1-7.
Lexical Issues in Chinese Information Processing: in the Background of Sentence-based Diagram Treebank Construction
PENG Weiming¹, SONG Jihua², YU Shiwen¹
1. MOE Key Laboratory of Computational Linguistics (Peking University), Institute of Computational Linguistics, Peking University, Beijing 100871, China; 2. College of Information Science and Technology, Beijing Normal University, Beijing 100875, China
Abstract: This paper compares the Sentence-based Diagram Treebank with existing lexical specifications with respect to word segmentation units and POS tagging, revealing a disconnect between automatic lexical analysis and parsing in current Chinese information processing. It describes how the Diagram Treebank parses certain special structures, such as nonce formations and idioms, and gives the linguistic rationale for these strategies. It also explores how Chinese word-class theories such as “For All Words, the Word-class Is Based on the Sentence” and “Referentiality” can be implemented in Chinese information processing.