Article
ZHANG Hainan, WU Dayong, LIU Yue, CHENG Xueqi
2017, 31(4): 28-35.
Chinese named entity recognition (NER) is challenged by implicit word boundaries, the lack of capitalization, and the polysemy of a single character across different words. This paper proposes a novel character-word joint encoding method within a deep learning framework for Chinese NER. The method reduces the impact of incorrect word segmentation and a sparse word dictionary that affects word-only embedding, while compensating for the missing context that limits character-only embedding. Experiments on the 1998 People's Daily corpus show good results: improvements of at least 1.6%, 8% and 3% on location, person and organization recognition, respectively, compared with using character or word features alone; and F1 scores of 96.8%, 94.6% and 88.6% on location, person and organization recognition, respectively, when part-of-speech features are integrated.
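The abstract does not spell out how the joint encoding is built, so the following is only a minimal sketch of one plausible reading: each character is represented by its own embedding concatenated with the embedding of the word that contains it, with a zero vector standing in for words missing from the dictionary. The function names `joint_encode` and `make_table`, the embedding dimensions, the random initialisation, and the sample segmentation are all illustrative assumptions, not the authors' implementation.

```python
import random


def make_table(keys, dim, seed):
    """Build a toy embedding table with random vectors (hypothetical helper)."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for k in keys}


def joint_encode(words, char_emb, word_emb, unk_char, unk_word):
    """Return one vector per character: [char embedding ; containing-word embedding].

    Assumption: concatenation is used to combine the two views; the paper may
    combine them differently.
    """
    vectors = []
    for word in words:
        # Fall back to the unknown-word vector when the dictionary is sparse.
        w_vec = word_emb.get(word, unk_word)
        for ch in word:
            c_vec = char_emb.get(ch, unk_char)
            vectors.append(c_vec + w_vec)  # list concatenation
    return vectors


# Hypothetical segmenter output for a short phrase.
words = ["北京", "大学"]
char_emb = make_table("北京大学", 4, seed=0)  # one entry per character
word_emb = make_table(["北京"], 3, seed=1)    # "大学" deliberately left out
unk_char = [0.0] * 4
unk_word = [0.0] * 3

encoded = joint_encode(words, char_emb, word_emb, unk_char, unk_word)
```

In this sketch a character inside an out-of-dictionary word still carries its character embedding, which illustrates how a joint representation could soften the effect of a sparse word dictionary; a sequence labeller would then consume `encoded` one character at a time.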