ZHAI Haijun1, GUO Jiafeng2, WANG Xiaolei2, XU Hongbo2
1. University of Science & Technology of China, Hefei, Anhui 230027, China; 2. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Mining named entities from query logs is an important research topic in data mining. Previous work proposed a seed-based framework that mines named entities from query logs by leveraging distributional similarity, but it works well only when each named entity belongs to a single semantic class. In practice, named entities often belong to multiple classes. In this paper, we introduce a weakly supervised topic model that resolves the class ambiguity of named entities by leveraging weak supervision from humans. Experimental results show that our approach significantly outperforms the previous method.
Key words: computer application; Chinese information processing; named entity; query log; topic model