YOU Fang, LI Juan-zi, WANG Zuo-ying
2003, 17(1): 46-53.
Corpora are important resources for knowledge acquisition in natural language processing. To support sentence understanding, we are constructing a large-scale Chinese corpus based on semantic dependency relations. This paper introduces the tagging formalism we adopt, the tag set we choose, the tagging tool we have developed, and the method we use to keep the tagging consistent. The corpus under discussion contains about 1 million words. Each sentence in the corpus, which is already annotated with word senses, is further tagged with its semantic structure using 70 semantic dependency relations. The main strength of this corpus is its ability to describe the various relations between Chinese words effectively. This benefits from using HowNet as a reference and combining it with actual language use. The corpus can provide further knowledge support for sentence understanding, content-based information retrieval, and related tasks.