Journal of Chinese Information Processing

Question Answering, Dialogue System and Machine Reading Comprehension

Question Generation Based on Information Features of Token Position

Dong Xiaozheng, Hong Yu, Zhu Fenhong, Yao Jianmin, Zhu Qiaoming

2019, 33(8): 93-100.

The question generation task aims to automatically generate one or more questions from a declarative sentence, conditioned on an understanding of its semantics. This paper focuses on one of its sub-tasks, Point-wise Question Generation (PQG), and proposes a seq2seq PQG model with a token-focused attention mechanism. Here, a token denotes a potential answer at the sentence level, which usually appears as a series of consecutive terms in the declarative sentence. In the encoder, the position information of the token is integrated with the semantic information of the whole sentence; in the decoder, attention to the token is strengthened. Experiments on the SQuAD corpus show an improvement of 1.98% in BLEU-4.
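
As an illustration of what position information about the answer token could look like before it is fed to an encoder, the following is a minimal sketch only. The abstract does not say how positions are encoded, so the BIO tags, the signed distance, and the function name `position_features` are all assumptions made for this example.

```python
def position_features(words, span_start, span_end):
    """Return one (tag, distance) pair per word of the sentence.

    Assumed convention (not from the paper): span_start and span_end are
    inclusive word indices of the answer span; tag is B/I inside the span
    and O outside; distance is the signed offset to the nearest span edge.
    """
    if not (0 <= span_start <= span_end < len(words)):
        raise ValueError("answer span must lie inside the sentence")
    features = []
    for i in range(len(words)):
        if i < span_start:
            features.append(("O", i - span_start))  # words left of the span
        elif i > span_end:
            features.append(("O", i - span_end))    # words right of the span
        else:
            features.append(("B" if i == span_start else "I", 0))
    return features


words = "Tesla was born in 1856 in Smiljan".split()
features = position_features(words, 4, 4)  # answer span: "1856"
```

In a seq2seq model, each tag or distance would typically be mapped to an embedding and concatenated with the word embedding, which is one plausible way to combine position and sentence semantics in the encoder.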

Automatic Sentence Completion Based on Deep Learning

CHEN Zhigang, HUA Lei, LIU Quan, YIN Kun, WEI Si, HU Guoping

2019, 33(8): 101-110.

This paper proposes an automatic sentence completion method that combines dependency parsing with deep neural networks. First, a sequence modeling method based on syntactic information expansion is proposed, which exploits syntactic information while preserving efficiency. On this basis, the learning-to-rank idea is used to train a candidate answer ranking model. Second, to address the lack of detail in whole-sequence modeling, an automatic sentence completion model based on multi-state information fusion of a language model is proposed. Finally, a multi-source information fusion model combining sentence representation, dependency syntax, and multi-state information is designed. The paper also constructs an English sentence completion dataset. Experimental results on this dataset show that the dependency syntax expansion model achieves an absolute improvement of 11% over the baseline sequence modeling methods, the language-model-based state ranking technique achieves an absolute improvement of 9.3% over the baseline model, and the final multi-source information fusion model achieves the top accuracy of 76.9% on the test set.

A Question Answering System for Primary Liver Cancer Based on Knowledge Graph

CAO Mingyu, LI Qingqing, YANG Zhihao, WANG Lei, ZHANG Yin, LIN Hongfei, WANG Jian

2019, 33(6): 88-93.

Question answering (QA) systems based on a medical knowledge base (KB) are significant for both research and application. Targeting primary liver cancer, which is common in adults, this paper extracts related knowledge triples from medical guides and SemMedDB to construct a KB of primary liver cancer. On this basis, a pipeline QA system is implemented. First, the system identifies the entity in the question. Then, a sentence embedding is generated by combining TF-IDF with word embeddings and is used to select the most similar question template. Finally, the system retrieves the answer from the KB according to the semantics of the template and the entity in the question. The results show that the system can effectively answer questions about drugs, diseases and symptoms related to primary liver cancer.
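
The template-selection step can be sketched as a TF-IDF-weighted average of word vectors followed by a cosine-similarity comparison. This is a minimal sketch under stated assumptions: the abstract does not give the exact weighting scheme, so the smoothed IDF formula, the toy 3-dimensional embeddings in `EMB`, the `<disease>` placeholder, and all function names are hypothetical.

```python
import math
from collections import Counter


def idf_table(tokenized_texts):
    """Smoothed inverse document frequency for every word in the texts."""
    n = len(tokenized_texts)
    doc_freq = Counter(w for text in tokenized_texts for w in set(text))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in doc_freq.items()}


def sentence_vector(tokens, idf, embeddings, dim):
    """TF-IDF-weighted average of the word vectors of a tokenized sentence."""
    vec = [0.0] * dim
    total = 0.0
    for word, count in Counter(tokens).items():
        if word not in embeddings:
            continue  # skip out-of-vocabulary words
        weight = (count / len(tokens)) * idf.get(word, 1.0)
        total += weight
        for k in range(dim):
            vec[k] += weight * embeddings[word][k]
    return [v / total for v in vec] if total > 0 else vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def best_template(question, templates, embeddings, dim):
    """Index of the template whose sentence vector is closest to the question."""
    idf = idf_table(templates)
    q_vec = sentence_vector(question, idf, embeddings, dim)
    scores = [cosine(q_vec, sentence_vector(t, idf, embeddings, dim))
              for t in templates]
    return max(range(len(templates)), key=scores.__getitem__)


# Toy data: the entity is assumed to be already replaced by a placeholder.
EMB = {
    "what": [0.0, 0.0, 1.0], "which": [0.0, 0.0, 1.0], "<disease>": [0.0, 0.0, 1.0],
    "drug": [1.0, 0.0, 0.0], "treats": [0.9, 0.1, 0.0],
    "symptom": [0.0, 1.0, 0.0], "has": [0.1, 0.9, 0.0],
}
TEMPLATES = [
    ["what", "drug", "treats", "<disease>"],
    ["what", "symptom", "has", "<disease>"],
]
drug_idx = best_template(["which", "drug", "treats", "<disease>"], TEMPLATES, EMB, 3)
symptom_idx = best_template(["which", "symptom", "has", "<disease>"], TEMPLATES, EMB, 3)
```

The selected template index would then determine which relation to query in the KB for the recognized entity; that lookup is not shown here.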

A Chinese Conversation Model Using Pinyin for Dimension Reduction

WU Bangyu, ZHOU Yue, ZHAO Qunfei, ZHANG Pengzhu

2019, 33(5): 113-121.

Conversation modeling is an important research field in natural language processing with wide applications. However, training a Chinese conversation model suffers from excessively high model complexity because of the large vocabulary. To address this issue, this paper proposes converting the Chinese input into Pinyin and dividing it into three parts (initials, finals and tones), thereby reducing the vocabulary size. The Pinyin information is then combined into an image-like form using an embedding method, and Pinyin features are extracted through a Fully Convolutional Network (FCN) and a bi-directional Long Short-Term Memory (LSTM) network. Finally, a 4-layer Gated Recurrent Unit (GRU) network decodes the Pinyin features, addressing the long-term memory problem, and produces the output of the conversation model. In addition, an attention mechanism is added in the decoding stage so that the output corresponds better to the input. For the experiments, a conversation dataset in the medical field is built, and the model is evaluated on it using BLEU and ROUGE_L as metrics.
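
The decomposition of a Pinyin syllable into initial, final and tone can be sketched as follows. This is an illustrative sketch, not the paper's preprocessing: it assumes the input is already numbered Pinyin such as "zhong1" (character-to-Pinyin conversion is not shown), treats "y" and "w" as initials, and uses 5 for the neutral tone. The names `INITIALS` and `split_pinyin` are hypothetical.

```python
# Two-letter initials come first so that "zh" is matched before "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_pinyin(syllable):
    """Split numbered Pinyin (e.g. "zhong1") into (initial, final, tone)."""
    tone = 5  # assumed code for the neutral tone when no digit is given
    if syllable and syllable[-1].isdigit():
        tone = int(syllable[-1])
        syllable = syllable[:-1]
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):], tone
    return "", syllable, tone  # syllables such as "an" have no initial


parts = [split_pinyin(s) for s in ["zhong1", "wen2", "an4", "ma"]]
```

Each of the three components then draws from a small closed set of symbols, which is the source of the vocabulary reduction compared with modeling Chinese words directly.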

Questions Intent Classification Based on Dual Channel Convolutional Neural Network

YANG Zhiming, WANG Laiqi, WANG Yong

2019, 33(5): 122-131.

Human-machine conversation technology has received extensive attention from academia and industry in recent years. Classifying the intent of user questions is a key issue that directly affects the quality of human-machine dialogue. In this paper, we propose an Intent Classification Dual-channel Convolutional Neural Network (ICDCNN). First, semantic features are extracted by training word vectors with Word2vec and an Embedding layer. Second, two separate channels are used for convolution, one for character-level vectors and the other for word-level vectors. Third, the fine-grained character-level vectors are combined with the word-level vectors to mine deeper semantic information from natural language questions. Finally, convolution kernels of different sizes are used to learn deeper abstract features within the questions. Experimental results show that the algorithm achieves high accuracy on Chinese datasets and has certain advantages over other methods.

Attribute Classification for Question-Answer Texts

JIANG Mingqi, SHEN Chenlin, LI Shoushan

2019, 33(4): 120-126.

Attribute classification, an essential part of aspect-based sentiment classification, aims to automatically classify the attribute category of a text. In contrast to existing studies on attribute classification in news and review texts, this paper focuses on question-answer (QA) text pairs and proposes a novel approach called multi-dimension textual representation. First, we segment the question text of a QA text pair into sentences. Then, we use LSTM models to encode each sentence in the question text and the whole answer text. Finally, we use a CNN layer to extract important information from all sentences of the question text and the whole answer text. Experiments demonstrate the effectiveness of the proposed approach.

Non-native Mispronunciation Verification Using Acoustic Tonal Phone Embedding and Siamese Networks

WANG Zhenyu, XIE Yanlu, ZHANG Jinsong

2019, 33(4): 127-134.

With the continuous development of automatic speech recognition, verifying and evaluating the pronunciation errors of second language (L2) learners has become one of the most important research topics in computer-assisted pronunciation training. To deal with the lack of labeled mispronunciation speech data, this paper proposes a method based on acoustic phone embeddings and a Siamese network. A pair of acoustic phone segments with a pair-wise label is used as the system input, and speech features are mapped to high-level representations through a neural network to differentiate types of phones. The Siamese network is optimized by determining whether two output embeddings come from the same type of phone. Results show that the Siamese network based on the cosine hinge loss function achieves the best accuracy of 89.93%, and the diagnostic accuracy is 89.19% in the pronunciation error verification task.
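
To make the pair-wise training objective concrete, the following sketch computes a cosine-based hinge loss for one pair of embeddings. The abstract does not define its cosine hinge loss, so this uses a commonly seen formulation (the same shape as PyTorch's CosineEmbeddingLoss) as an assumption: same-phone pairs are pulled toward cosine similarity 1, and different-phone pairs are penalized only when their similarity exceeds a margin. The function names and the default margin of 0.5 are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def cosine_hinge_loss(emb_a, emb_b, same_phone, margin=0.5):
    """Loss for one pair of phone embeddings from the two Siamese branches.

    Assumed form: 1 - cos for same-phone pairs, and
    max(0, cos - margin) for different-phone pairs.
    """
    cos = cosine_similarity(emb_a, emb_b)
    if same_phone:
        return 1.0 - cos
    return max(0.0, cos - margin)


loss_same = cosine_hinge_loss([1.0, 0.0], [1.0, 0.0], same_phone=True)
loss_diff_far = cosine_hinge_loss([1.0, 0.0], [0.0, 1.0], same_phone=False)
loss_diff_close = cosine_hinge_loss([1.0, 0.0], [1.0, 0.0], same_phone=False)
```

In training, this per-pair loss would be averaged over a batch and minimized with respect to the shared network weights; at test time the cosine similarity between a learner's phone and a reference phone can be thresholded to flag a mispronunciation.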

Integrating Question Understanding in Neural Networks to Answer the Description Problems in Reading Comprehension

TAN Hongye, LIU Bei, WANG Yuanlong

2019, 33(3): 102-109.

This paper explores solutions to description problems in reading comprehension using the QU-NNs model, whose framework consists of an Embedding layer, an Encoding layer, an Interaction layer, a Prediction layer, and an answer Post-processing layer. To deal with the high degree of semantic generalization of the questions, we integrate three question features (question type, question topic, and question focus) into the Encoding layer and the Interaction layer of the model to better understand the question. Specifically, the question type is identified by a convolutional neural network, and the question topic and question focus are obtained through syntactic analysis. Further, a heuristic method is designed to identify noise and redundant information in the answer. Experiments show that adding question features and removing redundant information increases performance by 2% to 10%.
