Machine Reading Comprehension and Text Generation
ZHANG Jiashuo, HONG Yu, TANG Jian, CHENG Meng, YAO Jianmin
2019, 33(9): 96-106.
Current image captioning struggles with the veracity of captions: a caption that should name tangible, specific entities is instead generated as a crude, monotonous one (e.g., “Messi takes the penalty kick” vs. “a person is playing a ball”). Focusing on the identification and filling of person entities, this paper transforms the task into a cloze problem with a syntactic vacancy by removing the common person expression (e.g., “man”, “player”) from the generated image caption. To address this “who” question within a reading comprehension framework, the paper uses R-Net to extract the person name entity and fill the vacancy. In addition, we attempt to use both local and global information to extract the person name entity, where local information is the source document in which the image is located and global information is the related documents reached through external links. Experiments show that the proposed method effectively improves the quality of image caption generation and increases BLEU by 2.93%.
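
The cloze conversion described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the word list, the `@placeholder` token, and the function names `caption_to_cloze` and `fill_cloze` are assumptions made for illustration, and the step that actually selects the name (R-Net reading the local and global documents) is not modeled here; the name is simply passed in.

```python
import re

# Hypothetical list of generic person words; the abstract names only
# "man" and "player" as examples, the rest are assumptions.
GENERIC_PERSON_WORDS = ["man", "woman", "person", "player"]

# Assumed blank token for the syntactic vacancy (not taken from the paper).
PLACEHOLDER = "@placeholder"


def caption_to_cloze(caption, words=GENERIC_PERSON_WORDS, placeholder=PLACEHOLDER):
    """Replace the first generic person mention (and its article, if any)
    with a blank. Returns None when the caption has no such mention."""
    pattern = re.compile(
        r"\b(?:(?:a|an|the)\s+)?(?:" + "|".join(map(re.escape, words)) + r")\b",
        flags=re.IGNORECASE,
    )
    cloze, count = pattern.subn(placeholder, caption, count=1)
    return cloze if count else None


def fill_cloze(cloze, name, placeholder=PLACEHOLDER):
    """Fill the blank with a person name. In the paper the name is produced
    by a reading comprehension model; here it is supplied by the caller."""
    return cloze.replace(placeholder, name, 1)


if __name__ == "__main__":
    cloze = caption_to_cloze("a person is playing a ball.")
    print(cloze)
    print(fill_cloze(cloze, "Messi"))
```

In this sketch the cloze string plays the role of the question, and the source document and linked documents would play the role of the passages from which the reading comprehension model extracts the answer.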