Abstract

Reference specifications are texts describing points of professional knowledge. Text generation guided by reference specifications requires the generated text to be semantically relevant to the specification and to match its knowledge points, which is a difficult problem in natural language processing. Existing work mainly controls generic properties of the generated text, such as sentiment and attitude, and cannot meet the complex control requirements at the professional level. This paper therefore proposes a profession-oriented text generation model based on an adversarial architecture (PT-GAN), which uses several independent generators to produce texts at different degrees of knowledge-point matching. Each generator is an autoencoder, whose encoder extracts the knowledge-point semantic features of the reference specification and whose decoder generates the text. Two discriminators jointly guide generation on both linguistic norms and professional knowledge: a coherence discriminator that guides linguistic norms and a profession discriminator that controls professional-level attributes. Experiments on real datasets from several national professional qualification examinations show that the proposed model achieves clear improvements in coherence, semantic relevance to the reference specifications, and knowledge-point matching, better meeting the text-generation needs of this scenario.
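The training scheme described above can be sketched as a toy loop. Everything below is an illustrative assumption, not the paper's implementation: the scalar "texts", the `ToyGenerator`/`ToyDiscriminator` classes, and the finite-difference generator update are stand-ins for PT-GAN's autoencoder generators over token sequences and its adversarial updates. The sketch only shows the structure: one independent generator per knowledge-match level, and two discriminators (coherence and profession) whose combined score drives each generator.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy stand-in: a "text" is a single real number, and real texts at
# knowledge-match level k cluster around mean k (an assumption for
# illustration only).
K = 3  # number of independent generators, one per matching degree

class ToyGenerator:
    """Stands in for one autoencoder generator G_k (here just a scalar mean)."""
    def __init__(self):
        self.mu = random.uniform(-1.0, 1.0)

    def sample(self):
        return self.mu + random.gauss(0.0, 0.1)

class ToyDiscriminator:
    """Logistic scorer D(x) = sigmoid(w*x + b), trained real-vs-generated."""
    def __init__(self):
        self.w, self.b = 0.0, 0.0

    def score(self, x):
        return sigmoid(self.w * x + self.b)

    def update(self, x, label, lr=0.1):
        # One gradient step on the binary cross-entropy loss.
        err = self.score(x) - label
        self.w -= lr * err * x
        self.b -= lr * err

gens = [ToyGenerator() for _ in range(K)]
d_coherence = ToyDiscriminator()   # linguistic-norm signal
d_profession = ToyDiscriminator()  # professional-attribute signal

for step in range(500):
    for k, g in enumerate(gens):
        real = k + random.gauss(0.0, 0.1)  # "real" text at level k
        fake = g.sample()
        # Both discriminators learn to score real texts high, generated low.
        for d in (d_coherence, d_profession):
            d.update(real, 1.0)
            d.update(fake, 0.0)
        # Each generator is nudged toward a higher combined discriminator
        # score (a finite-difference stand-in for the adversarial update).
        reward = lambda x: d_coherence.score(x) + d_profession.score(x)
        eps = 0.01
        grad = (reward(g.mu + eps) - reward(g.mu - eps)) / (2 * eps)
        g.mu += 0.05 * grad
```

The key structural point the sketch preserves is that the generators are independent (one per matching degree) while the two discriminators are shared, so every generator receives both a coherence signal and a profession signal.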
Key words
text generation /
generative adversarial network /
auto-encoder /
professional text
Funding
National Key Research and Development Program of China (2018YFC0831401); Natural Science Foundation of Shandong Province (ZR2022LZH007, ZR2018ZB0420)