Abstract
This paper investigates the automatic generation of term DEFs under the KDML formalism of HowNet and proposes a generation method based on a tree-structured decoder. The encoder takes a technical term together with external information (the term's definition, the sememes of its sub-words, etc.) as input. On the decoder side, a sememe decoder and a role decoder are applied alternately, with an attention mechanism attending to the encoder-side representations. The output, produced as "sememe-role-sememe" triples, is assembled into the sememe tree of the term, yielding its DEF representation and thereby supporting the construction of a domain-specific HowNet. Experimental results show that the proposed method achieves F1 scores of 74.13% for first-sememe generation, 53.92% for overall sememe generation, 53.33% for role generation, and 30.48% for overall triple generation.
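The final assembly step described above, combining "sememe-role-sememe" triples into a nested DEF expression, can be sketched in plain Python. The triple layout, the example sememes, and the brace/colon syntax below are illustrative assumptions modeled on HowNet-style DEFs, not the paper's exact KDML output.

```python
from collections import defaultdict

def build_def(triples):
    """Assemble (head_sememe, role, child_sememe) triples into a
    nested DEF-style string such as {head:role={child},...}."""
    children = defaultdict(list)
    child_set = set()
    heads = []
    for head, role, child in triples:
        children[head].append((role, child))
        child_set.add(child)
        if head not in heads:
            heads.append(head)
    # The root is the first head sememe that never appears as a child.
    root = next(h for h in heads if h not in child_set)

    def render(node):
        body = ",".join(f"{role}={render(child)}"
                        for role, child in children.get(node, []))
        return "{" + node + (":" + body if body else "") + "}"

    return render(root)

triples = [
    ("aircraft|飞行器", "modifier", "military|军"),
    ("aircraft|飞行器", "domain", "military|军事"),
]
print(build_def(triples))
# {aircraft|飞行器:modifier={military|军},domain={military|军事}}
```

In the paper's setting the triples come from the alternating sememe and role decoders; here they are hand-written purely to show how a flat triple sequence determines one sememe tree.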
Key words
HowNet /
DEF generation /
tree-structured decoder
Funding
National Natural Science Foundation of China (U1908216); Key Research and Development Program of Liaoning Province (2019JH2/10100020)