Question Recommendation Method Based on Multimodal Semantic Analysis

WANG Shijin1,2, WANG Chengcheng1, ZHANG Dan1, WEI Si1, WANG Yuan1

Journal of Chinese Information Processing, 2023, Vol. 37, Issue 5: 165-172
Section: Cross-modal Natural Language Processing

Abstract

In educational settings, resource recommendation is a key and fundamental task. Educational resources are markedly multi-source, heterogeneous, and multimodal, which poses great challenges for understanding and applying them. To address this, this paper proposes a question recommendation method based on multimodal semantic analysis. First, we extract features from multimodal educational resources and model the semantic associations between different modalities, building a unified representation framework for multimodal educational resources. We then pre-train the video and question features on tasks from the same domain to model their associated knowledge. Finally, we fine-tune the video-question association features on data collected online, yielding more robust representations for recommending questions relevant to multimodal lecture videos. Experiments on educational datasets show that the proposed method improves on existing approaches and has strong practical value.
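The recommendation step the abstract describes can be pictured as ranking candidate questions by the similarity of their embeddings to a video embedding. The sketch below is illustrative only, not the paper's implementation: the `cosine_scores` helper and the toy embeddings are hypothetical stand-ins for the pre-trained, fine-tuned video and question representations.

```python
import numpy as np

def cosine_scores(video_vec, question_mat):
    """Score candidate questions by cosine similarity to a video embedding."""
    v = video_vec / np.linalg.norm(video_vec)
    q = question_mat / np.linalg.norm(question_mat, axis=1, keepdims=True)
    return q @ v  # one cosine score per candidate question

# Toy example: one video embedding and three candidate question embeddings.
rng = np.random.default_rng(0)
video = rng.normal(size=8)
questions = np.stack([
    video + 0.1 * rng.normal(size=8),  # nearly aligned: a closely related question
    rng.normal(size=8),                # unrelated random direction
    -video,                            # opposite direction: cosine of -1
])
scores = cosine_scores(video, questions)
ranking = np.argsort(-scores)  # best-matching question first
```

In a real system the embeddings would come from the pre-trained multimodal encoders, and the top-ranked questions would be recommended alongside the lecture video.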

Key words

educational resources / multimodal / question recommendation

Cite this article

WANG Shijin, WANG Chengcheng, ZHANG Dan, WEI Si, WANG Yuan. Question Recommendation Method Based on Multimodal Semantic Analysis. Journal of Chinese Information Processing, 2023, 37(5): 165-172.

References

[1] CHUNG S W, KANG H G, CHUNG J S. Seeing voices and hearing voices: Learning discriminative embeddings using cross-modal self-supervision[C]// Proceedings of Interspeech, 2020: 3486-3490.
[2] YOON S, BYUN S, JUNG K. Multimodal speech emotion recognition using audio and text[C]// Proceedings of the IEEE Spoken Language Technology Workshop, 2018: 112-118.
[3] MITTAL T, GUHAN P, BHATTACHARYA U, et al. EmotiCon: Context-aware multimodal emotion recognition using Frege's principle[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 14222-14231.
[4] IASHIN V, RAHTU E. A better use of audio-visual cues: Dense video captioning with bi-modal transformer[C]// Proceedings of the 31st British Machine Vision Virtual Conference, 2020: 1-16.
[5] JIN Q, LIANG J. Video description generation using audio and visual cues[C]// Proceedings of the ACM International Conference on Multimedia Retrieval: 239-242.
[6] SRIVASTAVA N, SALAKHUTDINOV R. Multimodal learning with deep Boltzmann machines[J]. The Journal of Machine Learning Research, 2014, 15(1): 2949-2980.
[7] KIROS R, SALAKHUTDINOV R, ZEMEL R. Multimodal neural language models[C]// Proceedings of the 31st International Conference on Machine Learning, 2014: 595-603.
[8] HU Guoping, ZHANG Dan, SU Yu, et al. Knowledge point prediction for test questions: A convolutional neural network model enhanced with pedagogical knowledge[J]. Journal of Chinese Information Processing, 2018, 32(5): 137-146. (in Chinese)
[9] YIN Y, LIU Q, HUANG Z, et al. QuesNet: A unified representation for heterogeneous test questions[C]// Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019: 1328-1336.
[10] LIU Q, HUANG Z, HUANG Z, et al. Finding similar exercises in online education systems[C]// Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2018: 1821-1830.
[11] KIM Y. Convolutional neural networks for sentence classification[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2014: 1746-1751.
[12] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019: 4171-4186.
[13] LI C, YATES A, MACAVANEY S, et al. PARADE: Passage representation aggregation for document reranking[J]. arXiv preprint arXiv:2008.09093, 2020.

Funding

National Key Research and Development Program of China (2022YFC3303504)