Most Downloaded
  • Survey
    YUE Zengying, YE Xia, LIU Ruiheng
    Journal of Chinese Information Processing. 2021, 35(9): 15-29.
    Pre-training technology has moved to the center stage of natural language processing, especially with the emergence of ELMo, GPT, BERT, XLNet, T5, and GPT-3 in the last two years. In this paper, we analyze and classify the existing pre-training technologies from four aspects: language model, feature extractor, contextual representation, and word representation. We discuss the main issues and development trends of pre-training technologies in current natural language processing.
  • Survey
    WU Youzheng, LI Haoran, YAO Ting, HE Xiaodong
    Journal of Chinese Information Processing. 2022, 36(5): 1-20.
    Over the past decade, there has been a steady momentum of innovation and breakthroughs that convincingly push the limits of modeling a single modality, e.g., vision, speech and language. Going beyond the research progress made in single modalities, the rise of multimodal social networks, short video applications, video conferencing, live video streaming and digital humans strongly demands the development of multimodal intelligence and offers a fertile ground for multimodal analysis. This paper reviews recent multimodal applications that have attracted intensive attention in the field of natural language processing, and summarizes the mainstream multimodal fusion approaches from the perspectives of single-modal representation, multimodal fusion stage, fusion network, fusion of unaligned modalities, and fusion of missing modalities. In addition, this paper elaborates on the latest progress of vision-language pre-training.
  • Review
    Haiming Lu, Jinhui Xu, Zengxiang Lu, Yanda Li
    Journal of Chinese Information Processing. 1999, 13(3): 19-26.
    This paper introduces an SVD method for bilingual information filtering. It gives a uniform representation of bilingual documents, so that any algorithm used in monolingual information filtering can be easily applied to bilingual information filtering. Using this method, we can compress the document vectors and filter out noise. The method is applied to personal information filtering on the Internet through our WWW Bookmark Service: from a user's bookmarks we derive the user's preferences and recommend interesting bilingual documents, and the user's feedback is used to improve the quality of the filtering.
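    A minimal sketch of the general idea, not the authors' exact pipeline: concatenate the two language sides of each parallel document into one term-document matrix, compress it with truncated SVD, and rank documents against a profile built from the user's bookmarks. The toy data, the sklearn components, and the dimensionality are assumptions.
```python
# Minimal sketch: latent semantic space over bilingual documents via truncated SVD,
# then cosine similarity against a user profile built from bookmarked documents.
# Toy data and parameter choices are illustrative only.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Each "document" concatenates its Chinese and English sides so both vocabularies
# share one vector space (a simplification of the paper's uniform representation).
docs = [
    "machine translation 机器 翻译 statistical models",
    "information filtering 信息 过滤 user profile",
    "web bookmark service 书签 服务 recommendation",
]

tfidf = TfidfVectorizer()                 # term-document matrix
X = tfidf.fit_transform(docs)
svd = TruncatedSVD(n_components=2)        # compress the document vectors, filter noise
X_lsa = svd.fit_transform(X)

profile = X_lsa[[1, 2]].mean(axis=0, keepdims=True)   # user's bookmarked documents
scores = cosine_similarity(profile, X_lsa)[0]         # rank all documents for recommendation
print(scores.argsort()[::-1])
```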
  • Survey
    LIN Wangqun, WANG Miao, WANG Wei, WANG Chongnan, JIN Songchang
    Journal of Chinese Information Processing. 2020, 34(12): 9-16.
    A knowledge graph describes concepts, entities and their relationships in the form of a semantic network. In this paper, we formally describe the basic concepts and the hierarchical architecture of knowledge graphs. We then review the state-of-the-art technologies for information extraction, knowledge fusion, schema, and knowledge management. Finally, we probe into the application of knowledge graphs in the military field, revealing challenges and trends for future development.
  • Information Extraction and Text Mining
    BAO Zhenshan, SONG Bingyan, ZHANG Wenbo, SUN Chao
    Journal of Chinese Information Processing. 2022, 36(6): 90-100.
    Named entity recognition for traditional Chinese medicine books is a less addressed topic. Considering the difficulty and cost of annotating such professional text in classical Chinese, this paper proposes a method for identifying traditional Chinese medicine entities based on a combination of semi-supervised learning and rules. Under the framework of the conditional random fields model, supervised features such as lexical features and dictionary features are introduced together with unsupervised semantic features derived from word vectors. The optimal semi-supervised learning model is obtained by examining the performance of different feature combinations. Finally, the recognition results of the model are analyzed and a rule-based post-processing step is established based on the linguistic characteristics of ancient books. Experimental results show an F-score of 83.18%, which proves the validity of this method.
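    For illustration, a hypothetical sketch of how supervised features (characters, dictionary hits) and unsupervised features (word-vector cluster IDs) can be mixed in a CRF tagger, in the spirit of the semi-supervised setup described above. It assumes the sklearn-crfsuite package; the dictionary, cluster IDs and labels are toy placeholders, not the paper's resources.
```python
# Hypothetical sketch: CRF character tagging with lexical, dictionary, and
# word-vector-cluster features. Requires sklearn-crfsuite; data is toy.
import sklearn_crfsuite

TCM_DICT = {"人参", "黄连"}                                   # tiny herb dictionary (illustrative)
CLUSTER_ID = {"人": 3, "参": 3, "黄": 7, "连": 7, "治": 1}     # pretend cluster IDs of char vectors

def char_features(sent, i):
    c = sent[i]
    return {
        "char": c,                                            # lexical feature
        "prev": sent[i - 1] if i > 0 else "<BOS>",
        "next": sent[i + 1] if i < len(sent) - 1 else "<EOS>",
        "in_dict": any(w.startswith(c) for w in TCM_DICT),    # dictionary feature
        "cluster": str(CLUSTER_ID.get(c, -1)),                # unsupervised word-vector feature
    }

sents = ["人参治气虚"]
X = [[char_features(s, i) for i in range(len(s))] for s in sents]
y = [["B-HERB", "I-HERB", "O", "O", "O"]]                     # toy gold labels

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, y)
print(crf.predict(X))
```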
  • Survey
    WEI Zhongyu, FAN Zhihao, WANG Ruize, CHENG Yijing, ZHAO Wangrong, HUANG Xuanjing
    Journal of Chinese Information Processing. 2020, 34(7): 19-29.
    In recent years, increasing attention has been drawn to research related to cross-modality, especially vision and language. This survey focuses on the task of image captioning and summarizes the literature from four aspects: the overall architecture, key questions in cross-modality research, the evaluation of image captioning, and state-of-the-art approaches to image captioning. In conclusion, we suggest three directions for future research, i.e., cross-modality representation, automatic evaluation metrics and diverse text generation.
  • Sentiment Analysis and Social Computing
    CHENG Yan, ZHU Hai, XIANG Guoxiong, TANG Tianwei, ZHONG Linhui, WANG Guowei
    Journal of Chinese Information Processing. 2020, 34(4): 92-100.
    Text emotion classification is a well-studied task in the field of natural language processing. To deal with unbalanced data, which hurts classification performance, this paper proposes an emotion classification method combining CNN and the EWC algorithm. First, the method uses random under-sampling to obtain multiple balanced datasets for training. Then it feeds each balanced dataset to the CNN in sequence, introducing the EWC algorithm in the training process to overcome catastrophic forgetting in the CNN. Finally, the CNN model trained on the last dataset is treated as the final classification model. The experimental results show that the proposed method is superior to the ensemble learning framework based on under-sampling and multi-classification algorithms, and outperforms the multi-channel LSTM neural network by 1.9% and 2.1% in accuracy and G-mean, respectively.
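    A minimal PyTorch sketch of the EWC idea referenced above: when training on each new balanced subset, penalize changes to parameters that carried high Fisher information for previously seen subsets. The penalty form is the standard EWC regularizer; the tiny model, placeholder Fisher estimates and weighting are assumptions, not the paper's exact setup.
```python
# Sketch of the EWC regularizer used to counter catastrophic forgetting
# when training sequentially on multiple balanced subsets.
import torch
import torch.nn as nn

def ewc_penalty(model, old_params, fisher, lam=1000.0):
    """EWC regularizer: 0.5 * lambda * sum_i F_i * (theta_i - theta_i*)^2."""
    loss = 0.0
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

model = nn.Linear(10, 4)                                     # stand-in for the CNN classifier
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}   # placeholder Fisher estimates

# One step on a new balanced subset: task loss plus the EWC penalty.
x, y = torch.randn(8, 10), torch.randint(0, 4, (8,))
loss = nn.functional.cross_entropy(model(x), y) + ewc_penalty(model, old_params, fisher)
loss.backward()
print(loss.item())
```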
  • Language Analysis and Calculation
    XIONG Kai , DU Li, DING Xiao , LIU Ting, QIN Bing, FU Bo
    Journal of Chinese Information Processing. 2022, 36(12): 27-35.
    Although pre-trained language models have achieved high performance on a large number of natural language processing tasks, the knowledge contained in some pre-trained language models is difficult to exploit for more efficient textual inference. Focusing on using this wealth of knowledge to enhance pre-trained language models for textual inference, we propose a textual inference framework that integrates knowledge graphs and graph structure into the pre-trained language model. Experiments on two subtasks of textual inference indicate that our framework outperforms a series of baseline methods.
  • Language Resources Construction
    YAO Yuanlin, WANG Shuwei, XU Ruifeng, LIU Bin, GUI Lin, LU Qin, WANG Xiaolong
    Journal of Chinese Information Processing. 2014, 28(5): 83-91.
    Research on text emotion analysis has made substantial progress in recent years. However, emotion-annotated corpora are less developed, especially for micro-blog text. To support the analysis of emotion expression in Chinese micro-blog text and the evaluation of emotion classification algorithms, an emotion-annotated corpus of Chinese micro-blog text is designed and constructed. Based on the observation and analysis of emotion expression in micro-blog text, a set of emotion annotation specifications is developed. Following this specification, emotion annotation is first performed at the micro-blog level, recording whether the micro-blog text expresses emotion and, if so, the corresponding emotion categories. Next, sentence-level annotation is conducted, marking whether each sentence expresses emotion, its emotion categories, and the strength of each emotion category. Currently, this emotion-annotated corpus consists of 14,000 micro-blogs, totaling 45,431 sentences. The corpus was used as the standard resource in the NLP&CC 2013 Chinese micro-blog emotion analysis evaluation, facilitating research on emotion analysis to a great extent.
  • Survey
    SUN Yi, QIU Hangping, ZHENG Yu, ZHANG Chaoran, HAO Chao
    Journal of Chinese Information Processing. 2021, 35(7): 10-29.
    Introducing knowledge into data-driven artificial intelligence models is an important way to realize human-machine hybrid intelligence. The current pre-trained language models represented by BERT have achieved remarkable success in the field of natural language processing. However, pre-trained language models are trained on large-scale unstructured corpora, and external knowledge needs to be introduced to alleviate, to some extent, their defects in determinacy and interpretability. In this paper, the characteristics and limitations of two kinds of pre-trained language models, pre-trained word embeddings and pre-trained context encoders, are analyzed, and the related concepts of knowledge enhancement are explained. Four types of knowledge enhancement methods for pre-trained word embeddings are summarized and analyzed: retrofitting of pre-trained word embeddings, hierarchization of the encoding and decoding process, attention mechanism optimization, and the introduction of knowledge memory. The knowledge enhancement methods for pre-trained context encoders are described from two perspectives: 1) task-specific versus task-agnostic; and 2) explicit knowledge versus implicit knowledge. Through this summary and analysis of knowledge enhancement methods for pre-trained language models, basic patterns and algorithms are provided for human-machine hybrid artificial intelligence.
  • Survey
    DENG Yiyi, WU Changxing, WEI Yongfeng, WAN Zhongbao, HUANG Zhaohua
    Journal of Chinese Information Processing. 2021, 35(9): 30-45.
    Named entity recognition (NER), as one of the basic tasks in natural language processing, aims to identify the required entities and their types in unstructured text. In recent years, various named entity recognition methods based on deep learning have achieved much better performance than that of traditional methods based on manual features. This paper summarizes recent named entity recognition methods from the following three aspects: 1) A general framework is introduced, which consists of an input layer, an encoding layer and a decoding layer. 2) After analyzing the characteristics of Chinese named entity recognition, this paper introduces Chinese NER models which incorporate both character-level and word-level information. 3) The methods for low-resource named entity recognition are described, including cross-lingual transfer methods, cross-domain transfer methods, cross-task transfer methods, and methods incorporating automatically labeled data. Finally, the conclusions and possible research directions are given.
  • Review
    XU Jun, DING Yu-xin, WANG Xiao-long
    Journal of Chinese Information Processing. 2007, 21(6): 95-100.
    In this paper, we study how to apply machine learning techniques to sentiment classification, whose main task is to determine whether news articles or reviews are negative or positive. Naive Bayes and Maximum Entropy classifiers are used for the sentiment classification of Chinese news and reviews. The experimental results show that the methods we employed perform well, with classification accuracy reaching about 90%. Moreover, we find that selecting words with polarity as features, negation tagging, and representing test documents as feature-presence vectors can improve the performance of sentiment classification. We conclude that sentiment classification remains a challenging problem.
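    An illustrative sketch, not the paper's code, of the feature choices reported above: binary feature-presence vectors and a simple negation-tagging step feeding a Naive Bayes classifier. The tokenization, negator list and toy data are assumptions.
```python
# Sketch: presence features + negation tagging + Naive Bayes for polarity classification.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

def tag_negation(tokens, negators={"not", "没有", "不"}):
    # Prefix the token that follows a negator with "NOT_", a simple negation-tagging scheme.
    out, negate = [], False
    for t in tokens:
        out.append("NOT_" + t if negate else t)
        negate = t in negators
    return " ".join(out)

train = [tag_negation(s.split()) for s in ["service good", "service not good"]]
labels = ["pos", "neg"]

vec = CountVectorizer(binary=True)        # feature presence rather than frequency
clf = BernoulliNB().fit(vec.fit_transform(train), labels)
print(clf.predict(vec.transform([tag_negation("not good".split())])))
```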
  • Knowledge Representation and Acquisition
    LIU Peng, YE Shuai, SHU Ya, LU Xiaolong, LIU Mingming
    Journal of Chinese Information Processing. 2020, 34(11): 49-59.
    Coal mining enterprises are developing beyond information construction into the intelligence era, driven by new technologies such as big data and artificial intelligence. In this paper, the knowledge graph is introduced into the domain of coal mine safety. Domain knowledge concepts are first classified, stored in a graph database, and visually presented with their concept relations. Then, to facilitate query search over this knowledge graph, a question classification approach is implemented to identify the best query types for a specific question. The experimental results show that the proposed entity extraction method achieves higher recall and precision, and the Spark-based parallel question classification algorithm significantly improves efficiency while maintaining accuracy.
  • Survey
    FENG Yang, SHAO Chenze
    Journal of Chinese Information Processing. 2020, 34(7): 1-18.
    Machine translation is the task of translating a source language into a target language of equivalent meaning via computer, and it has become an important research direction in the field of natural language processing. Neural machine translation models, the mainstream in the research community, can perform end-to-end translation from source language to target language. In this paper, we select several main research directions of neural machine translation, including model training, simultaneous translation, multi-modal translation, non-autoregressive translation, document-level translation, domain adaptation, and multilingual translation, and briefly introduce the research progress in these directions.
  • Survey
    DU Xiaohu, WU Hongming, YI Zibo, LI Shasha, MA Jun, YU Jie
    Journal of Chinese Information Processing. 2021, 35(8): 1-15.
    Adversarial attack and defense has been a popular research issue in recent years. Attackers use small modifications to generate adversarial examples that cause prediction errors in deep neural networks. The generated adversarial examples can reveal the vulnerability of a neural network, which can then be repaired to improve the security and robustness of the model. This paper gives a detailed and comprehensive introduction to the current mainstream adversarial text attack and defense methods, as well as the datasets and target neural networks of the mainstream attacks. We also compare the differences between the attack methods. Finally, the challenges of adversarial text examples and prospects for future research are summarized.
  • Survey
    CEN Keting, SHEN Huawei, CAO Qi, CHENG Xueqi
    Journal of Chinese Information Processing. 2023, 37(5): 1-21.
    As a self-supervised deep learning paradigm, contrastive learning has achieved remarkable results in computer vision and natural language processing. Inspired by the success of contrastive learning in these fields, researchers have tried to extend it to graph data and promoted the development of graph contrastive learning. To provide a comprehensive overview of graph contrastive learning, this paper summarizes recent works under a unified framework to highlight the development trends. It also catalogues the popular datasets and evaluation metrics for graph contrastive learning, and concludes with the possible future direction of the field.
  • Survey
    ZHU Zhangli, RAO Yuan, WU Yuan, QI Jiangnan, ZHANG Yu
    Journal of Chinese Information Processing. 2019, 33(6): 1-11.
    The attention mechanism has gradually become one of the popular methods and research topics in deep learning. By improving the source language representation, it dynamically selects the relevant source-language information during decoding, which greatly alleviates the limitations of the classic Encoder-Decoder framework. Starting from the issues in the conventional Encoder-Decoder framework, such as limited long-term memory, interrelationships in sequence transformation, and the output quality of dynamic model structures, this paper describes the attention mechanism from various aspects, including its definition, principle, classification, state-of-the-art research, and its applications in image recognition, speech recognition, and natural language processing. It further discusses the multi-modal attention mechanism, evaluation of attention, interpretability of models, and the integration of attention with new models, providing new research issues and directions for the development of the attention mechanism in deep learning.
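    As a concrete example of the mechanism this survey covers, here is a short sketch of scaled dot-product attention, one common instantiation in which a decoder query selects a weighted mixture of encoder states. The shapes and random inputs are illustrative only.
```python
# Sketch: scaled dot-product attention between a decoder query and encoder states.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q: (1, d); k, v: (src_len, d) -> context: (1, d), weights: (1, src_len)
    scores = q @ k.transpose(0, 1) / (k.size(-1) ** 0.5)
    weights = F.softmax(scores, dim=-1)
    return weights @ v, weights

q = torch.randn(1, 8)        # current decoder state (query)
k = v = torch.randn(5, 8)    # encoder states (keys and values)
context, weights = scaled_dot_product_attention(q, k, v)
print(context.shape, weights.shape)
```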
  • CUI Huan, CAI Dong-feng, MIAO Xue-lei
    Journal of Chinese Information Processing. 2004, 18(3): 25-32.
    A question answering system can give users a precise answer to a question presented in natural language. Currently, most question answering systems use a large-scale corpus as the knowledge base from which to extract answers. However, the abundant resources of the web provide another ideal knowledge source for question answering. Research has shown that using web resources as the information source of a question answering system yields good performance for simple, factoid questions. This paper presents an answer extraction method based on computing the sentence similarity between the question and candidate answer sentences, and describes a web-based Chinese QA system we developed. The system only utilizes the "text snippets" returned by a web search engine as the data resource for answer extraction. The experimental results indicate that the system obtains relatively good results for questions of the types PERSON, TIME and NUMBER; the MRR over all questions is 0.51.
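    A minimal sketch of the core step described above: ranking candidate answer sentences taken from search-engine snippets by their similarity to the question. TF-IDF cosine similarity stands in for the paper's own similarity computation, and the pre-segmented toy sentences are assumptions.
```python
# Sketch: rank snippet sentences by TF-IDF cosine similarity to the question.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

question = "中国 第一 位 航天员 是 谁"
candidates = [
    "杨利伟 是 中国 第一 位 进入 太空 的 航天员",
    "神舟五号 于 2003 年 发射",
]

vec = TfidfVectorizer().fit([question] + candidates)
q_vec = vec.transform([question])
c_vec = vec.transform(candidates)
scores = cosine_similarity(q_vec, c_vec)[0]
print(candidates[scores.argmax()])        # best-matching candidate answer sentence
```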
  • Survey
    BAO Chenlong, LYU Mingyang, TANG Jintao, LI Shasha, WANG Ting
    Journal of Chinese Information Processing. 2023, 37(7): 1-12.
    In recent years, prompt learning methods have attracted growing attention from researchers as a way to exploit pre-trained language models. To optimize the performance of prompt learning, researchers have explored knowledge-based template engineering and answer engineering. In this paper, research on prompt learning combined with knowledge is reviewed systematically, with a focus on prompt learning methods for knowledge extraction. We also discuss the constraints of these methods and possible future developments.
  • Survey
    ZHANG Xuan, LI Baobin
    Journal of Chinese Information Processing. 2022, 36(12): 1-15.
    Social bots in microblog platforms significantly impact information dissemination and public opinion stance. This paper reviews recent research on social bot detection in microblogs, especially Twitter and Weibo. The popular methods for data acquisition and feature extraction are reviewed, and various bot detection algorithms are summarized and evaluated, including approaches based on statistical methods, classical machine learning, and deep learning. Finally, some suggestions for future research are offered.
  • Sentiment Analysis and Social Computing
    ZHANG Yawei, WU Liangqing, WANG Jingjing, LI Shoushan
    Journal of Chinese Information Processing. 2022, 36(5): 145-152.
    Sentiment analysis is a popular research issue in the field of natural language processing, and multimodal sentiment analysis is a current challenge in this task. Existing studies fall short in capturing context information and in combining the information streams of different modalities. This paper proposes a novel Multi-LSTMs Fusion Network (MLFN), which performs deep fusion between the three modalities of text, voice and image via an intra-modal feature extraction layer for single modalities and inter-modal fusion layers for dual and triple modalities. This hierarchical LSTM framework takes into account the features within each modality while capturing the interactions between modalities. Experimental results show that the proposed method can better integrate multimodal information and significantly improve the accuracy of multimodal emotion recognition.
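    A rough PyTorch sketch of the hierarchical idea described above: one LSTM per modality, followed by another LSTM that fuses the unimodal summaries. The layer sizes, fusion order and classifier head are assumptions rather than the paper's exact MLFN architecture.
```python
# Sketch: per-modality LSTMs whose summaries are fused by a second-level LSTM.
import torch
import torch.nn as nn

class MultiLSTMFusion(nn.Module):
    def __init__(self, d_text=300, d_audio=74, d_visual=35, d_hid=64, n_classes=2):
        super().__init__()
        self.text_lstm = nn.LSTM(d_text, d_hid, batch_first=True)
        self.audio_lstm = nn.LSTM(d_audio, d_hid, batch_first=True)
        self.visual_lstm = nn.LSTM(d_visual, d_hid, batch_first=True)
        self.fusion_lstm = nn.LSTM(d_hid, d_hid, batch_first=True)
        self.classifier = nn.Linear(d_hid, n_classes)

    def forward(self, text, audio, visual):
        # Last hidden state of each unimodal LSTM serves as that modality's summary.
        _, (h_t, _) = self.text_lstm(text)
        _, (h_a, _) = self.audio_lstm(audio)
        _, (h_v, _) = self.visual_lstm(visual)
        # Treat the three summaries as a length-3 sequence and fuse them with another LSTM.
        seq = torch.stack([h_t[-1], h_a[-1], h_v[-1]], dim=1)   # (batch, 3, d_hid)
        _, (h_f, _) = self.fusion_lstm(seq)
        return self.classifier(h_f[-1])

model = MultiLSTMFusion()
out = model(torch.randn(4, 20, 300), torch.randn(4, 20, 74), torch.randn(4, 20, 35))
print(out.shape)   # (4, 2)
```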
  • Review
    John Hutchins
    Journal of Chinese Information Processing. 1999, 13(6): 2-14.
    This survey of the present demand for and use of computer-based translation software concentrates on systems designed for the production of translations of publishable quality, including developments in controlled language systems, translator workstations, and localisation; but it also covers the development of software for non-translators, in particular for use with Web pages and other Internet applications, and it looks at future needs and systems under development. The final section compares the types of translation needs that can be met most appropriately by human and by machine (and computer-aided) translation respectively.
  • Information Extraction and Text Mining
    GAN Zifa, ZAN Hongying, GUAN Tongfeng, LI Wenxin, ZHANG Huan, ZHU Tiantian, SUI Zhifang, CHEN Qingcai
    Journal of Chinese Information Processing. 2022, 36(6): 101-108.
    The 6th China conference on Health Information Processing (CHIP 2020) organized six shared tasks in Chinese medical information processing. The second task was entity and relation extraction, which automatically extracts triples consisting of entities and relations from Chinese medical texts. A total of 174 teams signed up for the task, and eventually 17 teams submitted 42 system runs. According to the micro-averaged F1, the key evaluation criterion of the task, the top submitted result reaches 0.6486.
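    For reference, a small sketch of how a micro-averaged F1 over predicted (head, relation, tail) triples is typically computed for this kind of task; the official evaluation script may differ in matching details, and the toy triples are invented.
```python
# Sketch: micro-averaged precision/recall/F1 over sets of extracted triples.
def micro_f1(gold_sets, pred_sets):
    tp = sum(len(g & p) for g, p in zip(gold_sets, pred_sets))
    n_gold = sum(len(g) for g in gold_sets)
    n_pred = sum(len(p) for p in pred_sets)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

gold = [{("糖尿病", "临床表现", "多饮"), ("糖尿病", "临床表现", "多尿")}]
pred = [{("糖尿病", "临床表现", "多饮"), ("糖尿病", "药物治疗", "二甲双胍")}]
print(round(micro_f1(gold, pred), 4))   # 0.5 on this toy example
```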
  • Survey
    Li Yunhan, Shi Yunmei, Li Ning, Tian Ying'ai
    Journal of Chinese Information Processing. 2022, 36(9): 1-18,27.
    Text correction, an important research field in Natural Language Processing (NLP), is of great application value in fields such as news, publishing, and text input. This paper provides a systematic overview of automatic error correction technology for Chinese texts. Errors in Chinese texts are divided into spelling errors, grammatical errors and semantic errors, and the correction methods for these three types are reviewed. Moreover, the datasets and evaluation methods for automatic error correction of Chinese texts are summarized. In the end, prospects for automatic error correction of Chinese texts are discussed.
  • Survey
    QIN Libo, LI Zhouyang, LOU Jieming, YU Qiying, CHE Wanxiang
    Journal of Chinese Information Processing. 2022, 36(1): 1-11,20.
    Natural Language Generation in task-oriented dialogue systems (ToDNLG) aims to generate natural language responses given the corresponding dialogue acts, and it has attracted increasing research interest. With the development of deep neural networks and pre-trained language models, great success has been witnessed in ToDNLG research. We present a comprehensive survey of the field, including: (1) a systematic review of the development of NLG in the past decade, covering traditional methods and deep learning-based methods; (2) new frontiers in emerging areas of complex ToDNLG and the corresponding challenges; and (3) rich open-source resources, including related papers, baseline code and leaderboards on a public website. We hope the survey can promote future research in ToDNLG.
  • Survey
    AN Zhenwei, LAI Yuxuan, FENG Yansong
    Journal of Chinese Information Processing. 2022, 36(8): 1-11.
    In recent years, legal artificial intelligence has attracted increasing attention for its efficiency and convenience. Among others, legal text is the most common manifestation in legal practice, thus, using natural language understanding method to automatically process legal text is an important direction for both academia and industry. In this paper, we provide a gentle survey to summarize recent advances on natural language understanding for legal texts. We first introduce the popular task setups, including legal information extraction, legal case retrieval, legal question answering, legal text summarization, and legal judgement prediction. We further discuss the main challenges from three perspectives: understanding the difference of languages between legal domain and open domain, understanding the rich argumentative texts in legal documents, and incorporating legal knowledge into existing natural language processing models.
  • Survey
    WU Lianwei, RAO Yuan, FAN Xiaobing, YANG Hao
    Journal of Chinese Information Processing. 2018, 32(2): 1-11,21.
    There are large amounts of rumors, extreme content and fake news on the Internet, which reduce information quality, damage the credible atmosphere of the Internet, and produce serious negative effects on the emergence and development of public opinion. To measure the credibility of information, the paper divides non-credible content into types such as extreme emergency information, network extreme information, network rumors, misinformation, disinformation, and spam. This information content is studied from the following aspects: concept, content feature description, credibility modeling and credibility evaluation, which provides a solid foundation for the credibility analysis and measurement of information content in social networks. Finally, we further analyze the directions of development in current research on information credibility.
  • Knowledge Representation and Acquisition
    HU Renfen, LI Shen, ZHU Yuchen
    Journal of Chinese Information Processing. 2021, 35(4): 8-15.
    Sentence segmentation of ancient Chinese texts is a difficult task even for experts, since it relies not only on the sentence meaning and contextual information but also on historical and cultural knowledge. This paper proposes to build knowledge representations of ancient Chinese with BERT, a deep language model, and then construct the sentence segmentation model with Conditional Random Fields and Convolutional Neural Networks. Our model achieves significant improvements on all three ancient text styles: 99%, 95% and 92% F1 scores for poems, lyrics and prose texts, respectively, outperforming Bi-GRU by 10% on lyrics and prose, which are more difficult to segment. In further case studies, the method also performs well on difficult cases from published ancient books.
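    A simplified sketch of casting sentence segmentation as per-character tagging over BERT representations. The publicly available bert-base-chinese checkpoint stands in for the ancient-Chinese model the paper builds, and the CRF/CNN decoding layers are replaced by a plain linear layer for brevity.
```python
# Sketch: per-character "break / no break" tagging on top of BERT hidden states.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")
tagger = nn.Linear(encoder.config.hidden_size, 2)     # 0 = no break, 1 = break after this character

text = "学而时习之不亦说乎"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state      # (1, seq_len, hidden), incl. special tokens
logits = tagger(hidden)                               # break scores per position
print(logits.argmax(-1))                              # would be trained against gold break labels
```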
  • The Key Technologies of Educational Cognition for Humanlike Intelligence
    WEI Si, GONG Jiefu, WANG Shijin, SONG Wei, SONG Ziyao
    Journal of Chinese Information Processing. 2022, 36(4): 111-123.
    Automated essay scoring is a significant and challenging research topic that has attracted the attention of scholars in the fields of artificial intelligence and education. Focusing on Chinese automated essay scoring, this paper proposes to exploit deep language analysis, including spelling and grammar error correction to analyze grammar-level writing ability, automatic rhetorical analysis and excellent-expression recognition to reflect language expression ability, and fine-grained quality analysis to evaluate overall essay quality. We then propose an adaptive hybrid scoring model combining linguistic features and deep neural networks. The experimental results on Chinese student essay datasets show that 1) incorporating deep language analysis features can effectively improve the performance of automated essay scoring; and 2) the grade- and topic-adaptive training strategy also improves the transfer and prediction abilities of the model.
  • Sentiment Analysis and Social Computing
    AN Minghui, WANG Jingjing, LIU Qiyuan, LI Linqin, ZHANG Daxin, LI Shoushan
    Journal of Chinese Information Processing. 2022, 36(1): 154-162.
    As a cross-domain research task, depression detection using multimodal information has recently received considerable attention from researchers in several communities, such as natural language processing, computer vision, and mental health analysis. These studies mainly utilize user-generated content on social media to perform depression detection. However, existing approaches have difficulty in modeling long-range dependencies (global information), so how to obtain global user information has become an urgent problem. In addition, considering that social media contains not only textual but also visual information, how to fuse global information across different modalities has become another urgent problem. To overcome these challenges, we propose a multimodal hierarchical dynamic routing approach for depression detection. We obtain global user information from the hierarchical structure and use a dynamic routing policy, which can adjust and refine messages, to fuse the text and image modalities for depression detection. Empirical results demonstrate the effectiveness of the proposed approach in capturing global user information and fusing multimodal information to improve the performance of depression detection.
  • Survey
    CUI Lei, XU Yiheng, LYU Tengchao, WEI Furu
    Journal of Chinese Information Processing. 2022, 36(6): 1-19.
    Document AI, or Document Intelligence, is a relatively new research topic that refers to techniques for automatically reading, understanding and analyzing business documents. It is an important interdisciplinary field involving natural language processing and computer vision. In recent years, the popularity of deep learning technology has greatly advanced Document AI tasks, such as document layout analysis, document information extraction, document visual question answering, and document image classification. This paper briefly introduces early heuristic rule-based document analysis, statistical machine learning based algorithms, and deep learning-based approaches, especially pre-training approaches. Finally, we also look into future directions of Document AI.
  • Survey
    BYAMBASUREN Odmaa, YANG Yunfei, SUI Zhifang, DAI Damai, CHANG Baobao, LI Sujian, ZAN Hongying
    Journal of Chinese Information Processing. 2019, 33(10): 1-7.
    The medical knowledge graph is the cornerstone of intelligent medical applications. Existing medical knowledge graphs fall short of the needs of intelligent medical applications in terms of scale, specification, taxonomy, formalization and the precise description of knowledge. We apply natural language processing and text mining techniques in a semi-automated approach to develop the Chinese Medical Knowledge Graph (CMeKG 1.0). The construction of CMeKG 1.0 refers to international medical coding systems such as ICD-10, ATC, and MeSH, as well as large-scale, multi-source heterogeneous clinical guidelines, medical standards, diagnostic protocols, and medical encyclopedia resources. CMeKG covers types such as diseases, drugs, and diagnosis/treatment technologies, with more than 1 million medical concept relationships. This paper presents the description system, key technologies, construction process and medical knowledge description of CMeKG 1.0, serving as a reference for the construction and application of knowledge graphs in the medical field.
  • Survey
    CAO Qi, SHEN Huawei, GAO Jinhua, CHENG Xueqi
    Journal of Chinese Information Processing. 2021, 35(2): 1-18,32.
    Popularity prediction over online social networks plays an important role in various applications, e.g., recommendation, advertising, and information retrieval. Recently, the rapid development of deep learning and the availability of information diffusion data have provided a solid foundation for deep learning based popularity prediction research. Existing surveys of popularity prediction mainly focus on traditional methods. To systematically summarize the deep learning based methods, this paper reviews existing deep learning based popularity prediction work, categorizes it into deep representation based and deep fusion based methods, and discusses future research directions.
  • Ethnic Language Processing and Cross Language Processing
    AN Bo, LONG Congjun
    Journal of Chinese Information Processing. 2022, 36(12): 85-93.
    Tibetan text classification is a fundamental task in Tibetan natural language processing. The current mainstream approach to text classification is a large-scale pre-trained model plus fine-tuning. However, Tibetan lacks an open-source large-scale corpus and pre-trained language model, so this approach cannot be verified on the Tibetan text classification task. To address this, this paper crawls a large Tibetan text dataset and trains a Tibetan pre-trained language model (BERT-base-Tibetan) on it. Experimental results show that the pre-trained language model significantly improves the performance of Tibetan text classification (F1 value increases by 9.3% on average), verifying the value of pre-trained language models for Tibetan text classification tasks.
  • Speech Recognition and Analysis
    ZHOU Xuewen, HU He
    Journal of Chinese Information Processing. 2014, 28(3): 123-128.
    This paper presents an automatic labeling/retrieval system for acoustic parameters. Using the system, phonetic analysts can dramatically reduce errors in labeling and retrieving acoustic parameters, improve working efficiency, ensure the repeatability and verifiability of phonetic data, and promote standardization in establishing acoustic parameter databases.
  • Knowledge Representation and Acquisition
    LIU Junnan, LIU Haiyan, CHEN Xiaohui, GUO Xuan, ZHU Xinming
    Journal of Chinese Information Processing. 2020, 34(11): 29-36.
    With the rapid development of 3S technology, geo-spatial data has grown explosively, and constructing knowledge graphs from geo-spatial data to realize the transformation from data to knowledge has become an urgent scientific problem. In general knowledge graphs, geo-spatial knowledge is represented only as attributes or semantic relationships, while spatial relationships are missing. This paper first designs a representation method for spatial relationships. It then proposes a technical roadmap for constructing knowledge graphs based on spatial relationships, focusing on spatial relationship extraction and multi-source geographic data fusion. We also discuss an application direction of knowledge graphs in the geo-spatial field: promoting the integration of geo-spatial data and semantic web technologies.
  • Sentiment Analysis and Social Computing
    WANG Jinghao, LIU Zhen, LIU Tingting , WANG Yuanyi, CHAI Yanjie
    Journal of Chinese Information Processing. 2022, 36(10): 145-154.
    Existing methods for sentiment analysis in social media usually deal with single-modal data, without capturing the relationships between multimodal information. This paper proposes to treat the hierarchical structural relations between texts and images in social media as complementary, and designs a multi-level feature fusion attention network to capture both the "image-to-text" and "text-to-image" relations to perceive the user's sentiments in social media. Experimental results on the Yelp and MultiZOL datasets show that this method can effectively improve the sentiment classification accuracy for multimodal data.
  • Sentiment Analysis and Social Computing
    YAN Shihong, MA Weizhi, ZHANG Min, LIU Yiqun, MA Shaoping
    Journal of Chinese Information Processing. 2021, 35(8): 107-116.
    Most existing recommendation methods based on deep reinforcement learning use a recurrent neural network (RNN) to learn users' short-term preferences, while ignoring their long-term preferences. This paper proposes a deep reinforcement learning recommendation method combining long-term and short-term user preferences (LSRL). LSRL uses collaborative filtering to learn users' long-term preference representations and applies a gated recurrent unit (GRU) to learn users' short-term preference representations. The redesigned Q-network framework combines the two types of representations, and a Deep Q-Network is used to predict users' feedback on items. Experimental results on MovieLens datasets show that the proposed method achieves significant improvements in NDCG and Hit Ratio compared to baseline methods.
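    A rough PyTorch sketch of a Q-network in the spirit of LSRL: a collaborative-filtering-style user embedding for long-term preference is concatenated with a GRU state over recently interacted items for short-term preference, and the combined state scores a candidate item. The dimensions and exact layout are assumptions, not the paper's architecture.
```python
# Sketch: Q(s, a) where the state combines long-term and short-term user preference.
import torch
import torch.nn as nn

class LSRLQNet(nn.Module):
    def __init__(self, n_users, n_items, d=32):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, d)       # long-term preference (CF-style)
        self.item_emb = nn.Embedding(n_items, d)
        self.gru = nn.GRU(d, d, batch_first=True)      # short-term preference over recent items
        self.q_head = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU(), nn.Linear(d, 1))

    def forward(self, user, recent_items, candidate):
        long_term = self.user_emb(user)                         # (batch, d)
        _, h = self.gru(self.item_emb(recent_items))            # h: (1, batch, d)
        state = torch.cat([long_term, h[-1]], dim=-1)           # combined user state
        return self.q_head(torch.cat([state, self.item_emb(candidate)], dim=-1))  # Q(s, a)

net = LSRLQNet(n_users=100, n_items=500)
q = net(torch.tensor([3]), torch.tensor([[5, 9, 42]]), torch.tensor([7]))
print(q.shape)   # (1, 1)
```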
  • Information Extraction and Text Mining
    DING Zeyuan, YANG Zhihao, LUO Ling, WANG Lei, ZHANG Yin, LIN Hongfei, WANG Jian
    Journal of Chinese Information Processing. 2021, 35(5): 70-76.
    In the field of biomedical text mining, biomedical named entity recognition and relation extraction are of great significance. This paper builds a Chinese biomedical entity relation extraction system based on deep learning. First, a Chinese biomedical entity relation corpus is constructed from publicly available English biomedical annotated corpora via translation and manual annotation. Then, ELMo (Embeddings from Language Models) representations trained on Chinese biomedical text are fed into a Bi-directional LSTM (BiLSTM) combined with conditional random fields (CRF) for Chinese entity recognition. Finally, the relations between entities are extracted using a BiLSTM combined with an attention mechanism. The experimental results show that the system can accurately extract biomedical entities and inter-entity relations from Chinese text.
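    A simplified PyTorch sketch of the relation-classification half described above: a BiLSTM over the sentence followed by attention pooling and a relation classifier. ELMo embeddings, entity position markers and the CRF-based entity recognizer are omitted, and all sizes are illustrative.
```python
# Sketch: BiLSTM + attention pooling for sentence-level relation classification.
import torch
import torch.nn as nn

class BiLSTMAttnRE(nn.Module):
    def __init__(self, vocab=5000, d_emb=100, d_hid=128, n_rel=10):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_emb)
        self.lstm = nn.LSTM(d_emb, d_hid, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * d_hid, 1)
        self.out = nn.Linear(2 * d_hid, n_rel)

    def forward(self, tokens):
        h, _ = self.lstm(self.emb(tokens))                    # (batch, seq, 2*d_hid)
        weights = torch.softmax(self.attn(h), dim=1)          # attention over positions
        sentence = (weights * h).sum(dim=1)                   # attention-pooled sentence vector
        return self.out(sentence)                             # relation scores

model = BiLSTMAttnRE()
print(model(torch.randint(0, 5000, (2, 30))).shape)   # (2, 10)
```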
  • Sentiment Analysis and Social Computing
    GE Xiaoyi, ZHANG Mingshu, WEI Bin, LIU Jia
    Journal of Chinese Information Processing. 2022, 36(9): 129-138.
    The identification of rumors is of substantial research value. Current deep learning-based solutions bring excellent results but fail to capture the relationship between emotion and semantics or to provide emotional explanations. This paper proposes a dual emotion-aware method for interpretable rumor detection, aiming to provide a reasonable explanation from an emotional point of view via co-attention weights. Compared with contrast models, the accuracy is increased by 3.9%, 3.3% and 4.4% on the public Twitter15, Twitter16, and Weibo20 datasets, respectively.