A Study of the Psychological and Personality Characteristics of Informal Words Based on Sparse Principal Component Analysis

ZHONG Yu; FEI Dingzhou

Journal of Chinese Information Processing ›› 2017, Vol. 31 ›› Issue (1): 192-204.
Language Analysis and Computation

Judging Personality by Informal Words: a Sparse PCA Approach

  • ZHONG Yu, FEI Dingzhou

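Abstract (translated from the Chinese)

Analyses of informal text from social media often produce sparse data matrices. To address this, we apply text analysis tools and then use sparse principal component analysis (SPCA), a feature dimension-reduction method, to analyze the cognitive-pragmatic characteristics of informal words in real-world chat text and to describe the relationship between informal words and personality. A short-text topic model, a psychological distance questionnaire, and the Big Five personality questionnaire were used to measure personality and background variables. Computerized text analysis tools were used to count word frequencies in the instant-messaging chat texts provided by participants. The Simplified Chinese version of the Linguistic Inquiry and Word Count (LIWC) dictionary, together with cognitive pragmatics, was used to characterize the informal-word dimensions obtained from SPCA. For the dimension reduction of informal words, SPCA was more stable than principal component analysis (PCA) in the number of factor loadings and also achieved a somewhat higher cumulative explained variance (24.54% > 23.40%). Among the six factors obtained, "subjective evaluation" was positively correlated with agreeableness (r = .16, p = .03 < .05), "casual socializing" was negatively correlated with agreeableness (r = -.16, p = .03 < .05), and "cognitive pleasure" was significantly positively correlated with gender (r = .43, p < .001). SPCA therefore reduces the dimensionality of informal words relatively well. Moreover, comparing the informal-word dimension of the Simplified Chinese LIWC dictionary with the informal-word dimensions obtained after reduction shows that the two agree in their correlations with personality, and that the latter reveal more information.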

Abstract

This paper presents a method for identifying personality from informal words, using sparse principal component analysis (SPCA) for dimension reduction. Based on the categories of the Linguistic Inquiry and Word Count (LIWC) dictionary, it analyzes informal word usage and psychological traits in instant chat and describes the relation between informal words and personality. The Biterm Topic Model (BTM), a psychological distance questionnaire, and the Big Five personality questionnaire are used to measure personality and related variables. The informal-word dimensions are interpreted using the Simplified Chinese version of the LIWC dictionary and cognitive pragmatics. The number of factor loadings obtained by SPCA is more stable than that obtained by traditional principal component analysis (PCA), and the cumulative explained variance is higher (24.54% > 23.40%). Among the 6 dimensions, "subjective evaluation" was positively related to agreeableness (r = .16, p = .03 < .05), "casual socializing" was negatively related to agreeableness (r = -.16, p = .03 < .05), and "cognitive pleasure" was significantly positively related to gender (r = .43, p < .001). These results suggest that SPCA performs better than PCA for dimension reduction in this kind of study.
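To make the idea of sparse loadings concrete, the following is a minimal sketch in plain Python. It is not the authors' procedure and not the elastic-net formulation usually associated with SPCA. It swaps in a simpler technique, a soft-thresholded power iteration, to extract one leading loading vector from a small word-frequency matrix. The toy counts, the function names, and the penalty value `lam=0.5` are all assumptions made for illustration. The sketch shows only how a penalty drives small loadings to exactly zero; it does not reproduce the explained-variance figures reported in the abstract.

```python
import math

def center_columns(rows):
    """Subtract each column mean (rows = participants, columns = words)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def soft_threshold(vec, lam):
    """Shrink every entry toward zero by lam; entries within lam become zero."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in vec]

def leading_loading(cov, lam=0.0, iters=200):
    """Power iteration with an optional soft-threshold step.

    lam = 0 gives the ordinary leading principal component loading;
    lam > 0 gives a sparse approximation (an illustrative stand-in for SPCA).
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        w = soft_threshold(w, lam)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # penalty too strong: every loading was shrunk to zero
        v = [x / norm for x in w]
    return v

# Hypothetical counts of four informal words for six participants.
counts = [
    [4, 3, 0, 1],
    [0, 1, 1, 0],
    [5, 4, 0, 0],
    [1, 0, 0, 1],
    [3, 3, 1, 0],
    [0, 0, 0, 0],
]

cov = covariance(center_columns(counts))
dense = leading_loading(cov, lam=0.0)   # ordinary PCA loading vector
sparse = leading_loading(cov, lam=0.5)  # thresholded, sparse loading vector

print("dense loadings: ", [round(x, 3) for x in dense])
print("sparse loadings:", [round(x, 3) for x in sparse])
```

In this toy setting the two rarely used words receive small but nonzero weights in the dense vector and exactly zero weight in the sparse one, which is what makes a sparse factor easier to label in terms of a few words.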

Key words

text analysis / sparse principal component analysis / informal words

Cite this article

ZHONG Yu; FEI Dingzhou. Judging Personality by Informal Words: a Sparse PCA Approach. Journal of Chinese Information Processing. 2017, 31(1): 192-204
