Abstract: This paper presents a new method for identifying personality, with dimension reduction performed by sparse principal component analysis (SPCA). Based on the categories of the Linguistic Inquiry and Word Count (LIWC) dictionary, informal word usage and psychological traits in instant chat are analyzed, and the relation between informal words and personality is described. The Biterm Topic Model (BTM), a psychological distance questionnaire, and the Big Five personality questionnaire are used to measure personality and related variables. The informal-word dimensions are interpreted using the Simplified Chinese version of the LIWC dictionary and cognitive-linguistic usage. The numbers of factor loadings obtained by SPCA are more stable than those obtained by traditional principal component analysis (PCA), and the cumulative explained variance is higher (24.54% > 23.40%). Across the 6 dimensions, “subjective evaluation” was positively related to agreeableness (r0.05=.16, p=.03<0.05), “casual socializing” was negatively related to agreeableness (r0.05=-.16, p=.03<0.05), and “cognitive pleasure” was significantly positively related to gender (r0.05=.43, p=.00<0.001). These results suggest that SPCA performs better than PCA for dimension reduction in studies of this kind.
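To make the contrast between PCA and SPCA concrete, the following minimal sketch extracts one leading component from synthetic data, once densely and once with sparsity. It is an illustration under stated assumptions, not the procedure used in this paper: it uses a soft-thresholded power iteration rather than the elastic-net SPCA formulation of Zou, Hastie and Tibshirani, and the data, the penalty `lam`, and all function names are hypothetical.

```python
import math
import random


def center(X):
    """Subtract each column's mean from the data matrix X (n rows, p columns)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def covariance(X):
    """Sample covariance matrix of already-centered data."""
    n, p = len(X), len(X[0])
    return [[sum(r[i] * r[j] for r in X) / (n - 1) for j in range(p)]
            for i in range(p)]


def soft_threshold(v, lam):
    """Shrink every entry toward zero by lam; entries below lam become zero."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in v]


def normalize(v):
    """Scale v to unit length (returns v unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v


def leading_component(C, lam=0.0, iters=200):
    """Power iteration on C. With lam = 0 this approximates the first PCA
    loading vector; with lam > 0 small loadings are zeroed at every step
    (an assumed, simplified stand-in for SPCA)."""
    p = len(C)
    v = normalize([1.0] * p)
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        w = soft_threshold(w, lam)
        if all(x == 0 for x in w):
            break  # penalty too strong: keep the previous direction
        v = normalize(w)
    return v


def explained_ratio(C, v):
    """Share of total variance captured along the unit vector v."""
    p = len(C)
    along_v = sum(v[i] * C[i][j] * v[j] for i in range(p) for j in range(p))
    return along_v / sum(C[i][i] for i in range(p))


# Hypothetical data: three variables share one latent factor, three are noise.
random.seed(0)
X = []
for _ in range(200):
    z = random.gauss(0, 1)
    X.append([z + random.gauss(0, 0.3) for _ in range(3)]
             + [random.gauss(0, 0.3) for _ in range(3)])

C = covariance(center(X))
dense = leading_component(C)            # PCA-like: every loading is non-zero
sparse = leading_component(C, lam=0.2)  # sparse: noise loadings are zeroed

print("dense loadings: ", [round(x, 3) for x in dense])
print("sparse loadings:", [round(x, 3) for x in sparse])
print("explained variance share (dense): ", round(explained_ratio(C, dense), 4))
print("explained variance share (sparse):", round(explained_ratio(C, sparse), 4))
```

In this toy setting the sparse vector loads only on the three variables that share the latent factor, which is the property that makes sparse components easier to label as interpretable dimensions. The sketch does not reproduce the explained-variance figures reported above.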