Abstract
We report experiments on song lyrics using conventional natural language processing techniques. Lyrics carry much of a song's semantic content, so lyric analysis can complement acoustic processing of the audio. We examine the statistical properties of a lyrics corpus against Zipf's law, using both characters and words as units; the experiments show that both distributions largely follow Zipf's law. Using a vector space model representation, we can find sets of lyrics that are similar to one another. We also discuss how the time annotations in lyrics can support further analysis, such as detecting repeated segments in a song, segmenting rhythm, and retrieving songs. Preliminary experiments indicate that the proposed method is reasonably effective.
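The two measurements named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `zipf_slope` and `cosine_similarity` are hypothetical, and raw term counts are assumed as the vector weights because the abstract does not state the weighting scheme. A slope near -1 in the log-log rank-frequency fit is the usual sign that a distribution is close to Zipf's law.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Tokens may be characters or words. A slope close to -1 suggests
    the rank-frequency distribution is roughly Zipfian.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity of two lyrics as raw term-count vectors.

    Assumption: plain counts stand in for whatever term weighting
    the paper actually uses.
    """
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical toy lyrics, split into characters as the unit.
    lyric_a = list("月亮代表我的心")
    lyric_b = list("我的心里只有你")
    print(zipf_slope(lyric_a + lyric_b))
    print(cosine_similarity(lyric_a, lyric_b))
```

Passing `list(text)` treats each character as a unit. Passing the output of a word segmenter instead gives the word-level variant.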
Key words
computer application /
Chinese information processing /
song lyrics /
Zipf's law /
k-NN /
rhythm
Funding
Supported by the National Natural Science Foundation of China (60573187, 60621062, 60520130299)