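【Speaker】刘耀强, Associate Professor, City University of Hong Kong
【Title】A Pseudo-Labeling Method for Constructing Domain-Specific Sentiment Lexicons and Its Financial Applications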
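【Time】2013-4-11 (Thursday) 10:30-11:30
【Venue】Room 453, Weilun Building, School of Economics and Management, Tsinghua University
【Language】English
【Organizer】Department of Management Science and Engineering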
【Abstract】Sentiment lexicons are widely used in sentiment analysis. However, constructing domain-specific sentiment lexicons by hand is extremely time-consuming, and it may not even be feasible in domains where linguistic expertise is unavailable. The automatic construction of domain-specific sentiment lexicons has therefore become an active research topic in recent years. This presentation discusses our semi-supervised learning method, which exploits the “distributional characteristic” of sentiments in labeled or unlabeled corpora to construct domain-specific sentiment lexicons. More specifically, the proposed two-pass “pseudo labeling” algorithm combines shallow linguistic parsing with corpus-based statistical learning, so that sentiment lexicon learning scales to the sheer volume of opinionated documents now archived on the Internet. In a subjective assessment by human experts, the automatically constructed domain-specific sentiment lexicons were judged to be of high quality. In an objective document-level polarity prediction task, our domain-specific sentiment lexicons outperform well-known baseline methods. Finally, applications of our domain-specific sentiment lexicons to financial prediction tasks are highlighted, and the business implications of the research are discussed.