The University of Sheffield
Department of Computer Science

Hugo Young Undergraduate Dissertation 2015/16

The Domain Specificity of Sentiment Lexicons: an Investigation

Supervised by M. Hepple

Abstract

Sentiment analysis is a field of great interest to NLP researchers, businesses and politicians. In recent years, the explosion of social media and Web 2.0 platforms has made available vast quantities of data, making sentiment analysis both more feasible and more desirable as manual methods become insufficient. Sentiment analysis is, however, highly domain dependent, and tailoring sentiment analysis applications to specific domains by hand is a time-consuming task.

This paper aims to investigate and evaluate a method for the unsupervised induction of sentiment lexicons from large amounts of domain-separated data. We also aim to evaluate the performance differences, if any, between general and domain-specific lexicons at unsupervised lexical sentiment analysis.

The dataset used in this project comprises every publicly available Reddit comment from October 2007 to May 2015.