The University of Sheffield
Department of Computer Science

Nathan Mccoy MSc Dissertation 2015/16

Joint Multiword Expression and Supersense Tagging with Recurrent Neural Networks and Conditional Random Fields

Supervised by A. Vlachos

Abstract

Understanding human language is a difficult task, and several fields of study aim to explain and investigate the human language faculty. Linguistics, Psychology and Computer Science all use domain-specific tools to describe and model language. Natural Language Processing is the field that uses computational mechanisms to process naturally occurring human language. Modeling syntax gives language structure, but how do we model meaning? Using general sense classes, or “supersenses”, we can potentially enrich texts with semantic information.

Given a sentence with syntactic information and a closed set of semantic supersenses, can a supersense-tagged sentence be derived? Furthermore, can we demarcate the boundaries of Multiword Expressions?

The goal of this project is to produce sentences labelled with Multiword Expression boundaries and supersenses, by training on data annotated with word, Part-Of-Speech, Multiword Expression and Supersense tags. Such semantically tagged sentences can be used in many tasks, such as Question Answering, Information Retrieval, Discourse Analysis and Sentiment Analysis.