The University of Sheffield
Department of Computer Science

Gavin Bones Undergraduate Dissertation 2005/06

"Computerised Determination of Disputed Authorship"

Supervised by Dr MR Hepple

Abstract

Automatic authorship attribution has a number of real-world uses, including resolving disputes over the origin of historical literature, detecting plagiarism, intercepting hidden messages in correspondence, and supporting work in forensic linguistics.

This project aims to explore techniques for the automatic attribution of text authorship. It investigates existing text classification methods with a view to building a simple and reliable system that categorises documents into a number of known classes by author. Machine learning concepts will also be employed in the construction and training of the text classifier.

The system will be applied to popular works of fiction taken from the catalogue of Project Gutenberg.