The University of Sheffield
Department of Computer Science

Thomas Ashford Undergraduate Dissertation 2000/01

"Computerised Determination of Disputed Authorship"

Supervised by Y. Wilks

Abstract

The problem of attributing authorship to pieces of text was known long before the computer was invented, but it has only recently become of interest to researchers in natural language processing, as modern computing power allows large corpora of text to be analysed in a short period of time. Most authorship disputes concern classical works of literature, but disputes also arise in modern, especially forensic, cases. There are various automated stylometric methods of attributing authorship to texts, all of which are examined in this thesis; one particular method, the cusum method, is the focus, and a detailed examination and computerised implementation of it are undertaken.

The cusum method is then tested on a popular authorship debate: the authorship of Henry VIII, attributed to Shakespeare. A number of possible authors are tested and detailed results are produced. Although these results are not conclusive in identifying the author of the play, they lead to a conclusion that raises questions over the validity of the method and of the field of automated authorship attribution as a whole.

Related Topics:

  • The Cusum Method
  • Authorship Attribution
  • Stylometry