The University of Sheffield
Department of Computer Science

Christopher Murray Undergraduate Dissertation 2005/06

"A Java Tool for Transformation-Based Learning"

Supervised by Dr MR Hepple

Abstract

Transformation-Based Learning (TBL) is a rule-based machine learning technique suitable for application to a range of classification tasks. The approach involves the generation of a sequence of context-sensitive "transformation rules" which may be used to correct errors in the initial tagging of a corpus to improve the accuracy of the classification. Application of the technique typically requires the production of a custom implementation adapted for the specific task.
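
As a rough illustration of how an ordered sequence of transformation rules corrects an initial tagging, the following Java fragment is a minimal sketch only. It is not the design of the tool described in this dissertation. The class names, the single rule template (change one tag to another when the preceding tag matches) and the example tags are all assumptions chosen for illustration, and the rule-learning step is omitted entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TblSketch {

    /** Hypothetical rule template: change tag `from` to `to` when the previous tag is `prev`. */
    static final class Rule {
        final String from;
        final String to;
        final String prev;

        Rule(String from, String to, String prev) {
            this.from = from;
            this.to = to;
            this.prev = prev;
        }

        /** Applies the rule left to right; each change is visible to later positions. */
        void apply(List<String> tags) {
            for (int i = 1; i < tags.size(); i++) {
                if (tags.get(i).equals(from) && tags.get(i - 1).equals(prev)) {
                    tags.set(i, to);
                }
            }
        }
    }

    /** Applies the rules in order to a copy of the initial tagging. */
    static List<String> applyRules(List<String> initialTags, List<Rule> rules) {
        List<String> tags = new ArrayList<>(initialTags);
        for (Rule rule : rules) {
            rule.apply(tags);
        }
        return tags;
    }

    /** Counts positions where the tagging disagrees with the reference tagging. */
    static int errors(List<String> tags, List<String> reference) {
        int count = 0;
        for (int i = 0; i < tags.size(); i++) {
            if (!tags.get(i).equals(reference.get(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed example: "expected to race tomorrow", with "race" initially mistagged as a noun.
        List<String> initial = Arrays.asList("VBN", "TO", "NN", "NN");
        List<String> reference = Arrays.asList("VBN", "TO", "VB", "NN");
        List<Rule> rules = Arrays.asList(new Rule("NN", "VB", "TO"));

        List<String> corrected = applyRules(initial, rules);
        System.out.println("errors before: " + errors(initial, reference));
        System.out.println("errors after:  " + errors(corrected, reference));
    }
}
```

In this sketch the single rule corrects the one error in the initial tagging. A learner would instead select such rules one at a time, each chosen to reduce the remaining errors against a reference tagging, and record them in the order in which they were selected.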

The objective of this project is to develop a Java implementation of a generic TBL tool that will support research into the use of TBL for various tasks in NLP and other fields. The tool must enable application of the technique to any task without requiring significant adaptation. The tool must also provide the flexibility to modify and plug in different versions of its functional components to enable further development. The primary aim of this project is not research of transformation-based learning, but rather the development of a tool that will support such research.

This project begins with an investigation of the requirements of various tasks to which TBL has been applied, and an examination of the features provided by existing generic TBL tools, with the intention of identifying the potential requirements of a generic TBL tool. The tool is then specified, designed, implemented and tested. Finally, the results of this project, primarily the eXtensible Transformation-Based Learning (XTBL) tool, are presented with critical discussion and proposals for further development.