The University of Sheffield
Department of Computer Science

Meshael Sultan MSc Dissertation 2005/06

"Multiple Choice Question Answering"

Supervised by Dr L Guthrie

Abstract

People rely on information for performing the activities of their daily lives. Much of this information can be found quickly and accurately online through the World Wide Web. As the amount of online information grows, better techniques to access the information are necessary. Techniques for answering specific questions are in high demand and large research programmes investigating methods for question answering have been developed to respond to this need.

Question Answering (QA) technology, however, faces some problems that inhibit its advancement. Typical approaches first generate many candidate answers for each question, then attempt to select the correct answer from that set. The techniques for selecting the correct answer are in their infancy, and further techniques are needed to choose the correct answer from among the candidates.

This project focuses on multiple choice questions and the development of techniques for automatically finding the correct answer. In addition to being a novel and interesting problem in its own right, the investigation has identified methods that web-based Question Answering (QA) technology can use to select the correct answer from potential candidate answers. The project investigated both manual and automatic techniques. The data consist of 600 questions collected from an online web resource. They are classified into 6 categories according to the questions' domain and divided equally between the investigation and evaluation stages. The manual experiments were promising: 45 percent of the answers were correct, which increased to 95 percent after the form of the queries was restructured. With automatic techniques, such as using quotation marks and replacing the question words according to the question type, the accuracy ranged between 48.5 and 49 percent. The accuracy also increased to 63 percent and 74 percent in some categories, such as geography and literature.
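
As an illustration only, the two automatic query techniques named in the abstract (quotation marks and question-word replacement) might be sketched as follows. The abstract does not specify the actual rules, so the function names, the list of question words, and the simple "substitute the choice for the question word" rule are all assumptions of this sketch. The hit_count argument is a hypothetical caller-supplied function that returns the number of search results for a query; no real search API is assumed.

```python
from typing import Callable, Sequence

# Assumed set of question words; the dissertation's own list may differ.
QUESTION_WORDS = ("who", "what", "which", "where", "when")


def build_query(question: str, choice: str) -> str:
    """Form a search query for one answer choice (illustrative rule only).

    If the question starts with a known question word, that word is replaced
    by the choice and the resulting statement is wrapped in quotation marks.
    Otherwise the question text is followed by the quoted choice.
    """
    text = question.strip().rstrip("?").strip()
    first, _, rest = text.partition(" ")
    if first.lower() in QUESTION_WORDS and rest:
        return f'"{choice} {rest}"'
    return f'{text} "{choice}"'


def pick_answer(
    question: str,
    choices: Sequence[str],
    hit_count: Callable[[str], int],
) -> str:
    """Return the choice whose query gets the most hits.

    hit_count is a hypothetical stand-in for a web search hit counter.
    """
    return max(choices, key=lambda c: hit_count(build_query(question, c)))
```

For example, under these assumptions build_query("Who wrote Hamlet?", "Shakespeare") yields the quoted phrase "Shakespeare wrote Hamlet", and pick_answer selects whichever choice produces the query with the largest hit count.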