The University of Sheffield
Department of Computer Science

Xu Shao MSc Dissertation 2000/01

"Reconstructing Occluded Speech"

Supervised by P. Green

Abstract

Considerable progress has been made in Automatic Speech Recognition (ASR) in recent years, and commercial ASR software has been developed. Such software makes communication between humans and computer-based devices easier. However, ASR still has limitations, especially in noisy environments.

Researchers have proposed several approaches to robust speech recognition and to handling missing data, such as Auditory Scene Analysis [1], mean-imputation-based methods [3,4] and marginalization-based methods [3,4]. Bhiksha Raj et al. introduced a geometric-based algorithm, a cluster-based algorithm and a correlation-based inference method for handling missing spectrographic features.

This project implemented algorithms, based on the ideas of Raj et al., that reconstruct the missing data so that the reconstructed data can be recognised in the normal way. The algorithms were tested both on data with random deletions and on data corrupted by real noise.

The geometric-based algorithm improves the Recognition Accuracy Rate (RAR) on data with random deletions, but not on data corrupted by real noise. The cluster-based algorithm improves the RAR to some degree in both conditions, depending on the codebook size and the classification algorithm used.
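
As a rough illustration of the random-deletion test condition and of the general idea of imputing missing spectrographic values, the following minimal Python sketch may be helpful. It is not the implementation used in this project and it is not one of the algorithms of Raj et al.: it fills each deleted cell with the mean of the reliable values in the same frequency band, which is a deliberately simplified stand-in for the mean-imputation idea and not necessarily the formulation in [3,4]. The function names, the list-of-lists spectrogram layout (frames by frequency bands) and the `deletion_rate` parameter are all assumptions made for this example.

```python
import random


def random_deletion_mask(n_frames, n_bands, deletion_rate, seed=0):
    """Build a reliability mask: True = value kept, False = value deleted.

    Hypothetical helper; each cell is deleted independently with
    probability `deletion_rate`.
    """
    rng = random.Random(seed)
    return [
        [rng.random() >= deletion_rate for _ in range(n_bands)]
        for _ in range(n_frames)
    ]


def mean_impute(spectrogram, mask):
    """Fill deleted cells with the mean of reliable cells in the same band.

    Simplified assumption: one global mean per frequency band, computed
    from the reliable values only. Bands with no reliable values fall
    back to 0.0.
    """
    n_frames = len(spectrogram)
    n_bands = len(spectrogram[0])
    result = [row[:] for row in spectrogram]  # copy; leave the input untouched
    for b in range(n_bands):
        present = [spectrogram[t][b] for t in range(n_frames) if mask[t][b]]
        band_mean = sum(present) / len(present) if present else 0.0
        for t in range(n_frames):
            if not mask[t][b]:
                result[t][b] = band_mean
    return result


if __name__ == "__main__":
    # Toy 3-frame, 2-band "spectrogram" with a hand-written mask.
    spec = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    mask = [[True, True], [False, True], [True, False]]
    print(mean_impute(spec, mask))
```

In a full system the reconstructed spectrogram would then be passed to a conventional recogniser, and the RAR compared against the result for the unreconstructed data.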