COM6511 Speech Technology

School of Computer Science

Summary	This module introduces the principles of the emergent field of speech technology, studies typical applications of these principles and assesses the state of the art in this area. You will learn the prevailing techniques of automatic speech recognition (based on statistical modelling); will see how speech synthesis and text-to-speech methods are deployed in spoken language systems; and will discuss the current limitations of such devices. The module will include project work involving the implementation and assessment of a speech technology device.
Session	Spring 2024/25
Credits	15
Assessment	Blackboard quizzes (threshold and graded); practical work (graded)
Lecturer(s)	Prof. Thomas Hain & Dr Anton Ragni
Resources	Blackboard; unconfirmed practical marks (when available); exam papers, past 2 years (where applicable)
Aims	To teach the principles and application of speech technology, covering speech recognition and synthesis; and to provide experience in building and using speech technology devices.
Learning Outcomes	By the end of this course, students should: appreciate the difficulties of machine perception in general and speech perception in particular; understand the different types of speech technology in use today; understand the prevailing techniques for modelling speech in automatic speech recognition; see how these techniques are deployed in spoken language systems; appreciate the difficulties of producing synthetic speech and understand the principles of speech synthesisers and text-to-speech systems; and have experience in implementing and assessing a speech technology device.
Content	Introduction to speech technology; Pattern processing fundamentals for speech; Hidden Markov Models and Deep Neural Networks for speech processing; Towards a state-of-the-art ASR system; Acoustic modelling; Language modelling; Search; Adaptation; Speech recognition application examples; Speaker identification; Speech synthesis; Spoken dialogue systems
Restrictions	Students must have taken COM6502 in the previous semester. Optional modules within the department have limited capacity. We will always try to accommodate all students but cannot guarantee a place.
Teaching Method	Two formal lectures per week for 10 weeks, and a one-hour practical session per week for 6 weeks. Practical work will consist of a project involving the implementation and assessment of a speech technology device. This will include some programming.
Feedback	Students will receive feedback in the weekly practical sessions.