Summary |
This module introduces the principles of the emerging
field of speech technology, studies typical applications of
these principles and assesses the state of the art in this
area. Students will learn the prevailing techniques of
automatic speech recognition (based on statistical
modelling); will see how speech synthesis and text-to-speech
methods are deployed in spoken language systems; and will
discuss the current limitations of such devices. The module
will include project work involving the implementation and
assessment of a speech technology device. |
Session |
Spring 2024/25 |
Credits |
15 |
Assessment |
- Blackboard quizzes (threshold and graded)
- Practical work (graded)
|
Lecturer(s) |
Prof. Thomas Hain & Dr Anton Ragni |
Resources |
|
Aims |
- to teach the principles and application of speech
technology, covering speech recognition and synthesis
- to provide experience in building and using speech
technology devices.
|
Learning Outcomes |
By the end of this module, students should:
- appreciate the difficulties of machine perception in
general and speech perception in particular;
- understand the different types of speech technology in use today;
- understand the prevailing techniques for modelling speech in automatic speech
recognition;
- see how these techniques are deployed in spoken
language systems;
- appreciate the difficulties of producing synthetic
speech and understand the principles of speech
synthesisers and text-to-speech systems; and
- have experience in implementing and assessing a speech
technology device.
|
Content |
- Introduction to speech technology
- Pattern processing fundamentals for speech
- Hidden Markov Models and Deep Neural Networks for speech processing
- Towards a state-of-the-art ASR system
  - Acoustic modelling
  - Language modelling
  - Search
  - Adaptation
- Speech recognition application examples
- Speaker identification
- Speech synthesis
- Spoken dialogue systems
|
Restrictions |
This module is only open to students who have taken COM3502 or COM4502.
Optional modules within the department have limited capacity. We will always try to accommodate all students but cannot guarantee a place. |
Teaching Method |
Two formal lectures per week for 10 weeks, and a one-hour practical session per week for 6 weeks.
Practical work will consist of a project involving the
implementation and assessment of a speech technology device.
This will include some programming. |
Feedback |
Students will receive feedback in the weekly practical
sessions. |