See my biography here
I have over thirty-five years' experience in speech technology R&D, and have worked in government, industry and academia. Much of my research has been based on insights derived from human speech perception and production. In the 1970s, I introduced the 'Human Equivalent Noise Ratio' (HENR) as a vocabulary-independent measure of the goodness of an automatic speech recogniser based on a computational model of human word recognition. In the 1980s, I published 'HMM Decomposition' - a powerful method for recognising multiple simultaneous signals (such as speech in noise) - based on observed properties of the human auditory system. During the 1990s and more recently, I've continued to champion the need to understand the similarities and differences between human and machine spoken language behaviour.
Since joining the Speech and Hearing research group at Sheffield I've embarked on research that is aimed at developing computational models of spoken language processing by mind and machine, and I'm currently working on a unified theory of spoken language processing in the general area of 'Cognitive Informatics' called 'PRESENCE' (PREdictive SENsorimotor Control and Emulation). PRESENCE weaves together accounts from a wide variety of disciplines concerned with the behaviour of living systems - many of them outside the normal realms of spoken language - and compiles them into a new framework that is intended to breathe life into a new generation of research into spoken language processing by mind and machine, including human-robot interaction.
At the more applied end of research, I'm a member of the Sheffield Clinical Applications of Speech Technology (CAST) multidisciplinary research team, and I'm also becoming increasingly involved in creative applications of speech technology through interactions with colleagues from the performing arts (in particular Dr. Chris Newell, John Avery, Jocelyn Cammack, Dr. Cathy Lane, James Wilkes and Anna Barham). I am a founding member of the Sheffield Centre for Robotics (SCentRo).
In pursuing PRESENCE-based approaches to modelling spoken language, I've become increasingly drawn into studying vocalisation in general, whether it is performed by human beings, animals or robots. I'm currently developing synthesisers for mammalian, insect and dolphin vocalisations, and embedding them in behavioural simulations implemented in Pure Data and in real-time embodiments using e-puck and Create robots. I'm also working with Dr. Peter Wallis on vocal imitation using a musical instrument. My aim is to demonstrate that many of the little-understood paralinguistic features exhibited in human speech (including emotion) are derived from characteristics that are shared by living systems in general. Modelling such behaviours in this wider context should eventually enable us to implement usable and effective interaction with artificial intentional agents such as robots.
In order to provide a focus for this line of research, I've recently set up VILab - the 'Vocal Interactivity Laboratory'.
I'm very interested in collaborating with other colleagues and groups in any areas relating to vocal interactivity, ranging from the integration of state-of-the-art speech technology (recognition and synthesis) into embodied systems, to fundamental research into emergent coactive/communicative behaviour and vocal affordances.
I'm currently supervising the following PhD research students:
... and recently successful PhD students:
I've also acted as advisor to the following successful PhD research students:
Dr. Robert Kirchner from the Linguistics Department at the University of Alberta spent a sabbatical in the group in 2007/8. Robert is concerned with the extent to which phonological patterns and alternations can be accounted for in terms of an interplay of phonetic (articulatory and perceptual) considerations.
Dr. Odette Scharenborg from the University of Nijmegen visited for 11 months in 2006/7 on a post-doctoral fellowship. Odette worked on 'modelling the influence of subphonemic cues on lexical activation in human speech recognition using techniques from automatic speech recognition'.
Dr. Simon Worgan was an EPSRC/University Doctoral Prize Fellow during 2009/10 working on combining double-weak direct realism with the PRESENCE architecture in order to model speech as an intentional, unbroken cycle of production and perception.
Note to potential project/dissertation students: I'm happy to consider supervising your own well-thought-out project in spoken language processing or vocal interactivity. Contact me if you have an idea you'd like to develop.
To see what I get up to in my (limited) spare time, have a look here and here, and you can check the current weather in Castleton, Derbyshire here (powered by a Raspberry Pi).
Or you can follow me on Twitter: