Thomas Hain

Professor of Speech and Audio Technology

Machine Intelligence for Natural Interfaces
Speech and Hearing Research Group
Department of Computer Science

List of Publications
Google citation counts are retrieved automatically. The linked pages may show slightly different counts, as Google citation counts can go up or occasionally down.
2012

[1]  Al-Shareef, S. & Hain, T.(2012). "CRF-based Diacritisation of Colloquial Arabic for Automatic Speech Recognition" In Interspeech'12

[2]  Christensen, H., Cunningham, S., Fox, C., Green, P. & Hain, T.(2012). "A comparative study of adaptive, automatic recognition of disordered speech" In Interspeech'12

[3]  Fox, C., Christensen, H. & Hain, T.(2012). "Studio report: Linux audio for multi-speaker natural speech technology" In Linux Audio Conference

[4]  Gibson, M. & Hain, T.(2012). "Application of SVM-based Correctness Predictions to Unsupervised Discriminative Speaker Adaptation" In ICASSP'12

[5]  Gibson, M. & Hain, T.(2012). "Correctness-adjusted unsupervised discriminative acoustic model adaptation" In IEEE Transactions on Audio, Speech and Language Processing, Vol.

[6]  Hain, T. & Garner, P. N.(2012). "Speech Recognition" In The AMIDA Book, Cambridge University Press

[7]  Kamper, H., de Wet, F., Hain, T. & Niesler, T.(2012). "Resource Development and Experiments in Automatic South African Broadcast News Transcription" In SLTU'12

[8]  Lecorvé, G., Dines, J., Hain, T. & Motlicek, P.(2012). "Supervised and unsupervised Web-based language model domain adaptation" In Interspeech'12

[9]  Lecorvé, G., Dines, J., Hain, T. & Motlicek, P.(2012). "Impact du degré de supervision sur l'adaptation à un domaine d'un modèle de langage à partir du Web" In JEP'12

[10]  Ng, R. W. M., Hain, T. & Hirose, K.(2012). "An alignment matching method to explore pseudosyllable properties across different corpora" In Interspeech'12

2011

[11]  Al-Shareef, S. & Hain, T.(2011). "An Investigation in Speech Recognition for Colloquial Arabic" In Interspeech'11

[12]  Gibson, M. & Hain, T.(2011). "Confidence-informed unsupervised Minimum Bayes Risk acoustic model adaptation"

[13]  Hain, T., Burget, L., Dines, J., Garner, P. N., Grezl, F., el Hannani, A., Huijbregts, M., Karafiat, M., Lincoln, M. & Wan, V.(2011). "Transcribing meetings with the AMIDA systems" In IEEE Transactions on Audio, Speech and Language Processing, Vol.

[14]  el Hannani, A. & Hain, T.(2011). "Data Dependence of Speech Decoder Parameters"

[15]  Kempton, T., Moore, R. K. & Hain, T.(2011). "Cross-language phone recognition when the target language phoneme inventory is not known" In Interspeech'11

[16]  Marino, D. & Hain, T.(2011). "An Analysis of Automatic Speech Recognition with Multiple Microphones" In Interspeech'11

[17]  Tucker, R., Fry, D., Wan, V., Wrigley, S. & Hain, T.(2011). "Extending Audio Notetaker to Browse WebASR Transcriptions" In Interspeech'11

[18]  Wrigley, S. N. & Hain, T.(2011). "Web-based automatic speech recognition service - webASR" In Interspeech'11

[19]  Wrigley, S. N. & Hain, T.(2011). "Making an automatic speech recognition service freely available on the web" In Interspeech'11

2010

[20]  Gibson, M. & Hain, T.(2010). "Error Approximation and Minimum Phone Error Acoustic Model Estimation" In IEEE Transactions on Audio, Speech and Language Processing, Vol. 18 ,  pp. 1269-1279

[21]  Hain, T., Burget, L., Dines, J., Garner, P. N., el Hannani, A., Huijbregts, M., Karafiat, M., Lincoln, M. & Wan, V.(2010). "The AMIDA 2009 Meeting Transcription System" In Interspeech'10 ,  pp. 358-361

[22]  Hain, T. & Renals, S.(2010). "Meeting Recognition" In Tutorial Interspeech 2010

[23]  el Hannani, A. & Hain, T.(2010). "Automatic Optimization of Speech Decoder Parameters" In IEEE Signal Processing Letters, Vol. 17 ,  pp. 95-98

[24]  Renals, S. & Hain, T.(2010). "Speech Recognition" In The Handbook of Computational Linguistics and Natural Language Processing, Wiley-Blackwell ,  pp. 299-332

2009

[25]  Garner, P. N., Dines, J., Hain, T., el Hannani, A., Karafiat, M., Korchagin, D., Lincoln, M., Wan, V. & Zhang, L.(2009). "Real-Time ASR from Meetings" In Interspeech'09 ,  pp. 2119-2122

2008

[26]  Hain, T.(2008). "The Careful Listener: Speech Processing in Meetings" In Keynote PRASA'08

[27]  Hain, T., Burget, L., Dines, J., Garau, G., Karafiat, M., Lincoln, M., van Leeuwen, D. & Wan, V.(2008). "The 2007 AMI(DA) System for Meeting Transcription" In NIST Rich Transcription 2007, Lecture Notes in Computer Science, Springer ,  pp. 414-428

[28]  Hain, T., el Hannani, A., Wrigley, S. & Wan, V.(2008). "Automatic speech recognition for scientific purposes - webASR" In Interspeech'08 ,  pp. 504-507

[29]  Karafiat, M., Burget, L., Hain, T. & Cernocky, J.(2008). "Discriminative training of narrow band-wide band adapted systems for meeting recognition" In Interspeech'08 ,  pp. 1217-1220

[30]  Renals, S., Hain, T. & Bourlard, H.(2008). "Interpretation of Multiparty Meetings: the AMI and AMIDA Projects" In HSCMA'08

[31]  Wan, V., Dines, J., el Hannani, A. & Hain, T.(2008). "Bob: A lexicon and pronunciation dictionary generator" In SLT'08 ,  pp. 217

2007

[32]  Gibson, M. & Hain, T.(2007). "Temporal Masking for Unsupervised Minimum Bayes Risk Speaker Adaptation" In Interspeech'07

[33]  Hain, T.(2007). "The AMI Meeting Transcription System" In Seminar CUED

[34]  Hain, T., Burget, L., Dines, J., Garau, G., Karafiat, M., Lincoln, M., Wan, V. & Vepa, J.(2007). "The AMI System for the Transcription of Speech in Meetings" In ICASSP'07 ,  pp. 357-360

[35]  Karafiat, M., Burget, L., Hain, T. & Cernocky, J.(2007). "Application of CMLLR in narrow band wide band adapted systems" In Interspeech'07 ,  pp. 282-285

[36]  Renals, S., Hain, T. & Bourlard, H.(2007). "Recognition and Understanding of Meetings: The AMI and AMIDA Projects" In ASRU'07 ,  pp. 238-247

2006

[37]  Al-Hames, M., Hain, T., Cernocky, J., Schreiber, S., Poel, M., Muller, R., Marcel, S., van Leeuwen, D., Odobez, J.-M., Ba, S., Bourlard, H., Cardinaux, F., Gatica-Perez, D., Janin, A., Motlicek, P., Renals, S., van Rest, J., Rienks, R., Rigoll, G., Smith, K., Thean, A. & Zemcik, P.(2006). "Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers" In Machine Learning for Multimodal Interaction, Lecture Notes in Computer Science, Springer ,  pp. 24-35

[38]  Dines, J., Vepa, J. & Hain, T.(2006). "The segmentation of multi-channel meeting recordings for automatic speech recognition" In Interspeech'06

[39]  Gibson, M. & Hain, T.(2006). "Hypothesis Spaces For Minimum Bayes Risk Training In Large Vocabulary Speech Recognition" In Interspeech'06

[40]  Hain, T., Burget, L., Dines, J., Garau, G., Karafiat, M., Lincoln, M., Vepa, J. & Wan, V.(2006). "The AMI Meeting Transcription System: Progress and Performance" In NIST Rich Transcription 2006, Lecture Notes in Computer Science, Springer ,  pp. 419-431

[41]  Hain, T., Dines, J. & McCowan, I.(2006). "Conversational multi-party speech recognition using remote microphones"

[42]  Moore, D., Dines, J., Doss, M. M., Vepa, J., Cheng, O. & Hain, T.(2006). "Juicer: A weighted finite state transducer speech decoder" In Machine Learning for Multimodal Interaction, Lecture Notes in Computer Science, Springer ,  pp. 285-296

[43]  Uraga, E. & Hain, T.(2006). "Automatic Speech Recognition Experiments with Articulatory Data" In Interspeech'06

[44]  Wan, V. & Hain, T.(2006). "Strategies for Language Model Web-data Collection" In ICASSP'06

2005

[45]  Carletta, J., Ashby, S., Bourban, S., Guillemot, M., Kronenthal, M., Lathoud, G., Lincoln, M., McCowan, I., Hain, T., Kraaij, W., Post, W., Kadlec, J., Wellner, P., Flynn, M. & Reidsma, D.(2005). "The AMI Meeting Corpus: A Pre-announcement" In Machine Learning for Multimodal Interaction, Lecture Notes in Computer Science, Springer ,  pp. 28-39

[46]  Garau, G., Renals, S. & Hain, T.(2005). "Applying vocal tract length normalization to meeting recordings" In Interspeech'05

[47]  Hain, T.(2005). "Implicit modelling of pronunciation variation in automatic speech recognition" In Speech Communication, Vol. 46 ,  pp. 171-188

[48]  Hain, T., Burget, L., Dines, J., Garau, G., Karafiat, M., Lincoln, M., McCowan, I., Moore, D., Wan, V., Ordelman, R. & Renals, S.(2005). "The Development of the AMI System for the Transcription of Speech in Meetings" In Machine Learning for Multimodal Interaction, Lecture Notes in Computer Science, Springer ,  pp. 344-356

[49]  Hain, T., Burget, L., Dines, J., Garau, G., Karafiat, M., Lincoln, M., McCowan, I., Moore, D., Wan, V., Ordelman, R. & Renals, S.(2005). "The 2005 AMI System for the Transcription of Speech in Meetings" In NIST Rich Transcription 2005, Lecture Notes in Computer Science, Springer ,  pp. 450-462

[50]  Hain, T., Dines, J., Garau, G., Karafiat, M., Moore, D., Wan, V., Ordelman, R. & Renals, S.(2005). "Transcription of Conference Room Meetings: an Investigation" In Interspeech'05 ,  pp. 1661-1664

[51]  Hain, T., Woodland, P. C., Evermann, G., Gales, M. J. F., Liu, X., Moore, G. L., Povey, D. & Wang, L.(2005). "Automatic transcription of conversational telephone speech" In IEEE Transactions on Speech and Audio Processing, Vol. 13 ,  pp. 1173-1185

[52]  McCowan, I., Carletta, J., Kraaij, W., Ashby, S., Bourban, S., Flynn, M., Guillemot, M., Hain, T., Kadlec, J., Karaiskos, V., Kronenthal, M., Lathoud, G., Lincoln, M., Lisowska, A., Post, W., Reidsma, D. & Wellner, P.(2005). "The AMI Meeting Corpus" In 5th International Conference on Methods and Techniques in Behavioral Research

2004

[53]  Evermann, G., Chan, H. Y., Gales, M. J. F., Hain, T., Liu, X., Mrva, D., Wang, L. & Woodland, P. C.(2004). "Development of the 2003 CU-HTK conversational telephone speech transcription system" In ICASSP'04 ,  pp. 249-252

[54]  Kim, D. Y., Gales, M. J. F., Chan, H. Y., Woodland, P. C., Umesh, S. & Hain, T.(2004). "Progress in Broadcast News English Transcription" In EARS STT Technical Meeting 2004

[55]  Kim, D. Y., Umesh, S., Gales, M. J. F., Hain, T. & Woodland, P. C.(2004). "Using VTLN for Broadcast News Transcription" In ICSLP'04 ,  pp. 1953-1956

[56]  Woodland, P. C., Chan, H. Y., Evermann, G., Gales, M. J. F., Hain, T., Jia, B., Kim, D. Y., Liu, X., Mrva, D., Sim, K. C., Tranter, S. E. & Wang, L.(2004). "Cambridge STT Overview" In EARS Mid-year Meeting 2004

[57]  Young, S. J., Evermann, G., Gales, M. J. F., Hain, T., Kershaw, D., Moore, G. L., Odell, J. J., Ollason, D., Povey, D., Valtchev, V. & Woodland, P. C.(2004). "The HTK Book (from version 3.3)"

2003

[58]  Hain, T.(2003). "Single Pronunciation Dictionaries - Construction and Performance" In EARS STT Technical Meeting 2004

[59]  Hain, T., Woodland, P. C., Evermann, G., Liu, X., Moore, G. L., Povey, D. & Wang, L.(2003). "Automatic Transcription of Conversational Telephone Speech. Development of the CU-HTK 2002 System"

[60]  Jia, B., Sim, K. C., Gales, M. J. F., Hain, T., Liu, X., Woodland, P. C. & Yu, K.(2003). "CU-HTK RT-03 Mandarin CTS System" In Rich Transcription Workshop 2003

[61]  Kim, D. Y., Evermann, G., Hain, T., Mrva, D., Tranter, S. E., Wang, L. & Woodland, P. C.(2003). "Recent Advances in Broadcast News Transcription" In ASRU'03

[62]  Kim, D. Y., Evermann, G., Hain, T., Mrva, D., Tranter, S. E., Wang, L. & Woodland, P. C.(2003). "2003 CU-HTK Broadcast News English System Development" In Rich Transcription Workshop 2003

[63]  Woodland, P. C., Chan, H. Y., Evermann, G., Gales, M. J. F., Hain, T., Kim, D. Y., Liu, X., Mrva, D., Povey, D., Tranter, S. E., Wang, L. & Yu, K.(2003). "2003 CU-HTK English CTS Systems" In Rich Transcription Workshop 2003

[64]  Woodland, P. C., Evermann, G., Gales, M. J. F., Hain, T., Chan, H. Y., Jia, B., Kim, D. Y., Liu, X., Mrva, D., Povey, D., Sim, K. C., Tomalin, M., Tranter, S. E., Wang, L. & Yu, K.(2003). "Recent Experiments with HTK Broadcast News and Conversational Telephone Systems" In EARS Mid-year Meeting 2003

2002

[65]  Hain, T.(2002). "Implicit Pronunciation Modelling in ASR" In ITRW PMLA 2002

[66]  Woodland, P. C., Evermann, G., Gales, M. J. F., Hain, T., Liu, X., Moore, G. L., Povey, D. & Wang, L.(2002). "CU-HTK April 2002 Switchboard System" In Rich Transcription Workshop 2002

2001

[67]  Hain, T.(2001). "Hidden Model Sequence Models for Automatic Speech Recognition" PhD Thesis, Cambridge University

[68]  Hain, T., Woodland, P. C., Evermann, G. & Povey, D.(2001). "New features in the CU-HTK system for transcription of conversational telephone speech" In ICASSP'01 ,  pp. 57-60

2000

[69]  Hain, T. & Woodland, P. C.(2000). "Modelling sub-phone insertions and deletions in continuous speech recognition" In ICSLP 2000

[70]  Hain, T., Woodland, P. C., Evermann, G. & Povey, D.(2000). "The CU-HTK March 2000 HUB5E Transcription System" In Speech Transcription Workshop 2000

1999

[71]  Hain, T. & Woodland, P. C.(1999). "Dynamic HMM selection for continuous speech recognition" In Eurospeech'99 ,  pp. 1327-1330

[72]  Hain, T. & Woodland, P. C.(1999). "Recent Experiments with the CU-HTK Hub5 System" In Hub5 Workshop'99

[73]  Hain, T. & Woodland, P. C.(1999). "Hidden model sequences" In Hub5 Workshop'99

[74]  Hain, T., Woodland, P. C., Niesler, T. R. & Whittaker, E. W. D.(1999). "The 1998 HTK system for transcription of conversational telephone speech" In ICASSP'99 ,  pp. 57-60

[75]  Odell, J. J., Woodland, P. C. & Hain, T.(1999). "The CUHTK-Entropic 10xRT Broadcast News Transcription System" In 1999 DARPA Broadcast News Transcription and Understanding Workshop ,  pp. 271-275

[76]  Woodland, P. C., Hain, T., Moore, G. L., Niesler, T. R., Povey, D., Tuerk, A. & Whittaker, E. W. D.(1999). "The 1998 HTK Broadcast News Transcription System: Development and Results" In 1999 DARPA Broadcast News Transcription and Understanding Workshop

[77]  Woodland, P. C., Odell, J. J., Hain, T., Moore, G. L., Niesler, T. R., Tuerk, A. & Whittaker, E. W. D.(1999). "Improvements in Accuracy and Speed in the HTK Broadcast News Transcription System" In Eurospeech'99

1998

[78]  Hain, T., Johnson, S. E., Tuerk, A., Woodland, P. C. & Young, S. J.(1998). "Segment Generation and Clustering in the HTK Broadcast News Transcription System" In 1998 DARPA Broadcast News Transcription and Understanding Workshop ,  pp. 133-137

[79]  Hain, T. & Woodland, P. C.(1998). "CU-HTK Acoustic modeling experiments" In Hub5 Workshop 98

[80]  Hain, T. & Woodland, P. C.(1998). "Segmentation and Classification of Broadcast News Audio" In ICSLP'98

[81]  Woodland, P. C., Hain, T., Johnson, S. E., Niesler, T. R., Tuerk, A., Whittaker, E. W. D. & Young, S. J.(1998). "The 1997 HTK Broadcast News Transcription System" In 1998 DARPA Broadcast News Transcription and Understanding Workshop ,  pp. 41-48

[82]  Woodland, P. C., Hain, T., Johnson, S. E., Niesler, T. R., Tuerk, A. & Young, S. J.(1998). "Experiments in Broadcast News Transcription" In ICASSP'98 ,  pp. 909-912

Before 1998

[83]  Hain, T.(1993). "On the Use of Iterated Function Systems for Coding of Grayscale Images" PhD Thesis, Univ. of Technology Vienna

[84]  Huertgen, B. & Hain, T.(1994). "On the convergence of fractal transforms" In ICASSP'94 ,  pp. 561-564