Gerardo Roa Dabike MSc Dissertation 2015/16

School of Computer Science

Automatic Speech Recognition in Music

Supervised by J. Barker

Abstract

Automatic speech recognition (ASR) in music is a little-studied problem whose solution could benefit creative and retail business applications. This project experiments on a musical corpus with synthetically augmented training data and DNN methodologies to determine whether these approaches can improve recognizer performance. Previous research used speaker-to-singer adaptation rather than training on a singing database. First, a novel corpus, ACOMUS1, based on acoustic cover music, is created; several audio augmentation experiments are then conducted to try to reach high performance and to obtain information that could guide further research. The experiments yielded poor performance, reaching a word error rate (WER) of 86% using DNN sMBR; nevertheless, the ACOMUS1 corpus showed high growth potential that would allow further research on it.