The University of Sheffield
Department of Computer Science

Jack Deadman Undergraduate Dissertation 2016/17

Sound Event Detection in Real Life Audio

Supervised by J. Barker

Abstract

The real world is full of sound events: cars passing, glass breaking, birds tweeting. This report presents a polyphonic sound event detection system that detects these events in real-life audio. The system is compared against the state-of-the-art techniques used in the 2016 DCASE challenge, using segment-based Error Rate and F-Score computed over 1-second segments. The project experimented with novel feature extraction and data augmentation techniques to find where performance benefits could be achieved. With these techniques, the system showed a considerable improvement over the baseline system on a completely new dataset, achieving an Error Rate of 0.76 compared with the baseline system's higher Error Rate of 0.96.