GRID audiovisual corpus

The GRID audiovisual sentence corpus



What is GRID?

GRID is a large multitalker audiovisual sentence corpus to support joint computational-behavioral studies in speech perception. In brief, the corpus consists of high-quality audio and video (facial) recordings of 1000 sentences spoken by each of 34 talkers (18 male, 16 female). Sentences are of the form "put red at G9 now".  The corpus, together with transcriptions, is freely available for research use. GRID is described in more detail in this paper.
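As an illustration of the sentence form quoted above, the following minimal sketch splits a sentence such as "put red at G9 now" into its parts. The function name `parse_grid_sentence` and the slot names (`command`, `color`, `preposition`, `letter`, `digit`, `adverb`) are hypothetical labels chosen for this example and are not taken from the corpus documentation. The transcription files distributed with the corpus may lay the words out differently.

```python
import re
from typing import NamedTuple


class GridSentence(NamedTuple):
    # Slot names are illustrative assumptions, not official corpus terminology.
    command: str
    color: str
    preposition: str
    letter: str
    digit: str
    adverb: str


# Three words, then a letter immediately followed by a digit, then a final word.
_PATTERN = re.compile(r"^(\w+)\s+(\w+)\s+(\w+)\s+([A-Za-z])(\d)\s+(\w+)$")


def parse_grid_sentence(text: str) -> GridSentence:
    """Split a sentence written like 'put red at G9 now' into six slots."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not in the expected sentence form: {text!r}")
    return GridSentence(*match.groups())


print(parse_grid_sentence("put red at G9 now"))
```

The pattern checks only the overall shape of the sentence; it does not validate the individual words against any vocabulary.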

Examples

talker audio only  video (normal)  video (high)  transcriptions
male download download download download
female download download download download

Downloading

Audio, video and other associated information such as word transcriptions are available separately for each talker.

Audio files were scaled on collection to have an absolute maximum amplitude value of 1 and downsampled to 25 kHz. These signals have been endpointed. In addition, the raw original 50 kHz signals are included below.
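The two processing steps mentioned above (scaling to an absolute maximum amplitude of 1 and halving the sampling rate from 50 kHz to 25 kHz) can be sketched as follows. This is an illustration only and not necessarily the procedure used to prepare the corpus: the windowed-sinc filter, the function names `peak_normalize` and `downsample_by_two`, and the `num_taps` parameter are all assumptions made for this example, and endpointing is not covered.

```python
import numpy as np


def peak_normalize(signal: np.ndarray) -> np.ndarray:
    """Scale a signal so that its largest absolute sample value is 1."""
    peak = np.max(np.abs(signal))
    if peak == 0:
        return signal.astype(float)  # silent input: nothing to scale
    return signal / peak


def downsample_by_two(signal: np.ndarray, num_taps: int = 101) -> np.ndarray:
    """Low-pass filter, then keep every second sample (e.g. 50 kHz -> 25 kHz).

    Assumes the signal is at least num_taps samples long.
    """
    # Windowed-sinc low-pass filter with its cutoff at a quarter of the input
    # sampling rate, which is the Nyquist frequency of the output signal.
    n = np.arange(num_taps) - (num_taps - 1) / 2
    taps = 0.5 * np.sinc(0.5 * n) * np.hamming(num_taps)
    taps /= taps.sum()  # unity gain at 0 Hz
    filtered = np.convolve(signal, taps, mode="same")
    return filtered[::2]


# Demonstration on a synthetic one-second, 440 Hz tone sampled at 50 kHz.
rate_in = 50_000
t = np.arange(rate_in) / rate_in
raw = 0.3 * np.sin(2 * np.pi * 440 * t)

scaled = peak_normalize(raw)
audio_25k = downsample_by_two(scaled)
print(len(raw), len(audio_25k))
```

Filtering before discarding samples prevents frequency content above 12.5 kHz from aliasing into the 25 kHz signal.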

Video files are provided in two formats: normal quality (360x288; ~1kbit/s) and high quality (720x576; ~6kbit/s). Due to a technical oversight, video for speaker 21 is not available.

talker  25 kHz endpointed audio (about 100M each)  raw 50 kHz audio (300M each)  video (normal) (480M each)  video (high, pt1) (1.2G each)  video (high, pt2) (1.2G each)  word alignments (190K each)
1 download download download download download download
2 download download download download download download
3 download download download download download download
4 download download download download download download
5 download download download download download download
6 download download download download download download
7 download download download download download download
8 download download download download download download
9 download download download download download download
10 download download download download download download
11 download download download download download download
12 download download download download download download
13 download download download download download download
14 download download download download download download
15 download download download download download download
16 download download download download download download
17 download download download download download download
18 download download download download download download
19 download download download download download download
20 download download download download download download
21 download download Oops! No video Oops! No video Oops! No video download
22 download download download download download download
23 download download download download download download
24 download download download download download download
25 download download download download download download
26 download download download download download download
27 download download download download download download
28 download download download download download download
29 download download download download download download
30 download download download download download download
31 download download download download download download
32 download download download download download download
33 download download download download download download
34 download download download download download download
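
For planning disk space, the approximate file sizes given in the table header can be summed per talker. The sketch below is a rough estimate under stated assumptions: the listed sizes are taken at face value, "1.2 G" is read as 1200 M, and talker 21 is treated as having no video files. The dictionary keys and function names are hypothetical and do not correspond to actual file names.

```python
# Approximate per-talker download sizes in megabytes, as listed in the table.
# Keys are illustrative labels, not real file names.
SIZES_MB = {
    "audio_25k_endpointed": 100,
    "audio_50k_raw": 300,
    "video_normal": 480,
    "video_high_pt1": 1200,  # "1.2 G" read as 1200 M (assumption)
    "video_high_pt2": 1200,
    "word_alignments": 0.19,  # "190 K"
}
VIDEO_KEYS = {"video_normal", "video_high_pt1", "video_high_pt2"}
TALKERS = range(1, 35)  # talkers 1 to 34
TALKERS_WITHOUT_VIDEO = {21}


def estimated_size_mb(talker: int, wanted=tuple(SIZES_MB)) -> float:
    """Estimate the download size in MB for one talker and the chosen items."""
    total = 0.0
    for key in wanted:
        if key in VIDEO_KEYS and talker in TALKERS_WITHOUT_VIDEO:
            continue  # no video is available for this talker
        total += SIZES_MB[key]
    return total


def estimated_total_mb(wanted=tuple(SIZES_MB)) -> float:
    """Estimate the download size in MB across all talkers."""
    return sum(estimated_size_mb(talker, wanted) for talker in TALKERS)


print(f"everything: about {estimated_total_mb() / 1000:.1f} GB")
print(f"25 kHz audio only: about {estimated_total_mb(['audio_25k_endpointed']) / 1000:.1f} GB")
```

Under these assumptions the complete corpus comes to roughly 109 GB, most of it high-quality video.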

Documentation

This paper describes the motivation for GRID and the details of its collection. Behavioural results for subsets of the GRID corpus are described in Barker and Cooke (2007) and Cooke et al. (2008). In addition, GRID sentences were used in the making of the 1st Speech Separation Challenge.

Credits

The following people contributed to the planning, development, collection, annotation and subsequent web-release of GRID: Jon Barker, Martin Cooke, Stuart Cunningham and Xu Shao.

This work was supported by a grant from the University of Sheffield Research Fund.


Last update:  18th March 2013 by Jon Barker