Emotion Datasets

The ERTK repository provides processing scripts and metadata for a large number of emotional speech datasets. The scripts currently handle audio data only, but in future we intend to support video data as well.

Currently supported datasets:

AESDD
ASED
BAVED
CaFE
CMU-MOSEI
CREMA-D
DEMoS
EESC
EMO-DB
EmoFilm
EmoryNLP
EmoV-DB
EMOVO
ESD
eNTERFACE
IEMOCAP
JL-corpus
MELD
MESD
MESS
MLendSND
MSP-IMPROV
MSP-Podcast
Oréau
Portuguese
RAVDESS
SAVEE
SEMAINE
ShEMO
SmartKom
SUBESCO
TESS
URDU
VENEC (Public subset). The full dataset is available on request.
VIVAE

Standardising datasets

Each dataset has its own subdirectory in the datasets directory. This subdirectory contains a process.py script that takes the path to the original dataset and converts it to a simple standard format consisting of annotation CSVs and audio files. The script also generates a corpus.yaml file containing metadata about the dataset.

You can run the process.py script for a dataset, such as EMO-DB, as follows:

cd datasets/EMO-DB
python process.py /path/to/EMO-DB
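
If you have several datasets to standardise, the two commands above can be repeated from a small driver script. The following is a minimal sketch, not part of ERTK: the function names process_datasets and build_command, the sources mapping, and the dry_run flag are all hypothetical. It assumes only what is described above, namely that each dataset subdirectory contains a process.py script that takes the original dataset path as its single argument.

```python
import subprocess
import sys
from pathlib import Path


def build_command(source_path):
    """Return the command that runs process.py on one original dataset path."""
    # Use the current interpreter so the same environment is used for every dataset.
    return [sys.executable, "process.py", str(source_path)]


def process_datasets(datasets_root, sources, dry_run=True):
    """Run process.py inside each dataset subdirectory (hypothetical helper).

    `sources` maps a dataset name (e.g. "EMO-DB") to the path of the
    original, unprocessed data. Returns the (working directory, command)
    pairs, whether or not they were actually executed.
    """
    planned = []
    for name, source_path in sources.items():
        dataset_dir = Path(datasets_root) / name
        cmd = build_command(source_path)
        planned.append((dataset_dir, cmd))
        if not dry_run:
            # Equivalent to: cd datasets/<name> && python process.py <path>
            subprocess.run(cmd, cwd=dataset_dir, check=True)
    return planned


if __name__ == "__main__":
    # Placeholder path; replace with the location of your own copy.
    sources = {"EMO-DB": "/path/to/EMO-DB"}
    for cwd, cmd in process_datasets("datasets", sources, dry_run=True):
        print(cwd, " ".join(cmd))
```

With dry_run=True the script only prints what it would run; set dry_run=False to execute the commands. Because each command runs from inside the dataset subdirectory, the original dataset paths should be absolute.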