Voice

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2023 | ChoralSynth | | synthesized dataset of 20 multitrack choral songs |
| 2022 | M4Singer | | 20 singers covering 700 pop songs in Mandarin with annotated scores |
| 2021 | Schubert Winterreise Dataset | | 9 recordings (only 2 included) of Winterreise with scores and annotations |
| 2021 | Vocadito | | 40 excerpts of solo singing annotated with pitch |
| 2020 | Erkomaishvili | | 7 hours of Georgian Chant, XML, Audio, Onsets, Pitch |
| 2018 | VocalSet | | 10.1 hours of singers demonstrating 17 vocal techniques on 5 vowels |
| 2018 | DAMP-MVP | | recordings of sung karaoke tracks, lyrics text files, and metadata |
| 2017 | MAST Melody | | 1018 vocal performance assessments of 40 distinct melodies |
| 2011 | MIR-1k | | 5000 Chinese pop songs (8 females, 11 males) with "mixture" and backing tracks |


Piano

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2023 | Batik_plays_Mozart | | 12 complete Mozart Piano Sonatas. MIDI, harmony, cadence and phrase annotations |
| 2023 | nASAP | | Note Aligned Scores with MAESTRO |
| 2021 | ACPAS | | Aligned audio and scores for classical piano music |
| 2021 | GiantMIDI Piano | | 1237 hours of transcribed piano solos in MIDI |
| 2020 | ASAP | | Beat Aligned Scores with MAESTRO |
| 2018 | MAESTRO | | 200 hours of solo piano performances with aligned audio and MIDI |
| 2010 | MAPS | | MIDI-aligned piano sounds |

Guitar

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2023 | Guitar Style Dataset | | electric guitar with various playing techniques |
| 2018 | GuitarSet | | acoustic guitar recordings with MIDI (360 samples, 30 seconds each) |

Strings

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2023 | Virtuoso Strings | | recordings of strings (1-4 players) with onset annotations |
| 2020(?) | Orchidea viola da gamba | | viola da gamba (goes to Orchidea page) |

Misc. Ensemble

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2022 | CocoChorales | | recordings/MIDI/annotations of Bach-ish generated chorales for different SATB ensembles |

Winds

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2020(?) | Orchidea multiphonics | | winds multiphonics (goes to Orchidea page) |
| 2019 | CBFdataset | | Chinese bamboo flute |

Percussion

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2020(?) | Orchidea tam tam | | tam tam with different playing styles (goes to Orchidea page) |

Instruments

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2020 | OrchideaSOL | | tiny and medium versions of SOL orchestral instruments collection (needs Ircam Forum subscription) |
| 2020 | FullSOL (Paid) | | big version of SOL orchestral instruments collection (needs Premium Ircam Forum subscription) |
| 2019 | Medley-SolosDB | | three-second audio clips from MedleyDB and 126 solo recordings from Joder et al. |
| 2017 | MDB-stem-synth | | 230 "solo stems" from MedleyDB (resynthesized instruments/voices) |
| 2017 | NSynth | | 300k+ annotated musical notes from 1000+ instruments |
| 2016 | GOOD-SOUNDS | | single notes and scales of 12 instruments by 15 musicians |
| 2014 | MedleyDB | | multitrack recordings for melody extraction and instrument recognition |
| 2012 | IRMAS | | 10 musical instruments |
| 2003 | RWC | | 50 musical instruments (paid) |

Orchestration

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2017 | Live Orchestral Piano | | 196 pairs of piano scores and corresponding orchestration |

Score

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2025 | PDMX | | 250K Public Domain MusicXML scores from MuseScore |
| 2021 | Annotated Mozart Sonatas | | Scores, chord labels and cadence labels for Mozart's 18 piano sonatas |
| 2021 | PrIMuS | | 87678 incipits in MIDI, MEI, PNG, agnostic, and semantic forms |
| 2016 | MusicNet | | 330 classical recordings with 1M labels for instrument, timing, and position |
| 2014 | Schenker41 | | MusicXML of excerpts with Schenkerian Analysis |
| 2012(?) | Nottingham Folk Song | | MIDI and ABC notation of folk songs |

Lyrics

| Year | Name | Data | Description |
| --- | --- | --- | --- |

Emotion

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2025 | XMIDI | | 100K MIDI files (5K hours) with emotion and genre labels |
| 2023 | MERP | | 50 songs annotated with valence/arousal quadrants |
| 2021 | EMOPIA | | 1087 excerpts from 387 songs annotated with valence/arousal quadrants |
| 2019 | VGMIDI | | 200 video game scores with MIDI, labeled by valence/arousal quadrant |
| 2019 | NMED-RP | | EEG data of 5 adults listening to 16 different 30-second musical excerpts |
| 2018 | PMEmo | | 794 45-sec song excerpts annotated with valence and arousal |
| 2018 | Deezer | | 18000 songs with synthetic valence and arousal (no audio) |
| 2016 | DEAM | | 1802 30-sec song excerpts annotated with valence and arousal |
| 2015 | AMG1608 | | 1608 30-sec excerpts annotated with valence and arousal |
| 2014 | Emotify | | 400 60-sec excerpts with 9 emotion annotations |
| 2013 | EMO-Music | | 744 45-sec excerpts annotated |
| 2011 | DEAP | | 120 60-sec excerpts of music videos annotated with arousal, valence, and dominance |
| 2011 | RAVDESS | | 2880 speech, 2024 song with emotional validity, intensity, and genuineness |

Genre

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2025 | XMIDI | | 100K MIDI files (5K hours) with emotion and genre labels |
| 2022 | ComMU | | professionally-composed MIDI sequences with metadata (including genre) |
| 2021 | MTG-Jamendo | | 55k songs annotated with 87 genres |
| 2019 | AcousticBrainz | | audio features for 2M+ songs with genre annotations |
| 2017 | FMA | | 106,574 tracks annotated with 161 genres |
| 2011 | MSD | | extracted features for 1M songs (no audio) |

Form/Structure/Analysis

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2024 | SynTheory | | synthetic audio/MIDI music theory dataset |
| 2020 | Bach Chorales Figured Bass | | 139 J.S. Bach chorales with figured bass (MusicXML) |
| 2019 | Harmonix | | beat and structure annotations for 912 Western pop songs (audio in mel spectrograms) |
| 2011 | SALAMI | | 2400 structure annotations for music of various genres (1400 recordings, must fetch manually) |
| 2009 | Isophonics | | beat/meter/key/chord/segmentation annotations for various popular artists (audio not included) |

Tags

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2013 | MagnaTagATune | | Tags for 31k+ audio clips |
| 2011 | AudioSet | | 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos |

Style

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2018 | Cross-Era Dataset | | Metadata for 2000 classical pieces balanced by era |

Playlist

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2020 | Spotify 1M Playlist | | 1M playlists with title and track list |

Captioning

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2024 | MidiCaps | | 168k MIDI files with captions (chords, time sig., genre, mood, etc.) |
| 2023 | MusicCaps | | 5521 examples with text captions |
| 2023 | MuLaMCap | | 400k music-text pseudolabels (not public) |
| 2023 | WavCaps | | 400k audio clips with paired captions |
| 2023 | Song Describer Dataset | | 1.1k captions for 706 recordings |
| 2023 | LP-MusicCaps | | 2M+ pseudo-captions and tags sourced from MusicCaps, Million Song Dataset, MagnaTagATune |
| 2023 | Brain2Music | | 540 captions of 15s clips sourced from GTZAN |
| 2019 | AudioCaps | | 51k captions of AudioSet sounds |

Multimodal

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2021 | TAU Urban Audio-Visual Scenes 2021 | | 10-second audio + visual acoustic scenes |
| 2021 | PHENICX-SMM | | text/audio/images related to classical music (composers, performers, ensembles, etc.) |
| 2018 | MSMD | | synthetic dataset of 497 pieces of (classical) music that contains both audio and score representations |
| 2016 | Sub-URMP | | multimodal music analysis dataset of images/audio from music performance videos |

Non-Speech Single Label

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2024 | TT Sounds | | 3k+ table tennis racket-ball sounds for classification |
| 2022 | Sound Events Database | | actions performed on a variety of objects (5 exemplars per event) |
| 2016 | Acoustic Event Dataset | | 28-class dataset of 5k+ acoustic events (acoustic guitar, violin, bird, crowd, etc.) scraped from Freesound |

Misc

| Year | Name | Data | Description |
| --- | --- | --- | --- |
| 2024 | BSD10k | | organizes 10k Freesound audio clips (23 classes) into a taxonomy |
| 2021 | MetaMIDI | | 400k+ MIDI files with metadata matched against Spotify audio clips |
| 2020 | SignalTrain | | recordings of sounds fed through the Teletronix LA-2A opto-electronic compressor |
| 2011 | Wikifonia | N/A | 6,675 lead sheets in MusicXML format |