FMA: A Dataset For Music Analysis

Donated on 5/23/2017

FMA contains 106,574 tracks. Each track comes with metadata (song title, album, artist, genres), usage data (play counts, favorites, comments), and free-form text (description, biography, tags), together with the audio itself (343 days, 917 GiB) and precomputed audio features.

Dataset Characteristics

Multivariate, Time-Series

Subject Area

Computer Science

Associated Tasks

Classification, Clustering

Feature Type

Real

# Instances

106574

# Features

518

Dataset Information

Additional Information

* Audio (encoded as mp3) for each of the 106,574 tracks, averaging about 10 million samples per track.
* Nine audio features (518 attributes in total) for each of the 106,574 tracks.
* Given the metadata, multiple problems can be explored: recommendation, genre recognition, artist identification, year prediction, music annotation, unsupervised categorization.
* The dataset is split into four sizes: small, medium, large, full.
* Please see the paper and the GitHub repository for more information (https://github.com/mdeff/fma).
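As a rough sanity check, the totals quoted in the summary (106,574 tracks, 343 days, 917 GiB) can be turned into per-track averages. The sketch below is plain arithmetic on those figures. The 44.1 kHz sampling rate is an assumption, not something stated on this page; under it, the average comes out at roughly 12 million samples per channel, the same order of magnitude as the "10 million samples" quoted above.

```python
# Figures quoted in the dataset description.
N_TRACKS = 106_574
TOTAL_DAYS = 343   # total audio duration
TOTAL_GIB = 917    # total size of the mp3 audio

# Assumption (not stated in the description): audio sampled at 44.1 kHz.
ASSUMED_SAMPLE_RATE_HZ = 44_100

total_seconds = TOTAL_DAYS * 24 * 60 * 60
total_bytes = TOTAL_GIB * 2**30

# Per-track averages implied by the totals.
avg_seconds = total_seconds / N_TRACKS
avg_mib = total_bytes / N_TRACKS / 2**20

# Average encoded bitrate over the whole collection, in kbit/s.
avg_kbit_per_s = total_bytes * 8 / total_seconds / 1000

# Samples per track and per channel, under the assumed sampling rate.
avg_samples = avg_seconds * ASSUMED_SAMPLE_RATE_HZ

print(f"average track length : {avg_seconds:.0f} s")
print(f"average file size    : {avg_mib:.1f} MiB")
print(f"average bitrate      : {avg_kbit_per_s:.0f} kbit/s")
print(f"average samples/track: {avg_samples / 1e6:.1f} million (assumed 44.1 kHz)")
```

These are collection-wide averages only; individual tracks, and the clipped excerpts in the smaller subsets, may differ substantially.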

Has Missing Values?

No

Variables Table

Variable Name | Role | Type | Description | Units | Missing Values

(10 of 518 variables listed; the only populated column is Missing Values, which reads "no" in every row.)

Additional Variable Information

Nine audio features are computed across time and summarized with seven statistics (mean, standard deviation, skew, kurtosis, median, minimum, maximum):

1. Chroma, 84 attributes for each of three variants (STFT, CQT, CENS), 252 in total
2. Tonnetz, 42 attributes
3. Mel Frequency Cepstral Coefficient (MFCC), 140 attributes
4. Spectral centroid, 7 attributes
5. Spectral bandwidth, 7 attributes
6. Spectral contrast, 49 attributes
7. Spectral rolloff, 7 attributes
8. Root Mean Square energy, 7 attributes
9. Zero-crossing rate, 7 attributes
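Each attribute count above is a per-frame dimensionality multiplied by the seven summary statistics. The sketch below reproduces that arithmetic. The dimensionalities (12 chroma bins, 6 tonnetz dimensions, 20 MFCCs, 7 spectral contrast bands, 1 for the scalar features) are assumptions chosen to match the listed counts, and the three chroma variants are likewise an assumption based on the feature files in the GitHub repository. Counting chroma once gives 350 attributes; counting it in three variants gives the 518 quoted for the dataset.

```python
# The seven summary statistics named in the description.
STATISTICS = ["mean", "std", "skew", "kurtosis", "median", "min", "max"]

# Assumed per-frame dimensionality of each feature (hypothetical names,
# values chosen to be consistent with the attribute counts listed above).
ASSUMED_DIMS = {
    "chroma": 12,
    "tonnetz": 6,
    "mfcc": 20,
    "spectral_centroid": 1,
    "spectral_bandwidth": 1,
    "spectral_contrast": 7,
    "spectral_rolloff": 1,
    "rms_energy": 1,
    "zero_crossing_rate": 1,
}

# Assumption: chroma is computed in three variants; everything else once.
ASSUMED_VARIANTS = {"chroma": 3}

# Attributes per feature = dimensionality x number of statistics.
per_feature = {name: dims * len(STATISTICS) for name, dims in ASSUMED_DIMS.items()}

# Total when every feature is counted once, as in the list above.
listed_total = sum(per_feature.values())

# Total when the assumed variants are counted separately.
total_with_variants = sum(
    count * ASSUMED_VARIANTS.get(name, 1) for name, count in per_feature.items()
)

for name, count in per_feature.items():
    print(f"{name:20s} {count:4d}")
print("counted once        ", listed_total)
print("with chroma variants", total_with_variants)
```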

Dataset Files

File | Size
fma.txt | 201 Bytes

Creators

Michaël Defferrard

Kirell Benzi

Pierre Vandergheynst

Xavier Bresson
