MNIST Database of Handwritten Digits
Linked on 10/16/2021
Well-known database of 70,000 handwritten digits (10 class labels) with each example represented as an image of 28 x 28 gray-scale pixels.
Dataset Characteristics
Image
Subject Area
Other
Associated Tasks
Classification
Feature Type
Real
# Instances
70000
# Features
-
Dataset Information
For what purpose was the dataset created?
As a testbed for development of handwriting recognition algorithms and machine learning classification algorithms in general.
Who funded the creation of the dataset?
Originally the US National Institute of Standards and Technology (NIST), and later AT&T Bell Labs.
What do the instances in this dataset represent?
28 x 28 gray-scale, centered images of handwritten digits.
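As a rough illustration of the 28 x 28 layout, the sketch below reshapes one example into a 2-D array. It assumes each example arrives as a flat, row-major vector of 784 pixel values, which this page does not state; the helper name to_image is hypothetical, and the data is a placeholder rather than real MNIST pixels.

```python
import numpy as np

IMAGE_SIDE = 28  # images are 28 x 28 pixels, per the dataset description


def to_image(flat_pixels):
    """Reshape a flat vector of 784 pixel values into a 28 x 28 array.

    Assumption: the pixels are stored in row-major order.
    """
    flat = np.asarray(flat_pixels)
    expected = IMAGE_SIDE * IMAGE_SIDE
    if flat.size != expected:
        raise ValueError(f"expected {expected} values, got {flat.size}")
    return flat.reshape(IMAGE_SIDE, IMAGE_SIDE)


# Placeholder data standing in for one example (not real MNIST pixels).
example = np.zeros(IMAGE_SIDE * IMAGE_SIDE, dtype=np.uint8)
image = to_image(example)
print(image.shape)
```

If the data is stored in a different layout, only the reshape call would need to change.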
Are there recommended data splits?
Yes. The recommended split is 60,000 examples for training and 10,000 for testing.
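The recommended split could be applied along the following lines. This is a minimal sketch that assumes the 70,000 rows are ordered with the 60,000 training examples first, followed by the 10,000 test examples; this page does not confirm that ordering, so check it against the original dataset website before relying on it. The function name split_train_test is hypothetical, and the arrays are placeholders.

```python
import numpy as np

N_TRAIN = 60_000  # recommended training set size
N_TEST = 10_000   # recommended test set size


def split_train_test(X, y):
    """Split features and labels into the recommended train/test parts.

    Assumption: rows are ordered training examples first, then test examples.
    """
    if len(X) != N_TRAIN + N_TEST or len(y) != len(X):
        raise ValueError("expected 70,000 aligned rows in X and y")
    return (X[:N_TRAIN], y[:N_TRAIN]), (X[N_TRAIN:], y[N_TRAIN:])


# Placeholder arrays standing in for the real features and labels.
X = np.arange(N_TRAIN + N_TEST).reshape(-1, 1)
y = np.zeros(N_TRAIN + N_TEST, dtype=int)

(X_train, y_train), (X_test, y_test) = split_train_test(X, y)
print(X_train.shape, X_test.shape)
```

Slicing by position keeps the split deterministic, which matters when comparing results against published benchmarks that use the same 60,000/10,000 partition.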
Was there any data preprocessing performed?
The original NIST data was preprocessed by Yann LeCun and colleagues at AT&T Bell Labs; see http://yann.lecun.com/exdb/mnist/ for details.
Has Missing Values?
No
Introductory Paper
Gradient-Based Learning Applied to Document Recognition
By Y. LeCun, L. Bottou, Y. Bengio, P. Haffner. 1998
Published in Proceedings of the IEEE
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
mnist_database_of_handwritten_digits = fetch_ucirepo(id=683)

# data (as pandas dataframes)
X = mnist_database_of_handwritten_digits.data.features
y = mnist_database_of_handwritten_digits.data.targets

# metadata
print(mnist_database_of_handwritten_digits.metadata)

# variable information
print(mnist_database_of_handwritten_digits.variables)
MNIST Database of Handwritten Digits [Dataset]. (1998). UCI Machine Learning Repository. https://doi.org/10.24432/C53K8Q.
Citations/Acknowledgements
If you use this dataset, please follow the acknowledgment policy on the original dataset website.