Semeion Handwritten Digit
Donated on 11/10/2008
1593 handwritten digits from around 80 persons were scanned, stretched in a rectangular box 16x16 in a gray scale of 256 values.
Dataset Characteristics
Multivariate
Subject Area
Computer Science
Associated Tasks
Classification
Feature Type
Integer
# Instances
1593
# Features
-
Dataset Information
Additional Information
1593 handwritten digits from around 80 persons were scanned, stretched in a rectangular box 16x16 in a gray scale of 256 values.Then each pixel of each image was scaled into a bolean (1/0) value using a fixed threshold. Each person wrote on a paper all the digits from 0 to 9, twice. The commitment was to write the digit the first time in the normal way (trying to write each digit accurately) and the second time in a fast way (with no accuracy). The best validation protocol for this dataset seems to be a 5x2CV, 50% Tune (Train +Test) and completly blind 50% Validation
Has Missing Values?
No
Variables Table
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
no | |||||
no | |||||
no | |||||
no | |||||
no | |||||
no | |||||
no | |||||
no | |||||
no | |||||
no |
0 to 10 of 256
Additional Variable Information
This dataset consists of 1593 records (rows) and 256 attributes (columns). Each record represents a handwritten digit, orginally scanned with a resolution of 256 grays scale (28). Each pixel of the each original scanned image was first stretched, and after scaled between 0 and 1 (setting to 0 every pixel whose value was under tha value 127 of the grey scale (127 included) and setting to 1 each pixel whose orinal value in the grey scale was over 127). Finally, each binary image was scaled again into a 16x16 square box (the final 256 binary attributes).
Dataset Files
File | Size |
---|---|
semeion.data | 2.8 MB |
semeion.names | 2.5 KB |
Reviews
There are no reviews for this dataset yet.
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo # fetch dataset semeion_handwritten_digit = fetch_ucirepo(id=178) # data (as pandas dataframes) X = semeion_handwritten_digit.data.features y = semeion_handwritten_digit.data.targets # metadata print(semeion_handwritten_digit.metadata) # variable information print(semeion_handwritten_digit.variables)
Semeion Handwritten Digit [Dataset]. (1998). UCI Machine Learning Repository. https://doi.org/10.24432/C5SC8V.
DOI
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.