Character Font Images
Donated on 8/13/2016
Character images from scanned and computer generated fonts.
Dataset Characteristics
Multivariate
Subject Area
Computer Science
Associated Tasks
Classification
Feature Type
Integer, Real
# Instances
745000
# Features
-
Dataset Information
Additional Information
The data set consists of images from 153 character fonts. Some fonts were scanned from a variety of devices: hand scanners, desktop scanners, or cameras. Other fonts were computer generated. The .zip file contains one comma-delimited .csv file per font. Each .csv file has a header row with the data set attribute names. The Handprint images differ slightly from the standard MNIST dataset.
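As a minimal sketch of reading one font file directly from the archive, the helper below uses only the Python standard library. The function name `read_font_rows` is illustrative, and the UTF-8 encoding is an assumption; the description above only states that each .csv is comma delimited and begins with a header row.

```python
import csv
import io
import zipfile


def read_font_rows(zip_path, member, limit=None):
    """Yield rows of one font's .csv as dicts keyed by the header row.

    zip_path and member are supplied by the caller; nothing here assumes
    a particular archive layout beyond "one .csv per font".
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            # csv needs text, so wrap the binary stream. UTF-8 is an
            # assumption; newline="" leaves line endings to the csv module.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            for i, row in enumerate(csv.DictReader(text)):
                if limit is not None and i >= limit:
                    break
                yield row
```

Because the helper is a generator, a large file such as OCRB.csv (117.1 MB) can be sampled with `limit` without loading every row into memory. All values arrive as strings and need converting (for example with `int`) before numeric use.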
Has Missing Values?
No
Variables Table
The repository lists 411 variables, none with missing values. The fields are described under Additional Variable Information.
Additional Variable Information
Field | Type | Unique | Example | Description |
---|---|---|---|---|
font | string | 153 | ‘times’ | Font family |
fontVariant | string | 248 | ‘times new roman’ | If the font image was from a scanner, the fontVariant is “scanned”; otherwise it is the font name. |
m_label | integer | 11597 | 33 to 65535 | The character value, for instance 48 for the digit ‘0’ |
strength | real | 2 | .4 | A value from 0 to 1, indicating normal or bold |
italic | integer | 2 | 1 | A flag; if 1, the image was computer generated with an italic font. |
m_top | integer | | 13 | The topmost black pixel row index in the original image from which the image was cut |
m_left | integer | | 43 | The leftmost black pixel column index in the original image from which the image was cut |
originalH | integer | | 30 | The original height of the image in pixels |
originalW | integer | | 36 | The original width of the image in pixels |
h | integer | 1 | 20 | The image height in this sample, always 20 |
w | integer | 1 | 20 | The image width in this sample, always 20 |
r0c0 | integer | | 0 | Row 0, Column 0 pixel value, 0 to 255; 0 is white, 255 is black |
r0c1 | integer | | 255 | Row 0, Column 1 pixel value, 0 to 255 |
… | integer | | 0 | The 397 remaining pixel values, 0 to 255 |
r19c19 | integer | | 255 | Row 19, Column 19 pixel value, 0 to 255 |
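To make the pixel layout concrete, the following sketch turns one row into a 20 by 20 grid and decodes `m_label` into its character. It assumes a row is a mapping from the field names above to integer-formatted values (strings or ints), and that `m_label` can be treated as a Unicode code point, which is consistent with the example of 48 for ‘0’ but is not stated outright. The names `SIDE`, `row_to_image`, `label_to_char`, and `render_ascii` are illustrative.

```python
SIDE = 20  # h and w are always 20 according to the field descriptions


def row_to_image(row):
    """Return a SIDE x SIDE list of lists of ints from the r{row}c{col} fields."""
    return [[int(row[f"r{r}c{c}"]) for c in range(SIDE)] for r in range(SIDE)]


def label_to_char(row):
    """Decode m_label (assumed to be a code point, e.g. 48) into a string."""
    return chr(int(row["m_label"]))


def render_ascii(image, threshold=128):
    """Crude text preview: '#' for dark pixels (255 is black), '.' otherwise."""
    return "\n".join(
        "".join("#" if px >= threshold else "." for px in line) for line in image
    )
```

Printing `render_ascii(row_to_image(row))` next to `label_to_char(row)` gives a quick visual check that a sample matches its label. The threshold of 128 is an arbitrary midpoint, not a value taken from the dataset description.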
Dataset Files
File | Size |
---|---|
OCRB.csv | 117.1 MB |
SEGOE.csv | 95.3 MB |
HANDPRINT.csv | 80.8 MB |
OCRA.csv | 77.8 MB |
CREDITCARD.csv | 42.1 MB |
Showing 5 of 153 files.
To install the ucimlrepo package:

    pip install ucimlrepo

To fetch the dataset in Python:

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    character_font_images = fetch_ucirepo(id=417)

    # data (as pandas dataframes)
    X = character_font_images.data.features
    y = character_font_images.data.targets

    # metadata
    print(character_font_images.metadata)

    # variable information
    print(character_font_images.variables)

Lyman, R. (2016). Character Font Images [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5X61Q.
Creators
Richard Lyman
DOI
10.24432/C5X61Q
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.