Center for Machine Learning and Intelligent Systems

Character Font Images Data Set
Download: Data Folder, Data Set Description

Abstract: Character images from scanned and computer generated fonts.

Data Set Characteristics: Multivariate
Number of Instances: 745000
Area: Computer
Attribute Characteristics: Integer, Real
Number of Attributes: 411
Date Donated: 2016-08-14
Associated Tasks: Classification
Missing Values? N/A
Number of Web Hits: 7309


Source:

Richard Lyman
459 Monterey Avenue
Los Gatos, California 95030
408 399 6303
richard.r.lyman '@' gmail.com


Data Set Information:

The data set consists of images from 153 character fonts. Some fonts were scanned from a variety of devices: hand scanners, desktop scanners, or cameras. Other fonts were computer generated. The .zip file contains comma-delimited .csv files, one per font. Each .csv file has a header row with the data set attribute names.
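As a rough illustration of the layout described above (one .csv per font inside a .zip, each with a header row), the following sketch reads one font file into a list of dictionaries using only the Python standard library. The helper names `list_font_files` and `read_font_rows`, the member name `EXAMPLE.csv`, the UTF-8 encoding, and the sample values are all assumptions made for the demonstration; the archive here is a small synthetic stand-in built in memory, not the real download.

```python
import csv
import io
import zipfile


def list_font_files(archive):
    """Return the names of the .csv members in an open ZipFile, sorted."""
    return sorted(n for n in archive.namelist() if n.lower().endswith(".csv"))


def read_font_rows(archive, member):
    """Read one font's CSV into a list of dicts keyed by the header row.

    UTF-8 is assumed here; the source page does not state the encoding.
    """
    with archive.open(member) as raw:
        text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
        return list(csv.DictReader(text))


# Synthetic stand-in for the real archive (hypothetical file name and values).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "EXAMPLE.csv",
        "font,fontVariant,m_label,strength,italic\n"
        "EXAMPLE,scanned,48,0.4,0\n",
    )

with zipfile.ZipFile(buffer) as archive:
    names = list_font_files(archive)
    rows = read_font_rows(archive, names[0])

print(names, rows[0]["m_label"])
```

With the real download, the same two helpers could be pointed at `zipfile.ZipFile(path_to_zip)`; `csv.DictReader` returns every value as a string, so numeric fields need explicit conversion.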

The Handprint images differ slightly from the standard MNIST dataset.


Attribute Information:

field        Type     Unique  Example            Description
font         string   153     ‘times’            Font family
fontVariant  string   248     ‘times new roman’  If the font image was from a scanner, the fontVariant is “scanned”; otherwise it is the font name
m_label      integer  11597   33 to 65535        The character value, for instance 48 for the digit ‘0’
strength     real     2       .4                 A value from 0 to 1, indicating normal or bold
italic       integer  2       1                  A flag; if 1, the image was computer generated with an italic font
m_top        integer          13                 The topmost black pixel row index in the original image from which the image was cut
m_left       integer          43                 The leftmost black pixel column index in the original image from which the image was cut
originalH    integer          30                 The original height of the image in pixels
originalW    integer          36                 The original width of the image in pixels
h            integer  1       20                 The image height in this sample, always 20
w            integer  1       20                 The image width in this sample, always 20
r0c0         integer          0                  Row 0, Column 0 pixel value, 0 to 255; 0 is white, 255 is black
r0c1         integer          255                Row 0, Column 1 pixel value, 0 to 255
…            integer          0                  The 397 pixel values in between, 0 to 255
r19c19       integer          255                Row 19, Column 19 pixel value, 0 to 255
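
The pixel fields above follow an r<row>c<column> naming pattern over a 20 by 20 grid, so one record can be turned back into an image by walking those names in row order. The sketch below is one way to do that with plain Python lists; the function names (`pixel_columns`, `row_to_image`, `label_to_char`) and the synthetic record are assumptions for illustration, and it presumes the values arrive as integer strings, as the table suggests.

```python
SIDE = 20  # h and w are always 20 according to the attribute table


def pixel_columns(side=SIDE):
    """Build the pixel field names r0c0 ... r19c19 in row-major order."""
    return [f"r{r}c{c}" for r in range(side) for c in range(side)]


def row_to_image(row, side=SIDE):
    """Convert one record (a dict of strings) into a side x side grid of ints."""
    return [[int(row[f"r{r}c{c}"]) for c in range(side)] for r in range(side)]


def label_to_char(row):
    """Map m_label (a character code) to the character it denotes."""
    return chr(int(row["m_label"]))


# Synthetic record: all-white image with a single black pixel at row 0, column 1.
record = {name: "0" for name in pixel_columns()}
record["r0c1"] = "255"
record["m_label"] = "48"

image = row_to_image(record)
print(len(image), len(image[0]), image[0][1], label_to_char(record))
```

Because 0 is white and 255 is black in this data set, the values may need inverting (255 minus the value) before display in tools that treat 0 as black.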


Relevant Papers:

A utility Python program for accessing this data set, script files, and approximately 40 machine learning programs adapted from the book, “Python Machine Learning” by Sebastian Raschka, can be found at: [Web Link]



Citation Request:

Please refer to the Machine Learning Repository's citation policy

