Center for Machine Learning and Intelligent Systems

Document Understanding Data Set
Download: Data Folder, Data Set Description

Abstract: Five concepts, expressed as predicates, to be learned

Data Set Characteristics: N/A
Number of Instances: N/A
Area: N/A
Attribute Characteristics: N/A
Number of Attributes: N/A
Date Donated: 1994-11-01
Associated Tasks: N/A
Missing Values? No
Number of Web Hits: 25599


Source:

Owner:

Donato Malerba
Dipartimento di Informatica
University of Bari
via Orabona 4
70126 Bari - Italy
phone: +39 - 80 - 5443269
fax: +39 - 80 - 5443196
malerbad '@' vm.csata.it

Donor:

Donato Malerba


Data Set Information:

The experiments used 30 single-page documents, all copies of letters sent by Olivetti. Six trials were performed, each randomly selecting 20 documents for the training set and 10 for the test set. Each document is identified by a single letter (A to Z) or a pair of letters (AA, AB, AC, AD).

Trial  Training documents
1      A B C D E F G H I J K L M N O P Q R S T
2      C D E F G H I M P R S V X Y W Z AA AB AC AD
3      C D E F G H I J K P R S T U V Y W AA AB AC
4      A B C D E F G J L M N O P Q T V X Z AB AD
5      A B E F G I J K M N O P Q R T V X Z AA AD
6      A B C D E F G I J M Q S T X Y Z AA AB AC AD
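
The table lists only the training documents for each trial. The following Python sketch shows one way to recover the matching test sets. It assumes that the 10 test documents of a trial are exactly the documents not listed for training, which the 20/10 split of 30 documents suggests but the page does not state outright. The names ALL_DOCS, TRAINING and split_for_trial are illustrative and are not part of the data set.

```python
from string import ascii_uppercase

# All 30 document identifiers: A..Z plus AA, AB, AC, AD.
ALL_DOCS = list(ascii_uppercase) + ["AA", "AB", "AC", "AD"]

# Training documents per trial, copied from the table above.
TRAINING = {
    1: "A B C D E F G H I J K L M N O P Q R S T",
    2: "C D E F G H I M P R S V X Y W Z AA AB AC AD",
    3: "C D E F G H I J K P R S T U V Y W AA AB AC",
    4: "A B C D E F G J L M N O P Q T V X Z AB AD",
    5: "A B E F G I J K M N O P Q R T V X Z AA AD",
    6: "A B C D E F G I J M Q S T X Y Z AA AB AC AD",
}


def split_for_trial(trial):
    """Return (train, test) lists of document identifiers for one trial.

    Assumption: the test set is the complement of the training set
    within the 30 documents.
    """
    train = TRAINING[trial].split()
    train_set = set(train)
    test = [doc for doc in ALL_DOCS if doc not in train_set]
    return train, test


if __name__ == "__main__":
    for trial in sorted(TRAINING):
        train, test = split_for_trial(trial)
        print(trial, len(train), len(test), " ".join(test))
```

Under that assumption, trial 1 holds out documents U to Z together with AA, AB, AC and AD. The same check also confirms that every row of the table contains 20 distinct identifiers.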


Attribute Information:

N/A


Relevant Papers:

Malerba D. Document Understanding: A Machine Learning Approach. Technical Report, Esprit Project 5203 INTREPID, 4 March 1993.

Esposito F., Malerba D., Semeraro G., & Pazzani M. A Machine Learning Approach to Document Understanding. Proc. 2nd Int. Workshop on Multistrategy Learning, Harpers Ferry, WV, pp. 276-292, May 1993.

Esposito F., Malerba D., & Semeraro G. Learning Contextual Rules in First-Order Logic. Proc. 4th Italian Workshop on Machine Learning (GAA93), Milan, Italy, pp. 111-127, June 1993.

Esposito F., Malerba D., & Semeraro G. Automated Acquisition of Rules for Document Understanding. Proc. of the 2nd Int. Conf. on Document Analysis and Recognition, Tsukuba Science City, Japan, pp. 650-654, October 1993.

Semeraro G., Esposito F., & Malerba D. Learning Contextual Rules for Document Understanding. Proc. 10th IEEE Conf. on Artificial Intelligence for Applications. San Antonio, Texas, pp. 108-115, March 1994.

Esposito F., Malerba D., & Semeraro G. Multistrategy Learning for Document Recognition. Applied Artificial Intelligence, 8, pp. 33-84, 1994.



Citation Request:

Please refer to the Machine Learning Repository's citation policy.

