1. Syskill and Webert Web Page Ratings: This database contains HTML source of web pages plus the ratings of a single user on these web pages. Web pages are on four seperate subjects (Bands- recording artists; Goats; Sheep; and BioMedical)
2. Reuters-21578 Text Categorization Collection: This is a collection of documents that appeared on Reuters newswire in 1987. The documents were assembled and indexed with categories.
3. MSNBC.com Anonymous Web Data: This data describes the page visits of users who visited msnbc.com on September 28, 1999. Visits are recorded at the level of URL category (see description) and are recorded in time order.
4. Molecular Biology (Protein Secondary Structure): From CMU connectionist bench repository; Classifies secondary structure of certain globular proteins
5. Kinship: Relational dataset
6. HIV-1 protease cleavage: The data contains lists of octamers (8 amino acids) and a flag (-1 or 1) depending on whether HIV-1 protease will cleave in the central position (between amino acids 4 and 5).
7. Entree Chicago Recommendation Data: This data contains a record of user interactions with the Entree Chicago restaurant recommendation system.
8. Audiology (Original): Nominal audiology dataset from Baylor