1. Balloons: Data previously used in a cognitive psychology experiment; the 4 data sets represent different conditions of the experiment
2. Sponge: Data on sponges; attributes in Spanish
3. Soybean (Small): Michalski's famous soybean disease database
4. Labor Relations: From Collective Bargaining Review
5. Shuttle Landing Control: Tiny database; all nominal values
6. Lenses: Database for fitting contact lenses
7. Lung Cancer: Lung cancer data; no attribute definitions
8. Post-Operative Patient: Dataset of patient features
9. Challenger USA Space Shuttle O-Ring: Task: predict the number of O-rings that experience thermal distress on a flight at 31 degrees F given data on the previous 23 shuttle flights
10. Trains: 2 data formats (structured, one-instance-per-line)
11. Opinosis Opinion / Review: This dataset contains sentences extracted from user reviews on a given topic. Example topics are “performance of Toyota Camry” and “sound quality of ipod nano”.
12. DBWorld e-mails: Contains 64 e-mails that I manually collected from the DBWorld mailing list. They are classified as 'announces of conferences' or 'everything else'.