1. Opinosis Opinion / Review: This dataset contains sentences extracted from user reviews on a given topic. Example topics are “performance of Toyota Camry” and “sound quality of iPod Nano”.
2. Balloons: Data previously used in a cognitive psychology experiment; the 4 data sets represent different conditions of the experiment
3. Lenses: Database for fitting contact lenses
4. Challenger USA Space Shuttle O-Ring: Task: predict the number of O-rings that experience thermal distress on a flight at 31 degrees F given data on the previous 23 shuttle flights
5. Shuttle Landing Control: Tiny database; all nominal values
6. Post-Operative Patient: Dataset of patient features
7. Labor Relations: From Collective Bargaining Review
8. Trains: 2 data formats (structured, one-instance-per-line)
9. Predict keywords activities in a online social media: The data were collected from Twitter over 360 consecutive days by querying 1497 English keywords sampled from Wikipedia. This dataset is proposed in a learning-to-rank setting.
10. Soybean (Small): Michalski's famous soybean disease database
11. Sponge: Data on sponges; attributes in Spanish
12. Lung Cancer: Lung cancer data; no attribute definitions
13. DBWorld e-mails: It contains 64 e-mails manually collected from the DBWorld mailing list. They are classified into two categories: 'announces of conferences' and 'everything else'.