1. Northix: A schema-matching benchmark problem for integrating data from two entity-relationship databases.
2. Robot Execution Failures: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque samples collected at regular time intervals.
3. Syskill and Webert Web Page Ratings: This database contains the HTML source of web pages plus the ratings of a single user on these web pages. The web pages cover four separate subjects (Bands: recording artists; Goats; Sheep; and BioMedical).
4. Japanese Credit Screening: Includes domain theory (generated by talking to Japanese domain experts); data in Lisp
5. Audiology (Original): Nominal audiology dataset from Baylor
6. Heart Disease: 4 databases: Cleveland, Hungary, Switzerland, and the VA Long Beach
7. Mechanical Analysis: Fault diagnosis problem of electromechanical devices; the PUMPS DATA SET is a newer version with domain theory and results
8. Low Resolution Spectrometer: From IRAS data -- NASA Ames Research Center
9. University: Data in original (LISP-readable) form
10. Demospongiae: Classification domain for marine sponges of the class Demospongiae.
11. ILPD (Indian Liver Patient Dataset): This data set contains 10 variables: age, gender, total Bilirubin, direct Bilirubin, total proteins, albumin, A/G ratio, SGPT, SGOT and Alkphos.
12. Fertility: 100 volunteers provided a semen sample analyzed according to the WHO 2010 criteria. Sperm concentration is related to socio-demographic data, environmental factors, health status, and life habits.