1. Northix: Northix is designed to be a schema matching benchmark problem for data integration of two entity relationship databases.
2. Robot Execution Failures: This dataset contains force and torque measurements on a robot after failure detection. Each failure is characterized by 15 force/torque samples collected at regular time intervals.
3. Syskill and Webert Web Page Ratings: This database contains the HTML source of web pages plus the ratings of a single user on these web pages. Web pages are on four separate subjects (Bands - recording artists; Goats; Sheep; and BioMedical).
4. Japanese Credit Screening: Includes domain theory (generated by talking to Japanese domain experts); data in Lisp.
5. Audiology (Original): Nominal audiology dataset from Baylor.
6. Heart Disease: 4 databases: Cleveland, Hungary, Switzerland, and the VA Long Beach.
7. Mechanical Analysis: Fault diagnosis problem for electromechanical devices; the PUMPS DATA SET is a newer version with domain theory and results.
8. Low Resolution Spectrometer: From IRAS data -- NASA Ames Research Center.
9. University: Data in original (LISP-readable) form.
10. Demospongiae: Classification domain for marine sponges of the class Demospongiae.
11. ILPD (Indian Liver Patient Dataset): This data set contains 10 variables: age, gender, total Bilirubin, direct Bilirubin, total proteins, albumin, A/G ratio, SGPT, SGOT, and Alkphos.
12. Fertility: 100 volunteers provide a semen sample analyzed according to the WHO 2010 criteria. Sperm concentration is related to socio-demographic data, environmental factors, health status, and life habits.