1. Northix: A schema-matching benchmark problem for data integration of two entity-relationship databases.
2. Urban Land Cover: Classification of urban land cover using high-resolution aerial imagery. Intended to assist sustainable urban planning efforts.
3. NoisyOffice: Corpus for training supervised learning methods to clean (or binarize) and enhance noisy grayscale images of printed text. Noisy images and their corresponding ground truth are provided.
4. Low Resolution Spectrometer: From IRAS data -- NASA Ames Research Center
5. Hill-Valley: Each record represents 100 points on a two-dimensional graph. When plotted in order (from 1 through 100) as the Y co-ordinate, the points will create either a Hill (a “bump” in the terrain) or a Valley (a “dip” in the terrain).
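
To make the Hill-Valley record layout concrete, here is a minimal sketch in Python. It does not read the actual dataset: `make_record` builds a synthetic 100-value record with a Gaussian bump or dip, and `guess_shape` is a naive heuristic. Both function names, the Gaussian shape, and all parameters are assumptions made for illustration. They are not part of the dataset or its documentation.

```python
import math

def make_record(kind, n=100, center=50, width=8.0, height=1.0):
    """Build a synthetic record (hypothetical): n Y-values indexed 1..n,
    with a Gaussian bump for "hill" or a Gaussian dip for "valley"."""
    sign = 1.0 if kind == "hill" else -1.0
    return [
        sign * height * math.exp(-((i - center) ** 2) / (2 * width ** 2))
        for i in range(1, n + 1)
    ]

def guess_shape(ys):
    """Naive heuristic: a hill's peak lies farther above the median than
    its lowest point lies below it; a valley is the reverse."""
    s = sorted(ys)
    mid = len(s) // 2
    median = (s[mid - 1] + s[mid]) / 2 if len(s) % 2 == 0 else s[mid]
    return "hill" if max(ys) - median >= median - min(ys) else "valley"

print(guess_shape(make_record("hill")))
print(guess_shape(make_record("valley")))
```

This heuristic only separates clean, smooth records like the synthetic ones above. Real records may be noisy, in which case a trained classifier is the more realistic approach.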