1. Amazon Commerce reviews set: Used for authorship identification in online Writeprint, a new research field of pattern recognition.
2. Greenhouse Gas Observing Network: Design an observing network to monitor emissions of a greenhouse gas (GHG) in California given time series of synthetic observations and tracers from weather model simulations.
3. Low Resolution Spectrometer: From IRAS data -- NASA Ames Research Center
4. Musk (Version 1): The goal is to learn to predict whether new molecules will be musks or non-musks.
5. Musk (Version 2): The goal is to learn to predict whether new molecules will be musks or non-musks.
6. QSAR androgen receptor: 1024 binary attributes (molecular fingerprints) used to classify 1687 chemicals into 2 classes (binder to androgen receptor/positive, non-binder to androgen receptor/negative).
7. QSAR oral toxicity: Data set containing values for 1024 binary attributes (molecular fingerprints) used to classify 8992 chemicals into 2 classes (very toxic/positive, not very toxic/negative)
8. Urban Land Cover: Classification of urban land cover using high resolution aerial imagery. Intended to assist sustainable urban planning efforts.
9. Weight Lifting Exercises monitored with Inertial Measurement Units: Six young healthy subjects were asked to perform 5 variations of the biceps curl weight lifting exercise. One of the variations is the one predicted by the health professional.