1. Reuter_50_50: This dataset is used for authorship identification in online Writeprint, a new research field within pattern recognition. 2. YouTube Comedy Slam Preference Data: This dataset provides user votes, collected from YouTube Comedy Slam, on which video in a pair is funnier. The task is to automatically predict this preference from video metadata. 3. SMS Spam Collection: The SMS Spam Collection is a public set of labeled SMS messages collected for mobile phone spam research. |