1. Roman Urdu Data Set: Roman Urdu (the scripting style for Urdu language) is one of the limited resource languages.A data corpus comprising of more than 20000 records was collected.
2. Online Retail II: A real online retail transaction data set of two years.
3. Guitar Chords finger positions: Position of the fingers for 2633 guitar chords in standard tuning (double checked with software)
4. Eco-hotel: This dataset includes Online Textual Reviews from both online (e.g., TripAdvisor) and offline (e.g., Guests' book) sources from the Areias do Seixo Eco-Resort.
5. Drug Review Dataset (Drugs.com): The dataset provides patient reviews on specific drugs along with related conditions and a 10 star patient rating reflecting overall patient satisfaction.
6. Drug Review Dataset (Druglib.com): The dataset provides patient reviews on specific drugs along with related conditions. Reviews and ratings are grouped into reports on the three aspects benefits, side effects and overall comment.
7. BuddyMove Data Set: User interest information extracted from user reviews published in holidayiq.com about various types of point of interests in South India
8. Badges: Badges labeled with a "+" or "-" as a function of a person's name