1. Bike Sharing Dataset: This dataset contains the hourly and daily counts of rental bikes in the Capital Bikeshare system for the years 2011 and 2012, with the corresponding weather and seasonal information.
2. GitHub MUSAE: A social network of GitHub users with user-level attributes, connectivity data and a binary target variable.
3. Multimodal Damage Identification for Humanitarian Computing: 5879 captioned images (image and text) from social media related to damage during natural disasters or wars, belonging to 6 classes: Fires, Floods, Natural landscape, Infrastructural, Human, Non-damage.
4. NYSK: NYSK (New York v. Strauss-Kahn) is a collection of English news articles about the case relating to allegations of sexual assault against the former IMF director Dominique Strauss-Kahn (May 2011).
5. Twitter Data set for Arabic Sentiment Analysis: The problem of Sentiment Analysis (SA) has been well studied for English but not for Arabic. Two main approaches have been devised: corpus-based and lexicon-based.