Center for Machine Learning and Intelligent Systems

NYSK Data Set

Abstract: NYSK (New York v. Strauss-Kahn) is a collection of English news articles about the case relating to allegations of sexual assault against the former IMF director Dominique Strauss-Kahn (May 2011).

Data Set Characteristics: Multivariate, Sequential, Text

Source:

- Aurélien Lauf (alu '@'
- Leila Khouas (lkh '@'
- Mohamed Dermouche (mde '@'

Data Set Information:

Documents are first obtained via a Web search using AMIEI, an integrated platform for delivering enterprise intelligence developed by AMI Software, with the following query: "dsk" OR "strauss-kahn" OR "strauss-khan".
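For readers who want to re-apply a comparable filter to their own text, here is a minimal Python sketch of the query as a disjunction of three terms. Only the terms come from the description above. The matching rule (case-insensitive, whole-term) and the names QUERY_TERMS and matches_query are assumptions made for illustration, and AMIEI's actual search semantics may differ.

```python
import re

# Terms taken from the query quoted above; the matching rule is an assumption.
QUERY_TERMS = ("dsk", "strauss-kahn", "strauss-khan")

# Case-insensitive alternation approximates the OR of the three terms.
# The lookarounds keep a term from matching inside a longer word.
_QUERY_RE = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(t) for t in QUERY_TERMS) + r")(?!\w)",
    re.IGNORECASE,
)


def matches_query(text):
    """Return True if the text contains at least one query term."""
    return _QUERY_RE.search(text) is not None
```

Note that the third term, "strauss-khan", is a common misspelling of the name, so the query presumably aims to catch articles that spell it either way.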

The NYSK dataset was used to extract topic-sentiment correlation and its evolution over time, but it may also be used for other text mining tasks such as topic extraction and sentiment analysis.

Attribute Information:

Documents are then filtered and presented in XML format. All XML fields are self-explanatory.
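Since the field names are not listed here, the following Python sketch reads the records without hard-coding any of them. It assumes a root element with one child per document, each holding flat child fields. The sample XML, its tag names (documents, document, title, text), and the function records_from_xml are hypothetical placeholders, not the actual NYSK schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: these tag names are placeholders, not the real NYSK schema.
SAMPLE_XML = """<documents>
  <document>
    <title>Example headline</title>
    <text>Example body text.</text>
  </document>
  <document>
    <title>Second headline</title>
    <text>More body text.</text>
  </document>
</documents>"""


def records_from_xml(xml_string):
    """Turn each child of the root into a dict mapping field tag to stripped text."""
    root = ET.fromstring(xml_string)
    records = []
    for doc in root:
        # Empty elements have text None, so fall back to an empty string.
        records.append({field.tag: (field.text or "").strip() for field in doc})
    return records


records = records_from_xml(SAMPLE_XML)
```

With the real file, inspecting the keys of the first record would reveal the available fields, provided the layout assumption above holds.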

Relevant Papers:

(1) Mohamed Dermouche, Julien Velcin, Leila Khouas, and Sabine Loudcher. A Joint Model for Topic-Sentiment Evolution over Time. In Proceedings of The IEEE 14th International Conference on Data Mining (ICDM’2014), pages 773–778, Shenzhen, China, 2014. IEEE Computer Society.

(2) Mohamed Dermouche, Leila Khouas, Julien Velcin, and Sabine Loudcher. A Joint Model for Topic-Sentiment Modeling from Text. In Proceedings of The 30th ACM/SIGAPP Symposium On Applied Computing (SAC’2015), pages 819–824, Salamanca, Spain, 2015. ACM.

Citation Request:

Please refer to the Machine Learning Repository's citation policy.
