NYSK

Donated on 10/10/2013

NYSK (New York v. Strauss-Kahn) is a collection of English news articles about the case relating to allegations of sexual assault against the former IMF director Dominique Strauss-Kahn (May 2011).

Dataset Characteristics

Multivariate, Sequential, Text

Subject Area

Social Science

Associated Tasks

Clustering

Feature Type

# Instances

10421

# Features

Dataset Information

Additional Information

Documents are first obtained via a Web search using AMIEI, an integrated platform for delivering enterprise intelligence developed by AMI Software (http://www.amisw.com/en), with the following query: "dsk" OR "strauss-kahn" OR "strauss-khan". The NYSK dataset was used to extract topic-sentiment correlation and its evolution over time, but it may also be used for other text mining tasks such as topic extraction and sentiment analysis.

Has Missing Values?

Variable Information

Documents are then filtered and presented in XML format. All XML fields are self-explanatory.

Dataset Files

File	Size
nysk.xml	52.3 MB
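
Because the field names are not listed on this page, a first step when working with nysk.xml might be to discover which XML elements the file actually contains. The following is a minimal sketch, not part of the original dataset documentation: it streams the file with Python's standard library and counts element tags, so it makes no assumption about the schema. The function name `count_tags` and the local path `nysk.xml` are illustrative assumptions.

```python
import os
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(source):
    """Stream an XML file (path or binary file object) and count each element tag."""
    counts = Counter()
    # iterparse yields each element once its closing tag has been read,
    # so the whole file never has to be parsed into memory up front.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        elem.clear()  # drop the element's children and text to limit memory use
    return counts


# Assumed local path to the downloaded file; adjust as needed.
PATH = "nysk.xml"
if os.path.exists(PATH):
    for tag, n in count_tags(PATH).most_common():
        print(tag, n)
```

The tag counts give a quick picture of the document structure (for example, which element repeats once per article) before writing any field-specific extraction code.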

Creators

Aurélien Lauf

Leila Khouas

Mohamed Dermouche

DOI

10.24432/C56C8K

License

This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.