Health News in Twitter

Donated on 2/18/2018

The data was collected in 2015 using the Twitter API. The dataset contains health news tweets from more than 15 major health news agencies, such as BBC, CNN, and NYT.

Dataset Characteristics

Text

Subject Area

Computer Science

Associated Tasks

Clustering

Feature Type

Real

# Instances

58000

# Features

25000

Dataset Information

Additional Information

Each file corresponds to the Twitter account of one news agency. For example, bbchealth.txt contains BBC health news. Each line has the form tweet id|date and time|tweet, with '|' as the field separator. This text data has been used to evaluate the performance of topic models on short text, but it can also be used for other tasks such as clustering.
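The line format described above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the names Tweet, parse_line, and read_tweets are hypothetical, the timestamp is kept as raw text because its format is not documented here, and the UTF-8 default with replacement of undecodable bytes is an assumption about the files.

```python
from typing import Iterator, NamedTuple


class Tweet(NamedTuple):
    tweet_id: str
    timestamp: str  # kept as raw text; the date format is not documented here
    text: str


def parse_line(line: str) -> Tweet:
    """Split one 'tweet id|date and time|tweet' record into its three fields."""
    # maxsplit=2 keeps any '|' inside the tweet text intact (assumption:
    # the separator may also occur in the text field).
    parts = line.rstrip("\r\n").split("|", 2)
    if len(parts) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {len(parts)}")
    return Tweet(*parts)


def read_tweets(path: str, encoding: str = "utf-8") -> Iterator[Tweet]:
    """Yield tweets from one account file, skipping blank lines."""
    # errors="replace" is a defensive assumption; the file encoding is not stated.
    with open(path, encoding=encoding, errors="replace") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)
```

Assuming the archive unpacks to a Health-Tweets directory, as the file list suggests, list(read_tweets("Health-Tweets/bbchealth.txt")) would return the BBC tweets as Tweet records. Splitting at most twice is a deliberate choice so that a tweet containing '|' is not broken into extra fields.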

Has Missing Values?

No

Dataset Files

File                                Size
Health-Tweets/goodhealth.txt        1.2 MB
Health-Tweets/nytimeshealth.txt     880.2 KB
Health-Tweets/cbchealth.txt         663.6 KB
Health-Tweets/cnnhealth.txt         637.8 KB
Health-Tweets/reuters_health.txt    633.9 KB

Showing the first 5 of 33 entries.

Download (3.3 MB)

Creators

Amir Karami
