Influenza Outbreak Event Prediction via Twitter
Donated on 8/14/2023
The goal is to forecast the spatiotemporal patterns of influenza outbreaks across locations and dates by identifying influenza-related tweets.
Dataset Characteristics
Multivariate
Subject Area
Health and Medicine
Associated Tasks
Classification
Feature Type
Real, Integer
# Instances
75839
# Features
523
Dataset Information
Additional Information
The data is from the United States and covers different states across different weeks. For each week, the task is to predict whether an influenza outbreak occurs in the following week. Influenza activity is graded on four levels, from minimal to high, according to the CDC Flu Activity Map. An outbreak is indicated if the activity level is high.
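The labeling rule described above can be sketched as follows. This is a minimal illustration, not the dataset authors' code: the text names only the "minimal" and "high" levels, so the intermediate names "low" and "moderate" and the helper `outbreak_label` are assumptions made for this example.

```python
# Assumed names for the four activity levels, lowest to highest.
# Only "minimal" and "high" are named in the dataset description.
ACTIVITY_LEVELS = ("minimal", "low", "moderate", "high")


def outbreak_label(activity_level: str) -> int:
    """Return 1 if the activity level indicates an outbreak (high), else 0."""
    level = activity_level.strip().lower()
    if level not in ACTIVITY_LEVELS:
        raise ValueError(f"unknown activity level: {activity_level!r}")
    return 1 if level == "high" else 0


# Toy sequence of weekly activity levels for one state (made-up values).
weekly_levels = ["minimal", "low", "high", "moderate"]
labels = [outbreak_label(level) for level in weekly_levels]
print(labels)
```

Only the highest level maps to 1, so a "moderate" week is treated as a non-outbreak week under this rule.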
Has Missing Values?
No
Introductory Paper
By Liang Zhao, Jiangzhuo Chen, F. Chen, W. Wang, Chang-Tien Lu, Naren Ramakrishnan. 2015
Published in 2015 IEEE International Conference on Data Mining
Variable Information
The input of the prediction task is the set of keyword counts over all the tweets in a state in a given week. The output is the occurrence of an influenza outbreak in that state in the next week: zero if there is no event in the next week, and one otherwise. The variables are, briefly:
- 'flu_locations': a list of states.
- 'flu_keywords': the keyword list.
- 'flu_X_*': input data for all the locations and all the weeks.
- 'flu_Y_*': output data for all the locations and all the weeks.
The variable 'flu_keywords' in the data specifies 525 keywords.
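To make the input/output relationship concrete, here is a small sketch on toy data. It assumes simple whitespace tokenization for keyword matching, which may differ from how the dataset was actually built, and the helper names `keyword_counts` and `next_week_pairs` are hypothetical. The 'flu_X_*' and 'flu_Y_*' arrays in the file presumably already encode this pairing; the code only illustrates the idea.

```python
from collections import Counter


def keyword_counts(tweets, keywords):
    """Count how often each keyword appears in one week's tweets.

    Uses lowercase whitespace tokenization (an assumption for this sketch).
    """
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet.lower().split())
    return [counts[keyword] for keyword in keywords]


def next_week_pairs(weekly_features, weekly_outbreak):
    """Pair the features of week t with the outbreak indicator of week t + 1."""
    if len(weekly_features) != len(weekly_outbreak):
        raise ValueError("features and labels must cover the same weeks")
    return list(zip(weekly_features[:-1], weekly_outbreak[1:]))


# Toy data for a single state (all values are made up).
keywords = ["flu", "fever", "cough"]
tweets_by_week = [
    ["I have the flu", "flu season again"],
    ["fever and cough all night", "bad cough"],
    ["feeling fine"],
]
outbreak_by_week = [0, 0, 1]

features = [keyword_counts(tweets, keywords) for tweets in tweets_by_week]
pairs = next_week_pairs(features, outbreak_by_week)
for x, y in pairs:
    print(x, "->", y)
```

The last week contributes no training pair because its following week's label is not available.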
Dataset Files
File | Size |
---|---|
influenza_outbreak_dataset.mat | 4.6 MB |
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
influenza_outbreak_event_prediction_via_twitter = fetch_ucirepo(id=861)

# data (as pandas dataframes)
X = influenza_outbreak_event_prediction_via_twitter.data.features
y = influenza_outbreak_event_prediction_via_twitter.data.targets

# metadata
print(influenza_outbreak_event_prediction_via_twitter.metadata)

# variable information
print(influenza_outbreak_event_prediction_via_twitter.variables)
Zhao, L. (2015). Influenza Outbreak Event Prediction via Twitter [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5CP7V.
Creators
Liang Zhao
liang.zhao@emory.edu
Emory University
DOI
10.24432/C5CP7V
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.