Drug Reviews (Druglib.com)
Donated on 10/1/2018
The dataset provides patient reviews on specific drugs along with related conditions. Reviews and ratings are grouped into reports on three aspects: benefits, side effects, and overall comment.
Dataset Characteristics
Multivariate, Text
Subject Area
Health and Medicine
Associated Tasks
Classification, Regression, Clustering
Feature Type
Integer
# Instances
4143
# Features
8
Dataset Information
Additional Information
The dataset provides patient reviews on specific drugs along with related conditions. Reviews are grouped into reports on three aspects: benefits, side effects, and overall comment. Ratings are also available for overall satisfaction, together with a 5-step side effect rating and a 5-step effectiveness rating. The data was obtained by crawling online pharmaceutical review sites.

The intention was to study (1) sentiment analysis of drug experience over multiple facets, i.e. sentiments learned on specific aspects such as effectiveness and side effects, (2) the transferability of models among domains, i.e. conditions, and (3) the transferability of models among different data sources (see 'Drug Review Dataset (Drugs.com)').

The data is split into a train (75%) and a test (25%) partition (see publication), stored in two .tsv (tab-separated values) files.

Important notes: When using this dataset, you agree that you
1) only use the data for research purposes
2) don't use the data for any commercial purposes
3) don't distribute the data to anyone else
4) cite us
Has Missing Values?
No
Introductory Paper
By F. Gräßer, Surya Kallumadi, H. Malberg, S. Zaunseder. 2018
Published in Digital Humanities Conference
Variables Table
Variable Name | Role | Type | Description | Units | Missing Values |
---|---|---|---|---|---|
reviewID | ID | Integer | | | no |
urlDrugName | Feature | Categorical | | | no |
rating | Feature | Integer | | | no |
effectiveness | Feature | Categorical | | | no |
sideEffects | Feature | Categorical | | | no |
condition | Feature | Categorical | | | no |
benefitsReview | Feature | Categorical | | | no |
sideEffectsReview | Feature | Categorical | | | no |
commentsReview | Feature | Categorical | | | no |
Additional Variable Information
1. urlDrugName (categorical): name of drug
2. condition (categorical): name of condition
3. benefitsReview (text): patient on benefits
4. sideEffectsReview (text): patient on side effects
5. commentsReview (text): overall patient comment
6. rating (numerical): 10 star patient rating
7. sideEffects (categorical): 5 step side effect rating
8. effectiveness (categorical): 5 step effectiveness rating
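For the sentiment analysis use case, the variables listed here are often turned into a label and a single text field. The following is a minimal sketch of one way to do that, not the method from the introductory paper. The function names, the three class names, and the thresholds (`negative_max=4`, `positive_min=7`) are illustrative assumptions; each row is assumed to be a dict keyed by the variable names.

```python
# Hypothetical helpers; names and thresholds are assumptions for illustration.
TEXT_FIELDS = ("benefitsReview", "sideEffectsReview", "commentsReview")


def rating_to_sentiment(rating, negative_max=4, positive_min=7):
    """Bin the 10 star rating into a coarse three-class sentiment label."""
    rating = int(rating)  # values read from a .tsv file arrive as strings
    if rating <= negative_max:
        return "negative"
    if rating >= positive_min:
        return "positive"
    return "neutral"


def combine_review_text(row):
    """Join the three free-text review fields into one string, skipping empty ones."""
    parts = (row.get(name) or "" for name in TEXT_FIELDS)
    return " ".join(part.strip() for part in parts if part.strip())
```

Keeping the thresholds as parameters makes it easy to try a different binning, or to drop the neutral class, without touching the rest of a pipeline.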
Dataset Files
File | Size |
---|---|
drugLibTrain_raw.tsv | 2.2 MB |
drugLibTest_raw.tsv | 774.5 KB |
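The two files can be read with any tab-separated parser. As a minimal sketch using only the Python standard library, and assuming the files have been downloaded to the working directory, are UTF-8 encoded, and start with a header row, a hypothetical `load_reviews` helper could look like this:

```python
import csv


def load_reviews(path):
    """Read one .tsv partition into a list of dicts keyed by column name.

    Assumes a header row and UTF-8 encoding; all values are returned as strings.
    """
    # newline="" lets the csv module handle line breaks inside quoted review text
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return list(reader)
```

A call such as `load_reviews("drugLibTrain_raw.tsv")` would then return one dict per review. Numeric columns such as `rating` still need to be converted with `int` before use.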
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
drug_reviews_druglib_com = fetch_ucirepo(id=461)

# data (as pandas dataframes)
X = drug_reviews_druglib_com.data.features
y = drug_reviews_druglib_com.data.targets

# metadata
print(drug_reviews_druglib_com.metadata)

# variable information
print(drug_reviews_druglib_com.variables)
Kallumadi, S. & Gräßer, F. (2018). Drug Reviews (Druglib.com) [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C55G6J.
Creators
Surya Kallumadi
Felix Gräßer
DOI
10.24432/C55G6J
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.