
Twenty Newsgroups
Donated on 9/8/1999
This data set consists of 20000 messages taken from 20 newsgroups.
Dataset Characteristics: Text
Subject Area: Other
Associated Tasks: -
Feature Type: -
# Instances: 20000
# Features: -
Dataset Information
Has Missing Values? No
Citation
Mitchell, Tom. (1999). Twenty Newsgroups. UCI Machine Learning Repository. https://doi.org/10.24432/C5C323.
BibTeX
@misc{misc_twenty_newsgroups_113,
  author       = {Mitchell, Tom},
  title        = {{Twenty Newsgroups}},
  year         = {1999},
  howpublished = {UCI Machine Learning Repository},
  note         = {{DOI}: https://doi.org/10.24432/C5C323}
}
Install the ucimlrepo package
pip install ucimlrepo
Import the dataset into your code
from ucimlrepo import fetch_ucirepo

# fetch dataset
twenty_newsgroups = fetch_ucirepo(id=113)

# data (as pandas dataframes)
X = twenty_newsgroups.data.features
y = twenty_newsgroups.data.targets

# metadata
print(twenty_newsgroups.metadata)

# variable information
print(twenty_newsgroups.variables)
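The page lists "# Features" and "Feature Type" as "-", so the features or targets table may be empty or absent for this text dataset. The following is a minimal sketch, not part of the ucimlrepo package, of a check that can be run on the fetched object before using X and y. The helper name describe_fetch_result is hypothetical, and the sketch assumes that a missing table is returned as None and that a present table exposes a pandas-style .shape attribute.

```python
from types import SimpleNamespace


def describe_fetch_result(dataset):
    """Summarise the features/targets tables of a fetched dataset.

    Assumption: `dataset` has a `.data` attribute carrying `.features`
    and `.targets`, each either None or an object with a `.shape`
    attribute (as a pandas DataFrame has).
    """
    summary = {}
    for name in ("features", "targets"):
        table = getattr(dataset.data, name, None)
        # Record (rows, columns) when the table exists, otherwise None.
        summary[name] = None if table is None else tuple(table.shape)
    return summary


if __name__ == "__main__":
    # Stand-in object with placeholder shapes so the sketch runs offline;
    # these numbers are NOT the real dimensions of the dataset.
    placeholder = SimpleNamespace(
        data=SimpleNamespace(features=SimpleNamespace(shape=(3, 2)), targets=None)
    )
    print(describe_fetch_result(placeholder))
```

With the real package, the same helper could be called as describe_fetch_result(twenty_newsgroups) after the fetch above. A None entry indicates that the corresponding table is not available and that X or y should not be used directly.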
Creators
Tom Mitchell
DOI
10.24432/C5C323
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows the dataset to be shared and adapted for any purpose, provided that appropriate credit is given.