Sports articles for objectivity analysis
Donated on 4/8/2018
1000 sports articles were labeled using Amazon Mechanical Turk as objective or subjective. The raw texts, extracted features, and the URLs from which the articles were retrieved are provided.
Dataset Characteristics
Multivariate, Text
Subject Area
Social Science
Associated Tasks
Classification
Feature Type
Integer
# Instances
1000
# Features
-
Dataset Information
Additional Information
Some of the features were extracted using the Stanford POS tagger; the tags are as defined by the Penn Treebank Project: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
Has Missing Values?
No
Variables Table
59 variables listed; none has missing values.
Additional Variable Information
Variable Name | Description |
---|---|
TextID | text file name |
URL | link to article |
Label | objective vs. subjective |
totalWordsCount | total number of words in the article |
semanticobjscore | Frequency of words with an objective SENTIWORDNET score |
semanticsubjscore | Frequency of words with a subjective SENTIWORDNET score |
CC | Frequency of coordinating conjunctions |
CD | Frequency of numerals and cardinals |
DT | Frequency of determiners |
EX | Frequency of existential there |
FW | Frequency of foreign words |
INs | Frequency of subordinating preposition or conjunction |
JJ | Frequency of ordinal adjectives or numerals |
JJR | Frequency of comparative adjectives |
JJS | Frequency of superlative adjectives |
LS | Frequency of list item markers |
MD | Frequency of modal auxiliaries |
NN | Frequency of singular common nouns |
NNP | Frequency of singular proper nouns |
NNPS | Frequency of plural proper nouns |
NNS | Frequency of plural common nouns |
PDT | Frequency of pre-determiners |
POS | Frequency of genitive markers |
PRP | Frequency of personal pronouns |
PRP$ | Frequency of possessive pronouns |
RB | Frequency of adverbs |
RBR | Frequency of comparative adverbs |
RBS | Frequency of superlative adverbs |
RP | Frequency of particles |
SYM | Frequency of symbols |
TOs | Frequency of 'to' as preposition or infinitive marker |
UH | Frequency of interjections |
VB | Frequency of base form verbs |
VBD | Frequency of past tense verbs |
VBG | Frequency of present participle or gerund verbs |
VBN | Frequency of past participle verbs |
VBP | Frequency of present tense verbs with plural 3rd person subjects |
VBZ | Frequency of present tense verbs with singular 3rd person subjects |
WDT | Frequency of WH-determiners |
WP | Frequency of WH-pronouns |
WP$ | Frequency of possessive WH-pronouns |
WRB | Frequency of WH-adverbs |
baseform | Frequency of infinitive verbs (base form verbs preceded by "to") |
Quotes | Frequency of quotation pairs in the entire article |
questionmarks | Frequency of question marks in the entire article |
exclamationmarks | Frequency of exclamation marks in the entire article |
fullstops | Frequency of full stops |
commas | Frequency of commas |
semicolon | Frequency of semicolons |
colon | Frequency of colons |
ellipsis | Frequency of ellipsis |
pronouns1st | Frequency of first person pronouns (personal and possessive) |
pronouns2nd | Frequency of second person pronouns (personal and possessive) |
pronouns3rd | Frequency of third person pronouns (personal and possessive) |
compsupadjadv | Frequency of comparative and superlative adjectives and adverbs |
past | Frequency of past tense verbs with 1st and 2nd person pronouns |
imperative | Frequency of imperative verbs |
present3rd | Frequency of present tense verbs with 3rd person pronouns |
present1st2nd | Frequency of present tense verbs with 1st and 2nd person pronouns |
sentence1st | First sentence class |
sentencelast | Last sentence class |
txtcomplexity | Text complexity score |
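The punctuation features above (questionmarks, exclamationmarks, fullstops, commas, semicolon, colon, ellipsis) are simple counts over the article text. The sketch below shows one way such counts could be computed from a raw text file using only the Python standard library. It is an illustration, not the creators' extraction code: the function name `punctuation_counts` is hypothetical, and the rule that the dots of an ellipsis are not also counted as full stops is an assumption that may differ from how features.xls was built.

```python
import re

# Assumption: a run of three or more dots, or the single character U+2026,
# counts as one ellipsis. The dataset creators' exact rule is not documented here.
ELLIPSIS = re.compile(r"\.{3,}|\u2026")


def punctuation_counts(text):
    """Count punctuation marks in an article (hypothetical helper)."""
    ellipsis = len(ELLIPSIS.findall(text))
    # Remove ellipses first so their dots are not counted again as full stops.
    stripped = ELLIPSIS.sub(" ", text)
    return {
        "questionmarks": stripped.count("?"),
        "exclamationmarks": stripped.count("!"),
        "fullstops": stripped.count("."),
        "commas": stripped.count(","),
        "semicolon": stripped.count(";"),
        "colon": stripped.count(":"),
        "ellipsis": ellipsis,
    }


# Made-up sentence for illustration; not taken from the dataset.
sample = "He scored twice... Incredible! Was it luck? No: skill, speed; and nerve."
counts = punctuation_counts(sample)
print(counts)
```

Counts produced this way will not necessarily match the values in features.xls, so they are best compared against a few articles before being relied on.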
Dataset Files
File | Size |
---|---|
features.xls | 3.1 MB |
Raw data/Text0918.txt | 21.9 KB |
Raw data/Text0326.txt | 20.3 KB |
Raw data/Text0712.txt | 19.8 KB |
Raw data/Text0709.txt | 16 KB |
Showing 5 of 1001 files.
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    sports_articles_for_objectivity_analysis = fetch_ucirepo(id=450)

    # data (as pandas dataframes)
    X = sports_articles_for_objectivity_analysis.data.features
    y = sports_articles_for_objectivity_analysis.data.targets

    # metadata
    print(sports_articles_for_objectivity_analysis.metadata)

    # variable information
    print(sports_articles_for_objectivity_analysis.variables)
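Most of the features are raw frequencies, so longer articles tend to have larger values. One optional preprocessing step is to divide each count by totalWordsCount to get per-word rates. The sketch below works on a plain dictionary so that it runs without downloading the data. The helper name `to_rates` and the toy numbers are made up for illustration, and whether totalWordsCount appears among the fetched feature columns is an assumption to check against the variable information.

```python
def to_rates(row, total_key="totalWordsCount", skip=()):
    """Divide count features by the article's word count (hypothetical helper).

    Keys listed in `skip`, and the word count itself, are copied unchanged.
    """
    total = row[total_key]
    if total <= 0:
        raise ValueError("word count must be positive")
    rates = {}
    for name, value in row.items():
        if name == total_key or name in skip:
            rates[name] = value  # not a count, or the denominator itself
        else:
            rates[name] = value / total
    return rates


# Toy row with made-up numbers; not taken from the dataset.
toy_row = {"totalWordsCount": 200, "NN": 50, "JJ": 10, "commas": 8, "txtcomplexity": 17}
toy_rates = to_rates(toy_row, skip=("txtcomplexity",))
print(toy_rates)
```

Non-count variables such as sentence1st, sentencelast, and txtcomplexity would normally go in `skip`. With pandas, the same idea is a column-wise division of the count columns by the word-count column.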
Rizk, Y. & Awad, M. (2018). Sports articles for objectivity analysis [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5801R.
Creators
Yara Rizk
Mariette Awad
DOI
10.24432/C5801R
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.