Sports articles for objectivity analysis

Donated on 4/8/2018

1000 sports articles were labeled using Amazon Mechanical Turk as objective or subjective. The raw texts, extracted features, and the URLs from which the articles were retrieved are provided.

Dataset Characteristics

Multivariate, Text

Subject Area

Social Science

Associated Tasks

Classification

Feature Type

Integer

# Instances

1000

# Features

-

Dataset Information

Additional Information

Some of the features were extracted using the Stanford POS tagger; the tags are as defined by the Penn Treebank Project: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
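
As a rough illustration of how per-tag frequencies could be derived from tagger output, the sketch below counts Penn Treebank tags over a list of (word, tag) pairs. It does not call the Stanford POS tagger and is not the creators' extraction code; the helper name `pos_tag_frequencies`, the toy sentence, and the mapping of the tags IN and TO to the dataset's INs and TOs columns are assumptions made for this example.

```python
from collections import Counter

# Assumption: the dataset columns "INs" and "TOs" correspond to the
# Penn Treebank tags "IN" and "TO"; every other tag keeps its own name.
COLUMN_FOR_TAG = {"IN": "INs", "TO": "TOs"}


def pos_tag_frequencies(tagged_tokens):
    """Count how often each tag occurs in one article.

    tagged_tokens: iterable of (word, tag) pairs, as a POS tagger would emit.
    """
    counts = Counter()
    for _word, tag in tagged_tokens:
        counts[COLUMN_FOR_TAG.get(tag, tag)] += 1
    return counts


# Hand-tagged toy sentence (hypothetical, not taken from the dataset).
example = [
    ("The", "DT"), ("striker", "NN"), ("scored", "VBD"),
    ("two", "CD"), ("goals", "NNS"), ("in", "IN"),
    ("the", "DT"), ("final", "NN"), (".", "."),
]
freqs = pos_tag_frequencies(example)
print(freqs["DT"], freqs["NN"], freqs["INs"])
```

A `Counter` returns 0 for tags that never occur, which matches a frequency feature where absent tags simply have a count of zero.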

Has Missing Values?

No

Variables Table

Columns: Variable Name, Role, Type, Description, Units, Missing Values

59 variables in total; the 10 rows shown all report Missing Values = no.

Additional Variable Information

TextID: text file name
URL: link to article
Label: objective vs. subjective
totalWordsCount: total number of words in the article
semanticobjscore: Frequency of words with an objective SentiWordNet score
semanticsubjscore: Frequency of words with a subjective SentiWordNet score
CC: Frequency of coordinating conjunctions
CD: Frequency of numerals and cardinals
DT: Frequency of determiners
EX: Frequency of existential there
FW: Frequency of foreign words
INs: Frequency of prepositions or subordinating conjunctions
JJ: Frequency of adjectives or ordinal numerals
JJR: Frequency of comparative adjectives
JJS: Frequency of superlative adjectives
LS: Frequency of list item markers
MD: Frequency of modal auxiliaries
NN: Frequency of singular common nouns
NNP: Frequency of singular proper nouns
NNPS: Frequency of plural proper nouns
NNS: Frequency of plural common nouns
PDT: Frequency of pre-determiners
POS: Frequency of genitive markers
PRP: Frequency of personal pronouns
PRP$: Frequency of possessive pronouns
RB: Frequency of adverbs
RBR: Frequency of comparative adverbs
RBS: Frequency of superlative adverbs
RP: Frequency of particles
SYM: Frequency of symbols
TOs: Frequency of 'to' as preposition or infinitive marker
UH: Frequency of interjections
VB: Frequency of base form verbs
VBD: Frequency of past tense verbs
VBG: Frequency of present participle or gerund verbs
VBN: Frequency of past participle verbs
VBP: Frequency of present tense verbs with plural 3rd person subjects
VBZ: Frequency of present tense verbs with singular 3rd person subjects
WDT: Frequency of WH-determiners
WP: Frequency of WH-pronouns
WP$: Frequency of possessive WH-pronouns
WRB: Frequency of WH-adverbs
baseform: Frequency of infinitive verbs (base form verbs preceded by "to")
Quotes: Frequency of quotation pairs in the entire article
questionmarks: Frequency of question marks in the entire article
exclamationmarks: Frequency of exclamation marks in the entire article
fullstops: Frequency of full stops
commas: Frequency of commas
semicolon: Frequency of semicolons
colon: Frequency of colons
ellipsis: Frequency of ellipses
pronouns1st: Frequency of first person pronouns (personal and possessive)
pronouns2nd: Frequency of second person pronouns (personal and possessive)
pronouns3rd: Frequency of third person pronouns (personal and possessive)
compsupadjadv: Frequency of comparative and superlative adjectives and adverbs
past: Frequency of past tense verbs with 1st and 2nd person pronouns
imperative: Frequency of imperative verbs
present3rd: Frequency of present tense verbs with 3rd person pronouns
present1st2nd: Frequency of present tense verbs with 1st and 2nd person pronouns
sentence1st: First sentence class
sentencelast: Last sentence class
txtcomplexity: Text complexity score
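
To make the punctuation and pronoun features more concrete, here is a minimal sketch of how such counts could be computed from raw article text. It is an approximation written for illustration, not the procedure used to build features.xls: the function name `surface_counts`, the pronoun word lists, the tokenizing regular expression, and the choice to exclude ellipsis dots from the full stop count are all assumptions.

```python
import re

# Assumed word lists (personal and possessive forms only).
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
SECOND_PERSON = {"you", "your", "yours"}
THIRD_PERSON = {"he", "him", "his", "she", "her", "hers", "it", "its",
                "they", "them", "their", "theirs"}

# Three or more consecutive dots, or the single ellipsis character.
ELLIPSIS = re.compile(r"\.{3,}|…")


def surface_counts(text):
    """Return simple punctuation and pronoun counts for one article."""
    # Blank out ellipses so their dots are not counted as full stops.
    without_ellipsis = ELLIPSIS.sub(" ", text)
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "questionmarks": text.count("?"),
        "exclamationmarks": text.count("!"),
        "fullstops": without_ellipsis.count("."),
        "commas": text.count(","),
        "semicolon": text.count(";"),
        "colon": text.count(":"),
        "ellipsis": len(ELLIPSIS.findall(text)),
        "pronouns1st": sum(w in FIRST_PERSON for w in words),
        "pronouns2nd": sum(w in SECOND_PERSON for w in words),
        "pronouns3rd": sum(w in THIRD_PERSON for w in words),
    }


# Made-up sentence, not taken from the dataset.
sample = "We won, didn't we? Yes... you saw it; our best game!"
counts = surface_counts(sample)
print(counts)
```

Values computed this way may differ from those in features.xls, since the original tokenization and counting rules are not documented here.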

Dataset Files

File                      Size
features.xls              3.1 MB
Raw data/Text0918.txt     21.9 KB
Raw data/Text0326.txt     20.3 KB
Raw data/Text0712.txt     19.8 KB
Raw data/Text0709.txt     16 KB

Showing 5 of 1001 files
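
The listed file names suggest a simple way to locate an article's raw text. The sketch below builds a path from an article number and recovers the number from a file name. It assumes every raw text file follows the `Raw data/TextNNNN.txt` pattern with four-digit zero padding, which is inferred only from the five names shown; both helper names are hypothetical.

```python
import re
from pathlib import PurePosixPath


def raw_text_path(article_number, root="Raw data"):
    """Build the assumed path of an article's raw text file."""
    return PurePosixPath(root) / f"Text{article_number:04d}.txt"


def article_number_from_name(name):
    """Extract the numeric part of a name such as 'Text0918.txt'."""
    match = re.fullmatch(r"Text(\d+)(?:\.txt)?", name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return int(match.group(1))


path = raw_text_path(918)
number = article_number_from_name("Text0918.txt")
print(path, number)
```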

Creators

Yara Rizk

Mariette Awad
