Farm Ads

Donated on 10/17/2011

This data was collected from text ads found on twelve websites that deal with various farm-animal-related topics. The binary labels are based on whether or not the content owner approves of the ad.

Dataset Characteristics

Text

Subject Area

Business

Associated Tasks

Classification

# Instances

4143

# Features

54877

Dataset Information

Additional Information

This data was collected from text ads found on twelve websites that deal with various farm-animal-related topics. Information from both the ad creative and the ad landing page is included. The binary labels are based on whether or not the content owner approves of the ad.

For each ad, we include the words on the ad creative and the words from the landing page. Each word from the creative is given the prefix 'ad-'. Title and header HTML markup is noted in a similar way in the text of the landing page. Stemming and stop-word removal have already been performed. Each ad is on a single line. The first word on the line is the label of the instance: 1 for accepted ads and -1 for rejected ads.

We have also included a straightforward bag-of-words representation of the data in the SVMlight sparse vector format. The first value on each line is the label, followed by every nonzero attribute, each encoded as index:value. This is the representation used for the relevant paper cited below.
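The sparse vector layout described above (a label followed by index:value pairs) can be read with a few lines of Python. This is a minimal sketch, not the authors' tooling: the function name `parse_svmlight_line` and the sample line are invented for illustration, and the sketch assumes whitespace-separated tokens with integer indices.

```python
def parse_svmlight_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value})."""
    parts = line.split()
    label = int(parts[0])  # 1 for accepted ads, -1 for rejected ads
    features = {}
    for token in parts[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)  # only nonzero attributes are listed
    return label, features


# Hypothetical sample line, not taken from the dataset.
sample = "-1 3:1 17:1 42:1"
label, features = parse_svmlight_line(sample)
print(label, features)
```

Storing only the listed indices in a dictionary keeps memory use proportional to the number of nonzero attributes, which matters when the feature space has tens of thousands of dimensions.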

Variable Information

The text words are in the file farm-ads. The SVMlight-format sparse vectors are in the file farm-ads-vect.
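As a rough illustration of the text file's layout, the sketch below splits one line into its label, the ad-creative words (those carrying the 'ad-' prefix), and the remaining landing-page tokens. The function name `parse_text_line` and the sample line are hypothetical, and the sketch assumes tokens are separated by whitespace; it leaves any title or header markers on the landing-page tokens untouched, since their exact form is not specified here.

```python
AD_PREFIX = "ad-"


def parse_text_line(line):
    """Split one line of the text file into (label, ad_words, page_tokens)."""
    tokens = line.split()
    label = int(tokens[0])  # first word is the label: 1 accepted, -1 rejected
    ad_words = [t[len(AD_PREFIX):] for t in tokens[1:] if t.startswith(AD_PREFIX)]
    page_tokens = [t for t in tokens[1:] if not t.startswith(AD_PREFIX)]
    return label, ad_words, page_tokens


# Hypothetical sample line, not taken from the dataset.
sample_line = "-1 ad-hors ad-feed farm suppli"
text_label, ad_words, page_tokens = parse_text_line(sample_line)
print(text_label, ad_words, page_tokens)
```

Keeping the two word lists separate makes it easy to build features from the ad creative and the landing page independently.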

Dataset Files

File	Size
farm-ads	12.7 MB
farm-ads-vect	5.3 MB

Creators

Chris Mesterharm

Michael Pazzani

DOI

10.24432/C5ZC8D

License

This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

This allows for the sharing and adaptation of the dataset for any purpose, provided that appropriate credit is given.