Farm Ads

Donated on 10/17/2011

This data was collected from text ads found on twelve websites that deal with various farm-animal-related topics. The binary labels are based on whether or not the content owner approves of the ad.

Dataset Characteristics

Text

Subject Area

Business

Associated Tasks

Classification

Feature Type

-

# Instances

4143

# Features

54877

Dataset Information

Additional Information

This data was collected from text ads found on twelve websites that deal with various farm-animal-related topics. Information from both the ad creative and the ad landing page is included. The binary labels are based on whether or not the content owner approves of the ad.

For each ad, we include the words on the ad creative and the words from the landing page. Each word from the creative is given a prefix of 'ad-'. Title and header HTML markup is noted in a similar way in the text of the landing page. We have already performed stemming and stop-word removal. Each ad is on a single line. The first word in the line is the label of the instance: 1 for accepted ads and -1 for rejected ads.

We have also included a straightforward bag-of-words representation of our data, stored in the SVMlight sparse vector format. The first value on each line is the label, followed by every nonzero attribute, each encoded as index:value. This is the representation used for the relevant paper cited below.
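The text layout described above (label first, then stemmed words, with creative words carrying the 'ad-' prefix) can be read with a few lines of Python. This is a minimal sketch rather than an official loader: the function name `parse_text_ad` and the sample line are invented for illustration, and the sketch assumes tokens are separated by whitespace. Title and header markers are left untouched in the landing-page tokens because their exact spelling is not given here.

```python
def parse_text_ad(line):
    """Split one ad line into (label, creative_words, landing_page_words).

    Assumes whitespace-separated tokens with the label as the first token.
    """
    tokens = line.split()
    label = int(tokens[0])
    if label not in (1, -1):
        raise ValueError(f"unexpected label: {tokens[0]!r}")

    prefix = "ad-"
    # Words from the ad creative carry the 'ad-' prefix; strip it off.
    creative = [t[len(prefix):] for t in tokens[1:] if t.startswith(prefix)]
    # Everything else comes from the landing page (markers kept as-is).
    landing = [t for t in tokens[1:] if not t.startswith(prefix)]
    return label, creative, landing


# Hypothetical line, not taken from the dataset.
sample = "1 ad-hors ad-feed hors feed suppli"
label, creative, landing = parse_text_ad(sample)
```

Keeping the creative and landing-page words in separate lists makes it easy to build separate vocabularies for each source, or to merge them again if the distinction is not needed.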

Has Missing Values?

No

Variable Information

Text words in file farm-ads. SVMlight format sparse vectors in file farm-ads-vect.
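For the sparse vector file, each line holds a label followed by index:value pairs for the nonzero attributes. The sketch below parses one such line into a label and a dictionary. The function name `parse_svmlight_line` and the sample line are invented for illustration, and the sketch assumes integer labels and no trailing comments on the line.

```python
def parse_svmlight_line(line):
    """Parse 'label index:value index:value ...' into (label, {index: value})."""
    parts = line.split()
    label = int(parts[0])
    features = {}
    for item in parts[1:]:
        # Each nonzero attribute is encoded as index:value.
        index, value = item.split(":", 1)
        features[int(index)] = float(value)
    return label, features


# Hypothetical line, not taken from the dataset.
sample_vector = "-1 3:1 17:2"
vec_label, vec_features = parse_svmlight_line(sample_vector)
```

A dictionary keeps only the nonzero entries, which suits data with tens of thousands of features per instance. If scikit-learn is available, its `sklearn.datasets.load_svmlight_file` function reads this format directly into a SciPy sparse matrix and may be more convenient for the whole file.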

Dataset Files

File            Size
farm-ads        12.7 MB
farm-ads-vect   5.3 MB


Creators

Chris Mesterharm

Michael Pazzani
