Amazon Commerce Reviews

Donated on 6/10/2011

This dataset is used for authorship identification in online Writeprint, a new research field within pattern recognition.

Dataset Characteristics

Multivariate, Text, Domain-Theory

Subject Area

Other

Associated Tasks

Classification

Feature Type

Real

# Instances

1500

# Features

10000

Dataset Information

Additional Information

The dataset is derived from customer reviews on the Amazon Commerce website and is intended for authorship identification. Most previous studies ran identification experiments with two to ten authors. In the online context, however, reviews to be identified usually have many more potential authors, and classification algorithms are normally not adapted to a large number of target classes. To examine the robustness of classification algorithms, we identified 50 of the most active users (each represented by a unique ID and username) who frequently posted reviews in these newsgroups. We collected 30 reviews for each author, for 1500 reviews in total.
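The layout described above (50 authors, 30 reviews each) can be sanity-checked with a small sketch. The label vector below is a hypothetical placeholder rather than the real data, and `stratified_folds` is an illustrative helper, not part of any dataset tooling. It shows the instance count, the accuracy expected from uniform random guessing over 50 balanced classes, and one way to split the reviews into folds that keep every author equally represented.

```python
from collections import Counter

N_AUTHORS = 50            # number of authors stated in the description
REVIEWS_PER_AUTHOR = 30   # reviews collected per author

# Hypothetical label vector: one author index per review (placeholder, not real data).
labels = [a for a in range(N_AUTHORS) for _ in range(REVIEWS_PER_AUTHOR)]

n_instances = len(labels)

# With balanced classes, uniform random guessing is right 1 time in 50.
chance_accuracy = 1 / N_AUTHORS


def stratified_folds(labels, k):
    """Assign each sample a fold index so each fold gets an equal share per author."""
    seen = Counter()
    folds = []
    for y in labels:
        folds.append(seen[y] % k)  # cycle through folds within each author
        seen[y] += 1
    return folds


folds = stratified_folds(labels, k=10)
fold_sizes = Counter(folds)

print(n_instances, chance_accuracy, fold_sizes[0])
```

A 2% chance baseline is a useful reference point when judging any classifier on this many target classes.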

Has Missing Values?

No

Variable Information

Attributes capture each author's linguistic style, such as the usage of digits and punctuation, word and sentence length, the usage frequency of words, and so on.
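As a rough illustration of the kinds of attributes listed above, the sketch below computes a few style features from raw review text. It is an assumption about how such features could be derived, not the procedure used to build this dataset; the function name `style_features`, the feature names, and the tiny vocabulary are all hypothetical.

```python
import re
import string
from collections import Counter


def style_features(text, vocabulary):
    """Return a dict of simple style features for one review (illustrative only)."""
    n_chars = max(len(text), 1)
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    word_counts = Counter(words)

    feats = {
        "digit_ratio": sum(c.isdigit() for c in text) / n_chars,
        "punct_ratio": sum(c in string.punctuation for c in text) / n_chars,
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "avg_sentence_len": len(words) / max(len(sentences), 1),
    }
    # Relative usage frequency of each word in a fixed vocabulary.
    for w in vocabulary:
        feats[f"freq_{w}"] = word_counts[w] / n_words
    return feats


example = style_features("I bought 2 books. Great price!", vocabulary=["i", "great"])
print(example)
```

Applying a function like this to every review, with a vocabulary of several thousand words, would yield a wide real-valued feature matrix of the general shape described for this dataset.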

Dataset Files

File                                Size
Amazon_initial_50_30_10000.rar      2.1 MB

Creators

Zhi Liu
