microblogPCU

Donated on 3/16/2015

The MicroblogPCU data was crawled from the Sina Weibo microblog service [http://weibo.com/]. It can be used to study machine learning methods and for social network research.

Dataset Characteristics

Multivariate, Univariate, Sequential, Text

Subject Area

Computer Science

Associated Tasks

Classification, Causal-Discovery

Feature Type

Integer, Real

# Instances

221579

# Features

20

Dataset Information

Additional Information

We use this dataset to study spammers on the microblog platform. A demo system is available at http://sd.skyclass.net/Spammer/dia.jsp; add :8080 after the domain name as the port. The repository webpage fails to parse the link when the port is included in the source (under inspection).

Has Missing Values?

Yes

Variable Information

weibo_user.csv has the following attributes:
- user_id: account ID in Sina Weibo
- user_name: account nickname
- gender: gender given at account registration (male, female, or other)
- class: account level assigned by Sina Weibo
- message: account registration location or other personal information
- post_num: the number of posts by this account to date
- follower_num: the number of followers of this account
- followee_num: the number of followees of this account
- follow ratio: followee_num/follower_num
- is_spammer: manually annotated label; 1 means spammer and -1 means non-spammer

user_post.csv has the following attributes:
- post_id: post ID assigned by Sina Weibo
- post_time: the time at which the post was published
- poster_id: the user ID of the account that published the post
- repost_num: the number of times the post was reposted by others
- commnet_num: the number of comments by others

followe-followee.csv has the following attributes:
- follower: the nickname of the follower
- follower_id: the user ID of the follower
- followee: the nickname of the followee
- followee_id: the user ID of the followee

post.csv is almost the same as user_post.csv; its posts were retrieved by a keyword related to a topic. It also has:
- content: the post text (mostly in Chinese; adjust your Microsoft Office settings so that it displays correctly)
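The follow ratio and the spammer label described above can be recomputed from the raw counts. The following is a minimal Python sketch, not the creators' own code. It assumes the CSV has a header row using the attribute names listed above; the sample rows are made up for illustration, and the helper names (to_int, follow_ratio, load_users) are hypothetical. Because the dataset has missing values, the ratio is left undefined when a count is missing or when follower_num is zero.

```python
import csv
import io

# Hypothetical sample shaped like weibo_user.csv; the values are invented
# and the header names are assumed from the attribute list.
SAMPLE = """user_id,user_name,follower_num,followee_num,is_spammer
1001,alice,200,50,-1
1002,bob,0,300,1
1003,carol,,120,-1
"""


def to_int(value):
    """Return an int, or None when the field is empty or not numeric."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def follow_ratio(followee_num, follower_num):
    """Return followee_num / follower_num, or None when it is undefined."""
    if followee_num is None or follower_num is None or follower_num == 0:
        return None
    return followee_num / follower_num


def load_users(handle):
    """Read user rows and attach a recomputed ratio and a boolean label."""
    users = []
    for row in csv.DictReader(handle):
        followers = to_int(row.get("follower_num"))
        followees = to_int(row.get("followee_num"))
        users.append({
            "user_id": row["user_id"],
            "follow_ratio": follow_ratio(followees, followers),
            # The label is 1 for spammer and -1 for non-spammer.
            "is_spammer": to_int(row.get("is_spammer")) == 1,
        })
    return users


users = load_users(io.StringIO(SAMPLE))
for user in users:
    print(user)
```

To run this against the real file, replace io.StringIO(SAMPLE) with an open file handle; the text encoding of the CSV files is not stated here, so it may need to be set explicitly.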

Creators

Hao Chen

Mengting Zhan

Jianhong Mi

Yanzhang Lv

Jun Liu
