Center for Machine Learning and Intelligent Systems

Roman Urdu Sentiment Analysis Dataset (RUSAD) Data Set
Download: Data Folder, Data Set Description

Abstract: The dataset was gathered to carry out research on the task of sentiment analysis for Roman Urdu.

Data Set Characteristics: Text
Number of Instances: 11000
Area: Computer
Attribute Characteristics: N/A
Number of Attributes: 2
Date Donated: 2021-02-16
Associated Tasks: Classification
Missing Values? N/A
Number of Web Hits: 2477


Source:

Khawar Mehmood (k.mehmood '@' unsw.edu.au), Daryl Essam (d.essam '@' unsw.edu.au), Muhammad Kamran Malik (kamran.malik '@' pucit.edu.pk)


Data Set Information:

The dataset has two columns. The first column holds a binary sentiment label (positive or negative), and the second column holds the text of the review.


Attribute Information:

The dataset has two attributes. The first attribute is a binary categorical label (positive or negative); the second attribute is the review text itself.
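
As a rough illustration of the two-attribute layout described above, the following Python sketch reads (label, review) pairs from a two-column stream and counts the labels. The file format is an assumption: the page does not state the delimiter, the encoding, whether there is a header row, or how the labels are spelled, so the sketch assumes comma-separated text with no header and normalizes labels to lowercase. The function name `read_reviews` and the two sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def read_reviews(stream):
    """Yield (label, review) pairs from a two-column, comma-separated stream.

    Assumes column 1 is the sentiment label and column 2 is the review text,
    with no header row. Rows with fewer than two columns are skipped.
    """
    for row in csv.reader(stream):
        if len(row) < 2:
            continue
        label = row[0].strip().lower()  # normalize e.g. "Positive" -> "positive"
        review = row[1].strip()
        yield label, review


# Made-up rows for illustration only; these are NOT records from the dataset.
sample = io.StringIO(
    'positive,"bohat acha phone hai"\n'
    'negative,"battery bilkul bekar hai"\n'
)

pairs = list(read_reviews(sample))
label_counts = Counter(label for label, _ in pairs)
```

To use this with the downloaded file, the in-memory `sample` would be replaced by an opened file handle, with the delimiter and encoding adjusted once the real layout has been checked.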


Citation Request:

To view, download, and use this dataset, please cite the following papers (related to the dataset) in your research.



(1) Mehmood, Khawar, Daryl Essam, and Kamran Shafi. 'Sentiment analysis system for Roman Urdu.' In Science and Information Conference, pp. 29-42. Springer, Cham, 2018.
(2) Mehmood, Khawar, Daryl Essam, Kamran Shafi, and Muhammad Kamran Malik. 'Sentiment Analysis for a Resource Poor Language—Roman Urdu.' ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 19, no. 1 (2019): 10.
(3) Mehmood, Khawar, Daryl Essam, Kamran Shafi, and Muhammad Kamran Malik. 'Discriminative Feature Spamming Technique for Roman Urdu Sentiment Analysis.' IEEE Access 7 (2019): 47991-48002.
(4) Mehmood, Khawar, Daryl Essam, Kamran Shafi, and Muhammad Kamran Malik. 'An unsupervised lexical normalization for Roman Hindi and Urdu sentiment analysis.' Information Processing & Management (2020): 102368.

