Center for Machine Learning and Intelligent Systems

Opinion Corpus for Lebanese Arabic Reviews (OCLAR) Data Set

Abstract: The Opinion Corpus for Lebanese Arabic Reviews (OCLAR) can be used for Arabic sentiment classification of service reviews, including reviews of hotels, restaurants, shops, and others.

Data Set Characteristics: Text
Number of Instances: 3916
Area: Computer
Attribute Characteristics: Integer
Number of Attributes: 3916
Date Donated: 2019-06-17
Associated Tasks: Classification
Missing Values? N/A
Number of Web Hits: 9773


Source:

Marwan Al Omari, Centre for Language Sciences and Communication, Lebanese University, Beirut, Lebanon, marwanalomari '@' yahoo.com
Moustafa Al-Hajj, Centre for Language Sciences and Communication, Lebanese University, Beirut, Lebanon, moustafa.alhajj '@' ul.edu.lb
Nacereddine Hammami, College of Computer and Information Sciences, Jouf University, Aljouf, KSA, n.hammami '@' ju.edu.sa
Amani Sabra, Centre for Language Sciences and Communication, Lebanese University, Beirut, Lebanon, amani.sabra '@' ul.edu.lb


Data Set Information:

The OCLAR authors (Al Omari et al., 2019) gathered Arabic customer reviews from ([Web Link]) and the Zomato website ([Web Link]) across a wide range of domains, including restaurants, hotels, hospitals, local shops, etc. The final corpus contains 3916 reviews on a 5-point rating scale. For the purposes of this research, the positive class comprises ratings 3 to 5 (3465 reviews), and the negative class comprises ratings 1 and 2 (451 reviews).


Attribute Information:

1- 3916 text reviews
2- 5-point rating scale, with the number of reviews per rating:
   1: 303
   2: 148
   3: 418
   4: 734
   5: 2313
Positive class includes ratings 3 to 5 (3465 reviews in total).
Negative class includes ratings 1 and 2 (451 reviews in total).
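
The binarization described above can be illustrated with a minimal Python sketch. It is not part of the original dataset documentation: the names `RATING_COUNTS`, `rating_to_label`, and `class_totals` are hypothetical, and the per-rating counts are copied from the list above. It maps each rating to a class and checks that the class totals add up to the stated figures.

```python
from collections import Counter

# Per-rating review counts, copied from the attribute list above.
RATING_COUNTS = {1: 303, 2: 148, 3: 418, 4: 734, 5: 2313}


def rating_to_label(rating):
    """Map a 1-5 rating to the binary class used for OCLAR (3-5 positive, 1-2 negative)."""
    if rating not in RATING_COUNTS:
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    return "positive" if rating >= 3 else "negative"


def class_totals(rating_counts):
    """Sum the per-rating counts into per-class totals."""
    totals = Counter()
    for rating, count in rating_counts.items():
        totals[rating_to_label(rating)] += count
    return totals


totals = class_totals(RATING_COUNTS)
print(totals["positive"], totals["negative"], sum(totals.values()))
# prints: 3465 451 3916
```

The totals show that the negative class makes up roughly 11.5% of the corpus (451 of 3916), so the two classes are noticeably imbalanced.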


Relevant Papers:

Al Omari, M., Al-Hajj, M., Hammami, N., & Sabra, A. (2019). Sentiment Classifier: Logistic Regression for Arabic Services’ Reviews in Lebanon. 2019 International Conference on Computer and Information Sciences (ICCIS), Sakaka, Saudi Arabia, pp. 1-5. doi: 10.1109/ICCISci.2019.8716394



Citation Request:

Please cite the above paper if you make use of the OCLAR dataset.

