Center for Machine Learning and Intelligent Systems

Opinion Corpus for Lebanese Arabic Reviews (OCLAR) Data Set
Download: Data Folder, Data Set Description

Abstract: The Opinion Corpus for Lebanese Arabic Reviews (OCLAR) can be used for Arabic sentiment classification of service reviews, including reviews of hotels, restaurants, shops, and others.

Data Set Characteristics: Text
Number of Instances: 3916
Area: Computer
Attribute Characteristics: Integer
Number of Attributes: 3916
Date Donated: 2019-06-17
Associated Tasks: Classification
Missing Values? N/A
Number of Web Hits: 352


Source:

Marwan Al Omari, Centre for Language Sciences and Communication, Lebanese University, Beirut, Lebanon, marwanalomari '@' yahoo.com
Moustafa Al-Hajj, Centre for Language Sciences and Communication, Lebanese University, Beirut, Lebanon, moustafa.alhajj '@' ul.edu.lb
Nacereddine Hammami, College of Computer and Information Sciences, Jouf University, Aljouf, KSA, n.hammami '@' ju.edu.sa
Amani Sabra, Centre for Language Sciences and Communication, Lebanese University, Beirut, Lebanon, amani.sabra '@' ul.edu.lb


Data Set Information:

The OCLAR researchers (Al Omari et al., 2019) gathered Arabic customer reviews from ([Web Link]) and the Zomato website ([Web Link]) across a wide range of domains, including restaurants, hotels, hospitals, local shops, etc. The final corpus contains 3916 reviews on a 5-point rating scale. For this research, the positive class comprises the 3465 reviews rated 3 to 5 stars, and the negative class comprises the 451 reviews rated 1 or 2 stars.


Attribute Information:

1- 3916 text reviews
2- 5-rating scale:
   1: 303
   2: 148
   3: 418
   4: 734
   5: 2313
The positive class includes ratings of 3 to 5 stars (3465 reviews in total).
The negative class includes ratings of 1 and 2 stars (451 reviews in total).
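
The class totals follow directly from the per-rating counts. The Python sketch below is only an illustration of that arithmetic and of the binary labeling rule (3 to 5 stars positive, 1 and 2 stars negative). The names RATING_COUNTS, rating_to_label, and class_totals are hypothetical and are not part of the dataset or of the authors' code.

```python
from collections import Counter

# Per-rating review counts as listed in the attribute information.
RATING_COUNTS = {1: 303, 2: 148, 3: 418, 4: 734, 5: 2313}


def rating_to_label(rating):
    """Map a 1-5 star rating to the binary sentiment class (hypothetical helper)."""
    if rating not in RATING_COUNTS:
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    # Ratings 3, 4, and 5 are positive; ratings 1 and 2 are negative.
    return "positive" if rating >= 3 else "negative"


def class_totals(rating_counts):
    """Sum the per-rating counts into per-class totals."""
    totals = Counter()
    for rating, count in rating_counts.items():
        totals[rating_to_label(rating)] += count
    return totals


totals = class_totals(RATING_COUNTS)
total_reviews = sum(RATING_COUNTS.values())

print(totals["positive"], totals["negative"], total_reviews)  # 3465 451 3916
print(f"positive share: {totals['positive'] / total_reviews:.1%}")  # positive share: 88.5%
```

The positive class makes up roughly 88.5% of the corpus, so the two classes are strongly imbalanced.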


Relevant Papers:

Al Omari, M., Al-Hajj, M., Hammami, N., & Sabra, A. (2019). Sentiment Classifier: Logistic Regression for Arabic Services’ Reviews in Lebanon. 2019 International Conference on Computer and Information Sciences (ICCIS), Sakaka, Saudi Arabia, pp. 1-5. doi: 10.1109/ICCISci.2019.8716394



Citation Request:

Please cite the above paper if you make use of the OCLAR dataset.

