Opinion Corpus for Lebanese Arabic Reviews (OCLAR)
Donated on 6/16/2019
The Opinion Corpus for Lebanese Arabic Reviews (OCLAR) can be used for Arabic sentiment classification of service reviews, covering hotels, restaurants, shops, and other services.
Dataset Characteristics
Text
Subject Area
Computer Science
Associated Tasks
Classification
Feature Type
Integer
# Instances
3916
# Features
3916
Dataset Information
Additional Information
The OCLAR researchers (Omari et al., 2019) gathered Arabic customer reviews from Google Maps (https://maps.google.com) and the Zomato website (https://www.zomato.com/lebanon) across a wide range of domains, including restaurants, hotels, hospitals, and local shops. The final corpus contains 3916 reviews rated on a 5-point scale. For the purposes of this research, the positive class comprises the 3465 reviews rated 3 to 5 stars, and the negative class comprises the 451 reviews rated 1 or 2 stars.
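The labeling rule described above can be sketched as a small function. This is a minimal illustration, not the authors' code; the function name `rating_to_label` and the label strings `"positive"` and `"negative"` are assumptions made here.

```python
def rating_to_label(rating: int) -> str:
    """Map a 1-5 star rating to a binary sentiment label.

    Follows the rule stated in the text: 3, 4, and 5 stars are positive;
    1 and 2 stars are negative. The label strings are illustrative only.
    """
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    return "positive" if rating >= 3 else "negative"


if __name__ == "__main__":
    # Print the label assigned to each possible star rating.
    for stars in range(1, 6):
        print(stars, rating_to_label(stars))
```

Note that 3-star reviews fall in the positive class under this rule, so a different threshold would change the class balance.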
Has Missing Values?
No
Variable Information
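1- 3916 text reviews
2- 5-point rating scale, with the following number of reviews per rating:

Rating | Reviews |
---|---|
1 | 303 |
2 | 148 |
3 | 418 |
4 | 734 |
5 | 2313 |

The positive class includes ratings of 3 to 5 stars (3465 reviews in total). The negative class includes ratings of 1 and 2 stars (451 reviews in total).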
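As a quick consistency check, the per-rating counts listed above can be summed to confirm the class totals. The sketch below also derives inverse-frequency class weights, one common way to compensate for the strong skew toward the positive class. The weighting step is an assumption added here, not something the dataset description prescribes; the formula is the same heuristic scikit-learn uses for `class_weight="balanced"`.

```python
# Per-rating review counts as listed in the variable information.
RATING_COUNTS = {1: 303, 2: 148, 3: 418, 4: 734, 5: 2313}

# Ratings of 3 to 5 stars form the positive class; 1 and 2 the negative class.
positive = sum(n for stars, n in RATING_COUNTS.items() if stars >= 3)
negative = sum(n for stars, n in RATING_COUNTS.items() if stars <= 2)
total = positive + negative

# Inverse-frequency weights: total / (number_of_classes * class_count).
weights = {
    "positive": total / (2 * positive),
    "negative": total / (2 * negative),
}

print(positive, negative, total)  # 3465 451 3916
print(weights)
```

The minority (negative) class receives the larger weight, so misclassified negative reviews would count more heavily in a weighted loss.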
Dataset Files
File | Size |
---|---|
OCLAR - Opinion Corpus for Lebanese Arabic Reviews.csv | 374 KB |
    pip install ucimlrepo

    from ucimlrepo import fetch_ucirepo

    # fetch dataset
    opinion_corpus_for_lebanese_arabic_reviews_oclar = fetch_ucirepo(id=499)

    # data (as pandas dataframes)
    X = opinion_corpus_for_lebanese_arabic_reviews_oclar.data.features
    y = opinion_corpus_for_lebanese_arabic_reviews_oclar.data.targets

    # metadata
    print(opinion_corpus_for_lebanese_arabic_reviews_oclar.metadata)

    # variable information
    print(opinion_corpus_for_lebanese_arabic_reviews_oclar.variables)
Omari, M., Al-Hajj, M., Hammami, N., & Sabra, A. (2019). Opinion Corpus for Lebanese Arabic Reviews (OCLAR) [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5SP66.
Creators
Marwan Omari
Moustafa Al-Hajj
Nacereddine Hammami
Amani Sabra
DOI
10.24432/C5SP66
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the dataset for any purpose, provided that appropriate credit is given.