
COREVQA
Donated on 8/6/2025
Recently, many benchmarks and datasets have been developed to evaluate Vision-Language Models (VLMs) using visual question answering (VQA) pairs, and models have shown significant accuracy improvements. However, these benchmarks rarely test a model's ability to perform visual entailment, i.e., to accept or refute a hypothesis based on an image. To address this, we propose COREVQA (Crowd Observations and Reasoning Entailment), a benchmark of 5608 image and synthetically generated true/false statement pairs, with images derived from the CrowdHuman dataset, designed to provoke visual entailment reasoning on challenging crowded images. Our results show that even the top-performing VLMs achieve accuracy below 80%, while other models perform substantially worse (39.98%-69.95%). This significant performance gap reveals key limitations in VLMs’ ability to reason over certain types of image–question pairs in crowded scenes.
Dataset Characteristics
Tabular, Text, Image
Subject Area
Computer Science
Associated Tasks
Other
Feature Type
Real, Categorical
# Instances
5608
# Features
-
Dataset Information
Has Missing Values?
No
Introductory Paper
By Ishant Chintapatla, Kazuma Choji, Naaisha Agarwal, Andrew Lin, Hannah You, Charles Duong, Kevin Zhu, Sean O'Brien, Vasu Sharma. 2025
Published in ICML
Variable Information
Images are hosted in the Hugging Face repository: https://huggingface.co/datasets/COREVQA2025/COREVQA (a download sketch is given below).
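One way to pull the image files locally is with the `huggingface_hub` client. This is a minimal sketch, assuming the images are stored as plain files in the dataset repo linked above; the local directory name is a hypothetical choice, and the layout of the downloaded snapshot is not documented here.

```python
# Minimal sketch: download the COREVQA image files from the Hugging Face dataset repo.
# Assumption: images are stored as plain files in the repo; check the repo page for
# the actual directory layout before relying on specific paths.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="COREVQA2025/COREVQA",
    repo_type="dataset",
    local_dir="corevqa_images",  # hypothetical local target directory
)
print(f"Downloaded dataset snapshot to {local_path}")
```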
Dataset Files
| File | Size |
|---|---|
| COREVQA_data.csv | 1.1 MB |
Import in Python

```shell
pip install ucimlrepo
```

```python
from ucimlrepo import fetch_ucirepo

# fetch dataset
corevqa = fetch_ucirepo(id=1198)

# data (as pandas dataframes)
X = corevqa.data.features
y = corevqa.data.targets

# metadata
print(corevqa.metadata)

# variable information
print(corevqa.variables)
```
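The targets returned above hold the true/false label for each statement, and the description reports model performance as accuracy. The sketch below shows one way to score predictions against those labels; the target column position and the "true"/"false" string encoding are assumptions, so check `corevqa.variables` for the actual schema first.

```python
# Minimal sketch: score true/false predictions with simple accuracy.
# Assumptions: targets are strings such as "true"/"false"; the first target
# column is the label. Verify against corevqa.variables before use.
import pandas as pd

def accuracy(predictions: pd.Series, targets: pd.Series) -> float:
    """Fraction of statements whose predicted true/false label matches the target."""
    return float(
        (predictions.reset_index(drop=True) == targets.reset_index(drop=True)).mean()
    )

# Hypothetical usage:
# y_true = y.iloc[:, 0]
# y_pred = my_model_predictions  # pd.Series of "true"/"false", same length as y_true
# print(f"Accuracy: {accuracy(y_pred, y_true):.2%}")
```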
Chintapatla, I. (2025). COREVQA [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5XC83.
Creators
Ishant Chintapatla
ishantyunay@gmail.com
Westmont High School
DOI
10.24432/C5XC83
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the dataset for any purpose, provided that appropriate credit is given.