Nomao
Donated on 7/3/2012
Nomao collects data about places (name, phone, location, etc.) from many sources. Deduplication consists of detecting which data refer to the same place. Each instance in the dataset compares two spots.
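To make the idea of an instance that compares two spots more concrete, here is a minimal sketch of how a pair of place records could be turned into similarity features. It is an illustration only: the field names, the `compare_spots` helper, and the use of `difflib.SequenceMatcher` are assumptions, not the feature extraction Nomao actually used.

```python
from difflib import SequenceMatcher
from typing import Optional


def string_similarity(a: Optional[str], b: Optional[str]) -> Optional[float]:
    """Return a similarity ratio in [0, 1], or None when either value is missing."""
    if not a or not b:
        return None
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def compare_spots(spot_a: dict, spot_b: dict, fields=("name", "phone")) -> dict:
    """Build one comparison instance from two place records (hypothetical schema)."""
    return {
        f"{field}_sim": string_similarity(spot_a.get(field), spot_b.get(field))
        for field in fields
    }


# Two toy records that plausibly describe the same place.
spot_a = {"name": "Cafe de la Gare", "phone": "0102030405"}
spot_b = {"name": "Café de la Gare", "phone": None}

pair_features = compare_spots(spot_a, spot_b)
print(pair_features)
```

A field that is absent in either record yields `None` rather than a score, which is one plausible way a comparison dataset ends up with missing values.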
Dataset Characteristics
Univariate
Subject Area
Computer Science
Associated Tasks
Classification
Feature Type
Real
# Instances
34465
# Features
-
Dataset Information
Additional Information
The dataset was enriched during the Nomao Challenge (http://www.nomao.com/labs/challenge), organized alongside the ALRA workshop (Active Learning in Real-world Applications, http://www.nomao.com/labs/alra) held at the ECML-PKDD 2012 conference.
Has Missing Values?
Yes
Additional Variable Information
120 attributes: 89 continuous, 31 nominal (including the attributes 'label' and 'id').
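The split between continuous and nominal attributes can be checked roughly against the raw file. The sketch below is a heuristic under stated assumptions: it assumes comma-separated rows and `?` as the missing-value marker (both should be verified against `Nomao.names`), and the toy rows are invented for illustration rather than taken from the dataset.

```python
import csv
import io

MISSING = "?"  # assumed missing-value marker; verify against Nomao.names


def is_float(value: str) -> bool:
    """Return True when the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def split_column_types(rows):
    """Label each column index 'continuous' or 'nominal' from its non-missing values."""
    kinds = {}
    for col in range(len(rows[0])):
        observed = [row[col] for row in rows if row[col] != MISSING]
        all_numeric = bool(observed) and all(is_float(v) for v in observed)
        kinds[col] = "continuous" if all_numeric else "nominal"
    return kinds


# Invented toy rows: an id, a continuous score, a nominal code, a continuous score.
toy = io.StringIO("0#1,0.75,n,?\n2#3,?,s,0.10\n")
rows = list(csv.reader(toy))
kinds = split_column_types(rows)
print(kinds)
```

Because the heuristic treats any numerically coded column as continuous, a nominal attribute stored as numbers would be mislabeled; the attribute list in `Nomao.names` remains the authoritative source for the 89/31 split.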
Dataset Files
File | Size |
---|---|
Nomao/Nomao.data | 13.7 MB |
Nomao/Nomao.names | 8.1 KB |
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
nomao = fetch_ucirepo(id=227)

# data (as pandas dataframes)
X = nomao.data.features
y = nomao.data.targets

# metadata
print(nomao.metadata)

# variable information
print(nomao.variables)
Candillier, L. & Lemaire, V. (2012). Nomao [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C53G79.
Creators
Laurent Candillier
Vincent Lemaire
DOI
10.24432/C53G79
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.