SIFT10M

Donated on 2/22/2016

In SIFT10M, each data point is a SIFT feature extracted from the Caltech-256 image dataset with the open-source VLFeat library. The image patch corresponding to each SIFT feature is also provided.

Dataset Characteristics

Multivariate

Subject Area

Computer Science

Associated Tasks

Causal-Discovery

Feature Type

Integer

# Instances

11164866

# Features

Dataset Information

Additional Information

In SIFT10M, the name of each PNG file indicates the column position of the corresponding SIFT feature. The dataset has been used to evaluate approximate nearest neighbour search methods. The patches can be used for visualisation and help in analysing the performance of those methods.
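As an illustration of how such an evaluation is commonly set up, the sketch below computes exact nearest neighbours by exhaustive search and scores a candidate result list with recall@k. It is a minimal sketch, not the dataset authors' protocol: the function names, the random stand-in vectors, and the "search only half the database" baseline are all assumptions made for the example.

```python
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_knn(query, database, k):
    """Indices of the k database vectors closest to query (exhaustive scan)."""
    order = sorted(range(len(database)),
                   key=lambda i: squared_l2(query, database[i]))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true neighbours that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


random.seed(0)
DIM = 128  # dimensionality of a SIFT feature

# Random integer vectors standing in for real SIFT features (assumption).
database = [[random.randint(0, 255) for _ in range(DIM)] for _ in range(200)]
query = [random.randint(0, 255) for _ in range(DIM)]

exact = exact_knn(query, database, k=10)

# Hypothetical "approximate" method: search only the first half of the database.
approx = exact_knn(query, database[:100], k=10)

print(f"recall@10 = {recall_at_k(approx, exact):.2f}")
```

With real data, the exhaustive scan would be run once to produce ground truth, and each approximate method would then be scored against it.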

Has Missing Values?

Variables Table

Variable Name	Role	Type	Description	Units	Missing Values
(unnamed; first 10 of 128 variables)					no

Additional Variable Information

Each SIFT feature is a 128-dimensional column, and the corresponding patch is saved as a 41×41 PNG image. The PNG files are packed into 307 tar files for download.
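For a rough sense of scale, the figures on this page can be combined into a back-of-the-envelope estimate. This is a sketch under stated assumptions: the storage width per component (one byte, or four bytes if converted to 32-bit floats) is not given on this page, and the per-archive count assumes one patch per feature spread evenly over the tar files.

```python
NUM_FEATURES = 11_164_866  # "# Instances" on this page
DIM = 128                  # components per SIFT feature
NUM_TAR_FILES = 307        # archives holding the PNG patches


def raw_size_bytes(n, dim, bytes_per_component):
    """Size of an uncompressed dim x n feature matrix."""
    return n * dim * bytes_per_component


# Assumption: 1 byte per component (8-bit integers).
as_uint8 = raw_size_bytes(NUM_FEATURES, DIM, 1)
# Assumption: 4 bytes per component after conversion to 32-bit floats.
as_float32 = raw_size_bytes(NUM_FEATURES, DIM, 4)

print(f"8-bit integers: {as_uint8:,} bytes ({as_uint8 / 2**30:.2f} GiB)")
print(f"32-bit floats:  {as_float32:,} bytes ({as_float32 / 2**30:.2f} GiB)")

# Assumption: one patch per feature, evenly distributed over the archives.
print(f"about {NUM_FEATURES / NUM_TAR_FILES:,.0f} patches per tar file")
```

Under the one-byte assumption the full feature matrix fits in memory on most workstations, whereas a 32-bit float copy is about four times larger.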

Dataset Files

File	Size
SIFT10M.tar.gz	7.3 GB
README.txt	1.3 KB

Creators

Xiping Fu

Brendan McCane

Steven Mills

Michael Albert

Lech Szymanski

DOI

10.24432/C5S603

License

This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.