Center for Machine Learning and Intelligent Systems

SIFT10M Data Set

Abstract: In SIFT10M, each data point is a SIFT feature extracted from Caltech-256 images with the open-source VLFeat library. The image patch corresponding to each SIFT feature is also provided.


Xiping Fu, Brendan McCane, Steven Mills, Michael Albert and Lech Szymanski
Department of Computer Science, University of Otago, Dunedin, New Zealand
{xiping, mccane, steven, malbert, lechszym}

Data Set Information:

In SIFT10M, the file name of each PNG indicates the column position of the corresponding SIFT feature. This data set has been used to evaluate approximate nearest neighbour search methods. The patches can be used for visualisation and help in analysing the performance of those methods.
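One common way to evaluate an approximate nearest neighbour method is recall@k against brute-force ground truth. The sketch below illustrates that idea on toy 2-D vectors standing in for the 128-D SIFT features. The function names and the toy data are hypothetical, and this is not necessarily the protocol used in the papers listed under Relevant Papers.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_knn(query, database, k):
    """Indices of the k database vectors closest to the query (brute force)."""
    return heapq.nsmallest(
        k, range(len(database)), key=lambda i: squared_l2(query, database[i])
    )


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbours that the approximate result recovered."""
    exact = set(exact_ids)
    return len(exact.intersection(approx_ids)) / len(exact)


# Toy 2-D data standing in for 128-D SIFT features (illustrative only).
database = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
query = [0.1, 0.1]

truth = exact_knn(query, database, k=2)
approx = [0, 3]  # pretend output of some approximate search method
score = recall_at_k(approx, truth)
```

With real data, the brute-force step is the expensive part, so ground truth is usually computed once for a fixed query set and reused across methods.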

Attribute Information:

Each SIFT feature is a 128-dimensional column vector, and the corresponding patch is saved as a 41×41 pixel PNG image. The PNG files are packed into 307 tar archives for download.
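A minimal sketch of reading patches from one of the tar archives and recovering the column position from each file name. It assumes the file name stem is a plain integer (for example `7.png`), which the description above does not confirm. The function `iter_patches` and the in-memory demo archive are hypothetical and exist only to keep the example self-contained.

```python
import io
import os
import tarfile


def iter_patches(fileobj):
    """Yield (column_position, png_bytes) for every PNG member of a tar archive."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            if not member.isfile() or not member.name.lower().endswith(".png"):
                continue
            stem = os.path.splitext(os.path.basename(member.name))[0]
            # Assumption: the file name stem is the integer column position.
            column = int(stem)
            yield column, archive.extractfile(member).read()


def _demo_archive():
    """Build a tiny in-memory tar with placeholder payloads (not real PNGs)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as archive:
        for name, payload in [("patches/7.png", b"\x89PNG-a"),
                              ("patches/12.png", b"\x89PNG-b")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


patches = dict(iter_patches(_demo_archive()))
```

For a downloaded archive, pass `open(path, "rb")` instead of the demo buffer. Decoding the returned bytes into a 41×41 pixel array requires an image library, which is left out here.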

Relevant Papers:

Xiping Fu, Brendan McCane, Steven Mills, and Michael Albert, 'NOKMeans: Non-orthogonal K-means hashing', in Asian Conference on Computer Vision (ACCV14), pp. 162-177.
Xiping Fu, Brendan McCane, Steven Mills, Michael Albert, and Lech Szymanski, 'Auto-JacoBin: Auto-encoder Jacobian Binary Hashing', submitted to PAMI.

Citation Request:

Please refer to the Machine Learning Repository's citation policy.
