Center for Machine Learning and Intelligent Systems

YouTube Multiview Video Games Dataset Data Set
Download: Data Folder, Data Set Description

Abstract: This dataset contains about 120k instances, each described by 13 feature types, with class information. It is especially useful for exploring multiview topics (co-training, ensembles, clustering, etc.).

Data Set Characteristics: Multivariate, Text
Number of Instances: 120000
Area: Computer
Attribute Characteristics: Integer, Real
Number of Attributes: 1000000
Date Donated: 2013-10-16
Associated Tasks: Classification, Clustering
Missing Values? Yes
Number of Web Hits: 20879


Source:


Omid Madani, madani '@' google.com, Google Inc.


Data Set Information:

Please see the README for details on how the data is organized.


Attribute Information:

Please see the README.


Relevant Papers:


Our recent work used a close version of this dataset:

On Using Nearly-Independent Feature Families for High Precision and Confidence, Machine Learning journal, 2013 (please see the citation request). An earlier version appeared at the Asian Conference on Machine Learning (ACML 2012):

On Using Nearly-Independent Feature Families for High Precision and Confidence. O. Madani, M. Georg, and D. Ross. ACML 2012.



Citation Request:


Please cite the following (also specified in the README):

@article{madaniEtAl2013MLJ,
title = {On Using Nearly-Independent Feature Families for High Precision and Confidence},
author = {Omid Madani and Manfred Georg and David A. Ross},
journal = {Machine Learning},
year = {2013},
volume = {92},
pages = {457-477},
note = {published online 30 May 2013, [Web Link]},
}

