Center for Machine Learning and Intelligent Systems

Covertype Data Set

Below are papers that cite this data set, with context shown. Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.

Return to Covertype data set page.


Joao Gama and Ricardo Rocha and Pedro Medas. Accurate decision trees for mining high-speed data streams. KDD. 2003.

surprising. They indicate that sub-optimal decisions could contribute to a bias reduction. Other Results on Real data We have done some experiments on real data. We have used the Forest CoverType dataset from the UCI KDD archive. The goal is to predict the forest cover type from cartographic variables. The problem is defined by 54 variables of different types: continuous and categorical. The dataset


Nikunj C. Oza and Stuart J. Russell. Experimental comparisons of online and batch versions of bagging and boosting. KDD. 2001.

[2], two datasets (Census Income and Forest Covertype) from the UCI KDD archive [1], and three synthetic datasets. We give their sizes and numbers of attributes and classes in Table 1. All three of our synthetic


Chris Giannella and Bassem Sayrafi. An Information Theoretic Histogram for Single Dimensional Selectivity Estimation. Department of Computer Science, Indiana University Bloomington.

were obtained from the UCI KDD archive [7]. The forestcov4 and forestcov9 datasets were found under the "Forest CoverType" heading, covtype.data file -- attributes four and nine, respectively. The attributes represent various geographic measurements. The cup199 and cup472


Johannes Fürnkranz. Round Robin Rule Learning. Austrian Research Institute for Artificial Intelligence.

6) Ripper. The first five lines are total run-times, i.e., training and test time, while the cross-validated results report training time only. We failed to measure the run-times for the covertype data set, where the situation was complicated because of the large test set, which had to be split into several pieces for the Ripper-based algorithms. The last line shows the average of the 17


Zoran Obradovic and Slobodan Vucetic. Challenges in Scientific Data Mining: Heterogeneous, Biased, and Large Samples. Center for Information Science and Technology Temple University.

levels of all attributes in an efficient manner. 1.4.2 Application 6: Reduction of spatially correlated data We performed a number of down-sampling and quantization experiments [76] on several large data sets including Covertype Data Set. This set is currently one of the largest databases in the UCI Database Repository [50] containing 581,012 examples with 54 attributes and 7 target classes and


Arto Klami and Samuel Kaski and Janne Sinkkonen (thesis instructor). Regularized Discriminative Clustering. Department of Engineering Physics and Mathematics, Helsinki University of Technology.

Data set                   Dimensions   Classes   Samples
TIMIT phoneme data         12           41        99983
Letter Recognition Data    16           26        20000
Forest CoverType data      10           7         100000

6.2.1 Data sets

The data sets used for testing were chosen to be suitable for the task of discriminative clustering. The feature vectors are continuous and real-valued, and a categorical variable to be used as

