Covertype Data Set
Below are papers that cite this data set, with context shown.
Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.
Joao Gama and Ricardo Rocha and Pedro Medas. Accurate decision trees for mining high-speed data streams. KDD. 2003.
surprising. They indicate that sub-optimal decisions could contribute to a bias reduction. Other Results on Real data We have done some experiments on real data. We have used the Forest CoverType dataset from the UCI KDD archive. The goal is to predict the forest cover type from cartographic variables. The problem is defined by 54 variables of different types: continuous and categorical. The dataset
Nikunj C. Oza and Stuart J. Russell. Experimental comparisons of online and batch versions of bagging and boosting. KDD. 2001.
, two datasets (Census Income and Forest Covertype) from the UCI KDD archive, and three synthetic datasets. We give their sizes and numbers of attributes and classes in Table 1. All three of our synthetic
Arto Klami and Samuel Kaski and Janne Sinkkonen. Regularized Discriminative Clustering. Helsinki University of Technology, Department of Engineering Physics and Mathematics.
Data set                  Dimensions  Classes  Samples
TIMIT phoneme data        12          41       99983
Letter Recognition Data   16          26       20000
Forest CoverType data     10          7        100000
6.2.1 Data sets
The data sets used for testing were chosen to be suitable for the task of discriminative clustering. The feature vectors are continuous and real-valued, and a categorical variable to be used as
Chris Giannella and Bassem Sayrafi. An Information Theoretic Histogram for Single Dimensional Selectivity Estimation. Department of Computer Science, Indiana University Bloomington.
were obtained from the UCI KDD archive. The forestcov4 and forestcov9 datasets were found under the "Forest CoverType" heading, covtype.data file -- attributes four and nine, respectively. The attributes represent various geographic measurements. The cup199 and cup472
Johannes Furnkranz. Round Robin Rule Learning. Austrian Research Institute for Artificial Intelligence.
6) Ripper. The first five lines are total run-times, i.e., training and test time, while the cross-validated results report training time only. We failed to measure the run-times for the covertype data set, where the situation was complicated because of the large test set, which had to be split into several pieces for the Ripper-based algorithms. The last line shows the average of the 17
Zoran Obradovic and Slobodan Vucetic. Challenges in Scientific Data Mining: Heterogeneous, Biased, and Large Samples. Center for Information Science and Technology Temple University.
levels of all attributes in an efficient manner. 1.4.2 Application 6: Reduction of spatially correlated data We performed a number of down-sampling and quantization experiments on several large data sets including Covertype Data Set. This set is currently one of the largest databases in the UCI Database Repository containing 581,012 examples with 54 attributes and 7 target classes and
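The dimensions quoted in the excerpt above (54 attributes, 7 target classes) suggest a simple sanity check when reading records. The following is a minimal sketch, not the method of any cited paper. It assumes each record is one comma-separated line of integers with the class label in the last field and labels numbered 1 to 7. The function name parse_record and the example record are hypothetical and made up for illustration.

```python
# Sketch only: split one comma-separated Covertype-style record into
# attributes and a class label, checking the counts quoted above.
# Assumptions (not stated in the excerpts): integer fields, label last,
# labels numbered 1..7.

N_ATTRIBUTES = 54  # attribute count quoted in the excerpt
N_CLASSES = 7      # target class count quoted in the excerpt


def parse_record(line):
    """Return (attributes, label) for one comma-separated record."""
    fields = [int(tok) for tok in line.strip().split(",")]
    if len(fields) != N_ATTRIBUTES + 1:
        raise ValueError(
            f"expected {N_ATTRIBUTES + 1} fields, got {len(fields)}"
        )
    attributes, label = fields[:-1], fields[-1]
    if not 1 <= label <= N_CLASSES:
        raise ValueError(f"label {label} outside 1..{N_CLASSES}")
    return attributes, label


# Made-up record for illustration, not a row from the real file.
example = ",".join(["0"] * N_ATTRIBUTES + ["3"])
attrs, label = parse_record(example)
print(len(attrs), label)
```

Rejecting malformed records early keeps a later down-sampling or streaming step from silently mixing the label into the feature vector.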