Letter Recognition Data Set
Below are papers that cite this data set, with context shown.
Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.
Jaakko Peltonen and Arto Klami and Samuel Kaski. Improved Learning of Riemannian Metrics for Exploratory Analysis. Neural Networks. 2004.
(Table 5). The graph approximation further improved the results on Landsat and Letter Recognition data sets; on the other two sets the difference between the Sammon-L variants was insignificant. Table 5 Indirect measure of the goodness for the Sammon's mappings. Average percentage of correct
Xiaoli Z. Fern and Carla Brodley. Cluster Ensembles for High Dimensional Clustering: An Empirical Study. Journal of Machine Learning Research. 2004.
High resolution computed tomography lung image data; Dy et al. (1999)
chart: Synthetically generated control chart time series; UCI KDD archive (Hettich and Bay, 1999)
isolet6: Spoken letter recognition data set (6 letters only); UCI ML archive
mfeat: Handwritten digits represented by Fourier coefficients; (Blake and Merz, 1998)
satimage: StatLog Satellite image data set (training set)
segmentation: Image
Giorgio Valentini. Ensemble methods based on bias-variance analysis. Theses Series DISI-TH-2003. Dipartimento di Informatica e Scienze dell'Informazione. 2003.
from UCI: we consider here only letter B versus letter R, taken from the letter recognition data set. The 16 attributes are integer values that refer to different features of the letters. We also used a version of Letter-Two with 20% added classification noise (Letter-Two with added noise data
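The "added classification noise" mentioned in the excerpt above can be illustrated with a minimal sketch. The excerpt does not say how the noise was generated, so this assumes the simplest reading: exactly 20% of the B/R labels are flipped to the other class. The function name `add_label_noise`, the seed, and the toy label list are hypothetical and not taken from the cited thesis.

```python
import random


def add_label_noise(labels, noise_rate=0.20, classes=("B", "R"), seed=0):
    """Return a copy of `labels` with a fixed fraction flipped to the other class.

    Assumption: noise means flipping exactly round(noise_rate * n) labels,
    chosen uniformly at random without replacement.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_rate * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        # Swap the label for the other class of the two-class problem.
        noisy[i] = classes[1] if noisy[i] == classes[0] else classes[0]
    return noisy


# Toy stand-in for the B-versus-R subset (labels only, no attributes).
labels = ["B"] * 50 + ["R"] * 50
noisy = add_label_noise(labels)
n_changed = sum(a != b for a, b in zip(labels, noisy))
```

Here `n_changed` counts how many labels differ from the clean list. An alternative reading, flipping each label independently with probability 0.2, would give a noise level that only matches 20% on average.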
Dmitry Pavlov and Alexandrin Popescul and David M. Pennock and Lyle H. Ungar. Mixtures of Conditional Maximum Entropy Models. ICML. 2003.
actual time complexity strongly depends on the sparsity of the data. By looking only at the complexity terms of Table 1, one could expect that time performance on the Letter Recognition and Cover data sets would be roughly the same. However, the Cover data set is substantially more sparse and this results in an order of magnitude decrease in actual training time difference. Overall, we conclude that
Kristin P. Bennett and Ayhan Demiriz and Richard Maclin. Exploiting unlabeled data in ensemble methods. KDD. 2002.
[Plot legend: superv and semi_sup series for ufraction 0.10, 0.25 and 0.5.] Figure 1: Neural network results for the Letter Recognition dataset using networks with 5, 10 and 20 hidden units. Results shown are for 10, 25, and 50 percent of the data marked as unlabeled (ufractions of 0.10, 0.25, and 0.5) for AdaBoost (superv) and ASSEMBLE
Stephen D. Bay. Nearest neighbor classification from multiple feature subsets. Intell. Data Anal., 3. 1999.
with n models will usually require n times the memory of a single classifier. For many problems this amount of memory may not be significant, but Dietterich notes that on the Letter Recognition dataset (available from the UCI repository) an ensemble of 200 decision trees obtained 100% accuracy but required 59 megabytes of storage! The entire dataset was only 712 kilobytes. 4 Experiments
Thomas G. Dietterich. Approximate Statistical Test For Comparing Supervised Classification Learning Algorithms. Neural Computation, 10. 1998.
1 (Quinlan, 1993) and the first nearest-neighbor (NN) algorithm (Dasarathy, 1991). We then selected three difficult problems: the EXP6 problem developed by Kong (1995), the Letter Recognition data set (Frey & Slate, 1991), and the Pima Indians Diabetes Task (Merz & Murphy, 1996). Of course, C4.5 and NN do not have the same performance on these data sets. In EXP6 and Letter Recognition, NN
Georgios Paliouras and David S. Brée. The Effect of Numeric Features on the Scalability of Inductive Learning Programs. ECML. 1995.
[Plot: Computational Performance, Letter Recognition Set; CPU time (sec.) against set size (instances) for C4.5, PLS1, CN2 and AQ15.] Figure 5: Scalability Results, using the Letter Recognition data set. [Plot: Rate of increase of CPU-time consumption, Letter Recognition Set; rate of increase against set size (instances) for C4.5, PLS1, CN2 and AQ15, with (n^2) and (n) marked.] Figure 6: Letter Recognition Set: The
Thomas G. Dietterich and Ghulum Bakiri. Solving Multiclass Learning Problems via Error-Correcting Output Codes. CoRR, csAI/9501101. 1995.
round, but instead performs jamming (i.e., forcing the lowest order bit to 1 when low order bits are lost due to shifting or multiplication). On the speech recognition, letter recognition and vowel data sets, we employed the opt system distributed by Oregon Graduate Institute (Barnard & Cole, 1989). This implements the conjugate gradient algorithm and updates the gradient after each complete pass
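The "jamming" behaviour described in the excerpt above can be sketched for the shifting case. This is an illustration under stated assumptions, not the arithmetic of the system used in the paper: it handles non-negative integers only, and it treats "bits are lost" as meaning that at least one nonzero bit was shifted out. The function name `shift_right_jam` is hypothetical.

```python
def shift_right_jam(value: int, shift: int) -> int:
    """Right-shift a non-negative integer, jamming instead of rounding.

    Assumption: if any nonzero low-order bit is discarded by the shift,
    the lowest-order bit of the result is forced to 1.
    """
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if shift <= 0:
        return value
    lost = value & ((1 << shift) - 1)  # the bits that the shift discards
    result = value >> shift
    if lost:
        result |= 1  # jam: force the lowest-order bit to 1
    return result


jammed = shift_right_jam(0b1011, 2)  # discards 0b11, so the low bit is jammed
exact = shift_right_jam(0b1000, 2)   # discards only zeros, so nothing changes
```

With these inputs `jammed` is `0b11` and `exact` is `0b10`. Plain truncation would give `0b10` in both cases.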
Shailesh Kumar and Melba Crawford and Joydeep Ghosh. A versatile framework for labelling imagery with a large number of classes. Department of Electrical and Computer Engineering.
for distinguishing some other pair of classes. Section 2 describes the Bayesian pairwise classifier (BPC) architecture with feature selection. Experimental results on the 26 class letter recognition dataset and, more importantly, the 11 class remote sensing dataset are presented in section 3. 2 Pairwise Classifier Architecture A C class problem is first decomposed into a set of (C choose 2) two class problems
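The pairwise decomposition mentioned in the excerpt above can be sketched as an enumeration of all unordered class pairs. It covers only the decomposition step, not the Bayesian pairwise classifier or its feature selection, and the helper name `pairwise_problems` is hypothetical.

```python
from itertools import combinations
from math import comb


def pairwise_problems(classes):
    """List every unordered pair of classes, one two-class problem per pair."""
    return list(combinations(classes, 2))


# The 26 letter classes A..Z of the letter recognition dataset.
letters = [chr(ord("A") + i) for i in range(26)]
pairs = pairwise_problems(letters)

n_letter_problems = len(pairs)       # number of pairs for 26 classes
n_remote_problems = comb(11, 2)      # number of pairs for 11 classes
```

For the 26 class problem this gives 325 two-class problems, and for the 11 class problem it gives 55.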
Amund Tveit. Empirical Comparison of Accuracy and Performance for the MIPSVM classifier with Existing Classifiers. Division of Intelligent Systems Department of Computer and Information Science, Norwegian University of Science and Technology.
As we can see from the results in figure 1, MIPSVM performs comparably well when it comes to classification accuracy for the Waveform and Image Segment datasets. For the Letter Recognition dataset it performs considerably worse than the other classifiers. This is likely to be caused by that MIPSVM doesn't have any balancing mechanisms one-against-the-rest
Hirotaka Inoue and Hiroyuki Narihisa. Incremental Learning with Self-Organizing Neural Grove. Department of Electrical Engineering and Information Science, Kure National College of Technology.
Results We investigate the relation between the number of training data and the classification accuracy, the number of nodes, and the computation time of SONG with bagging for letter recognition dataset in the UCI repository. [Plot: classification accuracy (%) against # of N, for K=1, K=3 and K=5.]
Jaakko Peltonen and Arto Klami and Samuel Kaski. Learning Metrics for Information Visualization. Neural Networks Research Centre Helsinki University of Technology.
(Table 3). Computing accurate global distances with the graph search (column 'Graph' in Table 3) further improved performance significantly on the Landsat and Letter recognition data sets. On the two other sets the difference between the Sammon-L variants was not significant. The difference between Sammon-L with the graph approximation and Sammon-E is illustrated in Figure 2 on
Adil M. Bagirov and Julien Ugon. An algorithm for computation of piecewise linear function separating two sets. CIAO, School of Information Technology and Mathematical Sciences, The University of Ballarat.
to be known. In further research some methods to find automatically this number will

Table 2: Results of numerical experiments with Shuttle control, Letter recognition and Landsat satellite image datasets

               Training         Test
|I|   |J_i|    a_2c    a_mc     a_2c    a_mc     fct eval   DG eval
Shuttle control dataset
1     1        97.61   97.22    97.53   97.00    925        615
2     1        99.44   97.56    99.41   97.42    2148       1676
3     1        99.61   97.57    99.59   97.50    1474       968
Miguel Moreira and Alain Hertz and Eddy Mayoraz. Data binarization by discriminant elimination. Proceedings of the ICML-99 Workshop: From Machine Learning to.
the end, that is, to test the redundancy of all the discriminants in the initial set. Figure 3 shows the evolution of the elimination process with the conflict-based weighting method for four of the data sets. In the case of Letter Recognition for example, the algorithm starts with 234 discriminants and takes almost 3500 seconds to find a final set with 59. However, after 1000 sec. only 69
Arto Klami and Samuel Kaski and Janne Sinkkonen (thesis instructor). Regularized Discriminative Clustering. Department of Engineering Physics and Mathematics, Helsinki University of Technology.
runs were selected. The model was then re-trained with the whole training data to produce the final clustering. The value of M was five in Letter recognition and three for the other two data sets. I had to keep the number of candidate values, from which the best value was selected, quite small because of computational reasons. Preliminary runs were used to select a set of reasonable values