Center for Machine Learning and Intelligent Systems

LED Display Domain Data Set

Below are papers that cite this data set, with context shown. Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.

Return to LED Display Domain data set page.


Kelvin T. Leung and D. Stott Parker. Empirical comparisons of various voting methods in bagging. KDD. 2003.

that involve more than two classes. Voting methods that consider preference among classes can therefore gain an advantage over methods that do not, such as plurality. 3.4 Noise matters The LED datasets in Table 3, for example, show great sensitivity of the voting methods to noise. The amount of noise in the LED datasets increases from 0% to 50%. One somewhat surprising outcome of the experiments,


Joao Gama and Ricardo Rocha and Pedro Medas. Accurate decision trees for mining high-speed data streams. KDD. 2003.

when classifying test examples: classifying using the majority class (VFDTcMC) and classifying using naive Bayes (VFDTcNB) at leaves. The experimental work has been done using the Waveform and LED datasets. These are well-known artificial datasets. We have used the two versions of the Waveform dataset available at the UCI repository [1]. Both versions are problems with three classes. The first


Xavier Llorà and David E. Goldberg. Minimal Achievable Error in the LED. Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign. 2002.

S. Mathews Ave, Urbana, IL 61801 {llora@illigal.ge.uiuc.edu, deg@uiuc.edu} Abstract This paper presents a theoretical model to predict the minimal achievable error, given a noise ratio #, in the LED data set problem. The motivation for developing this theoretical model is to understand and explain some of the results that different systems achieve when they solve the LED problem. Moreover, given a new
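The minimal achievable (Bayes) error this excerpt refers to can be computed exactly for the 7-segment LED problem by enumerating all 2^7 observable patterns. The sketch below is not the paper's theoretical model but a brute-force computation under the standard assumptions: uniform prior over digits, each segment flipped independently with probability p, and the usual 7-segment encodings (the segment ordering here is an arbitrary convention; only pairwise Hamming distances matter).

```python
from itertools import product

# Standard 7-segment encodings for digits 0-9 (segment order: top,
# upper-left, upper-right, middle, lower-left, lower-right, bottom).
DIGITS = [
    (1, 1, 1, 0, 1, 1, 1),  # 0
    (0, 0, 1, 0, 0, 1, 0),  # 1
    (1, 0, 1, 1, 1, 0, 1),  # 2
    (1, 0, 1, 1, 0, 1, 1),  # 3
    (0, 1, 1, 1, 0, 1, 0),  # 4
    (1, 1, 0, 1, 0, 1, 1),  # 5
    (1, 1, 0, 1, 1, 1, 1),  # 6
    (1, 0, 1, 0, 0, 1, 0),  # 7
    (1, 1, 1, 1, 1, 1, 1),  # 8
    (1, 1, 1, 1, 0, 1, 1),  # 9
]

def bayes_error(p):
    """Minimal achievable error when each segment flips with probability p."""
    total_correct = 0.0
    for obs in product((0, 1), repeat=7):
        # Likelihood of this observation under each digit.
        likes = []
        for pattern in DIGITS:
            d = sum(a != b for a, b in zip(obs, pattern))
            likes.append((p ** d) * ((1 - p) ** (7 - d)))
        # The Bayes rule predicts the most likely digit; summing
        # max-likelihood mass over all observations (uniform 1/10 prior)
        # gives the optimal accuracy.
        total_correct += max(likes) / 10.0
    return 1.0 - total_correct
```

At the repository's standard 10% segment noise this yields an error of roughly 26%, consistent with the optimal Bayes rate quoted in the data set's UCI documentation; at 50% noise it degenerates to 90%, i.e. random guessing among ten classes.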


Xavier Llorà and David E. Goldberg and Ivan Traus and Ester Bernadó i Mansilla. Accuracy, Parsimony, and Generality in Evolutionary Learning Systems via Multiobjective Selection. IWLCS. 2002.

(mux and led). Initially, we used a version of the led data set free of noise, leaving noise considerations for the next subsection. In order to identify the optimal Pareto front, we analyze first the optimal solutions that should be obtained in each problem.


Huan Liu and Rudy Setiono. Incremental Feature Selection. Appl. Intell, 9. 1998.

[Figure residue.] Figure 1: Average CPU time (seconds) versus percent of training samples, LED dataset. The result for 100% data is used as the


Kamal Ali and Michael J. Pazzani. Error Reduction through Learning Multiple Descriptions. Machine Learning, 24. 1996.

also experienced significant reduction with the error being halved (for DNA this represented an increase in accuracy from 67.9% to 86.8%!). The error reduction is least for the noisy KRK and LED data sets and for the presumably noisy medical diagnosis data sets. Eighty percent of the data sets which scored unimpressive error ratios (above 0.8) were noisy data sets. This finding is further explored


Vikas Sindhwani and P. Bhattacharya and Subrata Rakshit. Information Theoretic Feature Crediting in Multiclass Support Vector Machines.

1. The relevant features are very sharply identified. Feature 1 gets maximum credit on account of being relevant for the most informative SVM, in this case decided purely by the input bias. LED Dataset This 10-class dataset, drawn from the UCI repository, consists of 200 training and 500 test examples of 24 binary-valued features each. The first 7 features are relevant, and correspond to LEDs on a
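The dataset layout this excerpt describes (24 binary features, only the first 7 relevant) matches the LED-24 variant in the UCI repository. The sketch below shows how examples of that shape can be generated under the standard construction — 7 segment attributes each flipped with 10% probability plus 17 irrelevant random bits; the function name and the `noise` default are assumptions, not code from the paper.

```python
import random

# Assumed standard 7-segment encodings for digits 0-9 (segment order:
# top, upper-left, upper-right, middle, lower-left, lower-right, bottom).
SEGMENTS = [
    (1, 1, 1, 0, 1, 1, 1), (0, 0, 1, 0, 0, 1, 0), (1, 0, 1, 1, 1, 0, 1),
    (1, 0, 1, 1, 0, 1, 1), (0, 1, 1, 1, 0, 1, 0), (1, 1, 0, 1, 0, 1, 1),
    (1, 1, 0, 1, 1, 1, 1), (1, 0, 1, 0, 0, 1, 0), (1, 1, 1, 1, 1, 1, 1),
    (1, 1, 1, 1, 0, 1, 1),
]

def led24_sample(rng, noise=0.1):
    """One LED-24 example: 24 binary features, only the first 7 relevant."""
    digit = rng.randrange(10)
    # First 7 features: true segment values, each flipped with prob. `noise`.
    x = [s ^ int(rng.random() < noise) for s in SEGMENTS[digit]]
    # Remaining 17 features: irrelevant uniform random bits.
    x += [rng.randrange(2) for _ in range(17)]
    return x, digit

rng = random.Random(0)
train = [led24_sample(rng) for _ in range(200)]  # training size from the excerpt
test = [led24_sample(rng) for _ in range(500)]   # test size from the excerpt
```

Because the last 17 attributes carry no class information, the LED-24 variant is a common benchmark for feature-selection and feature-weighting methods such as the one discussed above.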


Maria Salamo and Elisabet Golobardes. Analysing Rough Sets weighting methods for Case-Based Reasoning Systems. Enginyeria i Arquitectura La Salle.

averaged over stratified ten-fold cross-validation runs, with their corresponding standard deviations. To study the performance we use a paired one-sided t-test on these runs, except for the LED dataset which was run using hold-out with a training set of 2000 instances and a test set of 4000 instances. 5.2 Experimental analysis of weighting methods Table 2 shows the experimental results for each


Ramon Sangüesa and Ulises Cortés. Possibilistic Conditional Dependency, Similarity and Information Measures: an application to causal network recovery. Departament de Llenguatges i Sistemes Informàtics, Technical University of Catalonia.

from the UCI Machine Learning Database Repository [16]: ALARM [2], and LED [11]. It has also been applied to data coming from real sensor measurements of a Wastewater Treatment Plant [22]. The LED dataset represents a faulty LED device where a certain button (a variable with eight values indicating which display to set on) should set an LED device ON. Some of the devices react incorrectly to this


