Mushroom Data Set
Below are papers that cite this data set, with context shown.
Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.
Stanley Robson de Medeiros Oliveira. Data Transformation For Privacy-Preserving Data Mining. Doctor of Philosophy thesis, University of Alberta. 2005.
(d_o = 37). C.2 The error produced on the Mushroom dataset (d_o = 23). C.3 The error produced on the Pumsb dataset (d_o = 74). C.4 The error produced on the Connect dataset (d_o = 43). C.5 The error produced on
Hyunsoo Kim and Se Hyun Park. Data Reduction in Support Vector Machines by a Kernelized Ionic Interaction Model. SDM. 2004.
time to obtain ten-fold cross validation accuracy. (c) The number of support vectors. All results were drawn against the percentage of selected data points on the UC Irvine 8124 × 22 Mushroom data set. The solid, dashed, and dash-dot lines were the results of IoI for the selection threshold values of 0.1, 0.01, and 0.0, respectively. The circle points were the results of KIB2. The star,
Xiaoyong Chai and Li Deng and Qiang Yang and Charles X. Ling. Test-Cost Sensitive Naive Bayes Classification. ICDM. 2004.
attributes: Ecoli 6, Breast 9, Heart 8, Thyroid 24, Australia 15, Cars 6, Voting 16, Mushroom 22. Table 2. Datasets used in the experiments. We ran a 3-fold cross validation on these data sets. In the experiments, no missing value is assigned in the training examples and for the testing examples, a certain
Daniel J. Lizotte and Omid Madani and Russell Greiner. Budgeted Learning of Naive-Bayes Classifiers. UAI. 2003.
from the UCI Machine Learning Repository [BM98]. These plots show cross validation error (20% of the dataset) on the mushroom and votes datasets of the different policies. Each point is an average of 50 trials where in each trial a random balanced partition of classes was made for training and validation.
Daniel Barbará and Yi Li and Julia Couto. COOLCAT: an entropy-based algorithm for categorical clustering. CIKM. 2002.
used for clustering, but can be loosely used for quality measuring. (Some congressmen "crossed" parties to vote.) There are 435 records in the set (267 Democrats and 168 Republicans). • Mushroom data set: The mushroom data set was also obtained from the UCI Repository. Each record describes the physical characteristics (e.g., odor, shape) of a single mushroom. There is a "poisonous," or
Stephen D. Bay and Michael J. Pazzani. Detecting Group Differences: Mining Contrast Sets. Data Min. Knowl. Discov, 5. 2001.
The Adult Census data contains information extracted from the 1994 Current Population Survey. There are variables such as age, working class, education, sex, hours worked, salary, etc. Mushroom: This data set describes mushrooms and their physical properties such as shape, odor, habitat, etc. Mushroom is not a true observational data set as the examples are not drawn from individual instances but rather
Jinyan Li and Guozhu Dong and Kotagiri Ramamohanarao and Limsoon Wong. DeEPs: A New Instance-based Discovery and Classification System. Proceedings of the Fourth European Conference on Principles and Practice of Knowledge Discovery in Databases. 2001.
This concise pattern representation technique greatly saves the cost of DeEPs. • Boundary EPs (typically small in number, e.g., 81 in the mushroom data set) are considered in DeEPs' classification. The selected EPs are "good" representatives of all EPs occurring in the considered instance. This selection also significantly reduces the number of EPs
Huan Liu and Hongjun Lu and Jie Yao. Toward Multidatabase Mining: Identifying Relevant Databases. IEEE Trans. Knowl. Data Eng, 13. 2001.
(Table 6). We ran C4.5rules and confirmed that classification rules for "benign" cases contain attributes 2, 3, 5, 7 and 9 from all the three data sets. Results of the Mushroom data: This data has 22 attributes. Relief ranks importance order of attributes as: 5, 20, 11, 8, 19, 4, 10, 22, 9, 12, 13, 21, 7, 3, 2, 15, 14, 18, 6, 17, 16, 1. We divide
Jinyan Li and Guozhu Dong and Kotagiri Ramamohanarao. Instance-Based Classification by Emerging Patterns. PKDD. 2000.
substantially reduced since itemsets T ∩ P_i are frequently contained in some other itemsets T ∩ P_j. Then, R_p can be viewed as a compressed D_p, and R_n a compressed D_n. We use the mushroom dataset to demonstrate this point. The original mushroom data has a volume of 3788 edible training instances, with 22 attributes per instance. The average number of items (or length) of the 3788 processed
Farhad Hussain and Huan Liu and Einoshin Suzuki and Hongjun Lu. Exception Rule Mining with a Relative Interestingness Measure. PAKDD. 2000.
Fig. 1. Rules in the data (figure labels: noise, strong exception). 5 Experiments. In this section we explain our interesting rules obtained from Japanese credit data and mushroom data. The credit data set has 10 attributes and 125 instances with a binary class. The two types of classes define when a credit is given to a particular person depending on other attribute values. Based on our approach we
Kiri Wagstaff and Claire Cardie. Clustering with Instance-level Constraints. ICML. 2000.
reflects its inherent class structure, i.e. to create one cluster per class. Talavera and Béjar (1999), for example, use this model to place instances from the mushroom UCI (Blake & Merz, 1998) data set into either a "poisonous" or an "edible" cluster. We focus here on the latter model and propose the use of constraints that express information about the underlying class structure, thereby enabling
Mark A. Hall and Lloyd A. Smith. Feature Selection for Machine Learning: Comparing a Correlation-Based Filter Approach to the Wrapper. FLAIRS Conference. 1999.
5% level according to a paired two-sided t-test. Similarly, Table 3 shows the results of feature selection for C4.5.
Table 2: Accuracy of naive Bayes with feature selection by CFS and the wrapper.
Dataset        CFS     Wrapper  All features
mushroom       98.53+  98.86+   94.75
vote           95.20+  95.24+   90.25
vote1          89.51+  88.95+   87.20
australian     85.90+  85.16+   78.21
lymph          83.92+  76.00−   82.12
primary-tumor  46.73   42.32−
Jinyan Li and Xiuzhen Zhang and Guozhu Dong and Kotagiri Ramamohanarao and Qun Sun. Efficient Mining of High Confidence Association Rules without Support Thresholds. PKDD. 1999.
we do not enumerate all top rules; we use borders to succinctly represent them instead. The significance of this representation is highlighted in the experimental results on the Mushroom dataset, where there exist a huge number of top rules. In addition to top rules, we also address the problems of mining zero-confidence rules and mining very high confidence rules with
Seth Bullock and Peter M. Todd. Made to Measure: Ecological Rationality in Structured Environments. Center for Adaptive Behavior and Cognition Max Planck Institute for Human Development. 1999.
and/or its absence to be an indicator of toxicity, and those for which the presence or absence of the cue indicates the opposite. One might expect that since, on average across the Mushroom Problem data set, the presence of each cue tends to indicate edibility, rules of the former kind might be more useful and hence better represented in the set of elite strategies. Fig. 15 shows that this is indeed
Venkatesh Ganti and Johannes Gehrke and Raghu Ramakrishnan. CACTUS - Clustering Categorical Data Using Summaries. KDD. 1999.
1 on which distance functions are not naturally defined. Recently, the problem of clustering categorical data started receiving interest [GKR98, GRS99]. As an example, consider the MUSHROOM dataset in the popular UCI Machine Learning repository [CBM98]. Each tuple in the dataset describes a sample of gilled mushrooms using twenty two categorical attributes. For instance, the cap color
Ismail Taha and Joydeep Ghosh. Symbolic Interpretation of Artificial Neural Networks. IEEE Trans. Knowl. Data Eng, 11. 1999.
that have been used as benchmarks for rule extraction approaches are the Monk, Mushroom and the DNA promoter data sets. The inputs of all three of these data sets are symbolic/discrete by nature. Since we want to test more general problems that may include continuous valued variables, Iris and Breast-Cancer were preferred
Mark A. Hall. Correlation-based Feature Selection for Machine Learning. Doctor of Philosophy thesis, Department of Computer Science, The University of Waikato, Hamilton, New Zealand. 1999.
The following is a brief description of the datasets. Mushroom (mu): This dataset contains records drawn from The Audubon Society Field Guide to North American Mushrooms [Lin81]. The task is to distinguish edible from poisonous mushrooms on the basis
Huan Liu and Hongjun Lu and Ling Feng and Farhad Hussain. Efficient Search of Reliable Exceptions. PAKDD. 1999.
, and indeed find out some interesting exception patterns. From the mushroom data set, we can achieve more reliable exceptions, compared to the results from . The remainder of the paper is organized as follows: section 2 gives a detailed description of the proposed approach.
Huan Liu and Rudy Setiono. Incremental Feature Selection. Appl. Intell, 9. 1998.
19 35 683 307 376; Parity5+5 2 10 1024 100 100; Vote 2 16 435 300 135; Mushroom 2 22 8125 7125 1000; Led17 10 24 20,000 20,000 -; Krvskp 2 36 3196 3196 -; Parity Mix 2 20 2 20 10,000 -. • Mushroom: The dataset has a total of 8124 patterns, of which 1000 patterns are randomly selected for testing; the rest are used for training. The data has 22 discrete features. Each feature can have 2 to 10 values.
Robert M French. Pseudo-recurrent connectionist networks: An approach to the "sensitivity-stability" dilemma. Connection Science. 1997.
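The random hold-out described in the excerpt above (8124 patterns, 1000 of them drawn at random for testing, the rest used for training) could be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `holdout_split`, the fixed seed, and the use of index lists are all hypothetical.

```python
import random


def holdout_split(n_patterns=8124, n_test=1000, seed=0):
    """Randomly pick n_test pattern indices for testing; the rest train."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    test_idx = sorted(rng.sample(range(n_patterns), n_test))
    test_set = set(test_idx)
    train_idx = [i for i in range(n_patterns) if i not in test_set]
    return train_idx, test_idx


train_idx, test_idx = holdout_split()
print(len(train_idx), len(test_idx))  # 7124 1000
```

Sampling indices without replacement keeps the two sets disjoint, so every pattern lands in exactly one of them.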
A mushroom database (Murphy & Aha, 1992) in which mushrooms were classified as
either edible or poisonous on the basis of 22 technical attributes
Nicholas Howe and Claire Cardie. Examining Locally Varying Weights for Nearest Neighbor Algorithms. ICCBR. 1997.
has 307 designated training instances, and 376 designated test instances. from the benchmark with continuous features were discarded. Also, NetTalk was not used because of its similarity to the NLP datasets described below, and Mushroom was found to be too easy. We also include an artificial task constructed specifically to exhibit feature importance that varies locally at the class level, and several
Huan Liu and Rudy Setiono. A Probabilistic Approach to Feature Selection - A Filter Solution. ICML. 1996.
simplifies the comparison of this work with some published work. These datasets except Mushroom were used in [John et al., 1994] in which comparisons with different methods were described. Nevertheless, the experiments here can alone demonstrate the effectiveness of LVF
Kamal Ali and Michael J. Pazzani. Error Reduction through Learning Multiple Descriptions. Machine Learning, 24. 1996.
class labels. 6. The r² values between Er and OE e without the significant error reduction restriction are: 50.7% (Uniform), 33.7% (Bayes), 6.8% (Distribution) and 31.6% (Likelihood). The Mushroom data set causes a problem for the Distribution combination strategy because both the ensemble error and multiple models error are close to 0 so the ratio cannot be reliably estimated. The r² for
Guszti Bartfai. VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui. Department of Computer Science PO Box 600. 1996.
A brief survey of related neural network models is also provided. Keywords: Modular networks, Adaptive Resonance Theory, Hierarchical ART, Self-organization, Hierarchical clustering, Mushroom dataset. Publishing Information: This paper will appear in a Special Issue of Connection Science on "Combining Artificial Neural Networks". Author Information: Guszti Bartfai is a lecturer at the
Geoffrey I. Webb. OPUS: An Efficient Admissible Algorithm for Unordered Search. J. Artif. Intell. Res. (JAIR), 3. 1995.
of the search space below a poor choice of node can do much to minimize the damage done by that poor choice, even when there is no backtracking as is the case for depth-first search. For five data sets (House Votes 84, Lymphography, Mushroom, Primary Tumor and Soybean Large), disabling optimistic pruning has little effect under best-first search. Disabling optimistic pruning always has large effect
Chotirat Ann Ratanamahatana and Dimitrios Gunopulos. Scaling up the Naive Bayesian Classifier: Using Decision Trees for Feature Selection. Computer Science Department University of California.
554 instances, 6 attributes, 2 classes. Attributes selected by SBC = 4. [Plot: Mushroom, accuracy (%) versus training data (%) for NBC, SBC, and C4.5.] Figure 6. Mushroom dataset. 8,124 instances, 22 attributes, 2 classes. Attributes selected by SBC = 6. [Plot: Pima Indians Diabetes, accuracy (%) versus training data (%) for NBC, SBC, and C4.5.] Figure 7.
Eric P. Kasten and Philip K. McKinley. MESO: Perceptual Memory to Support Online Learning in Adaptive Software. Proceedings of the Third International Conference on Development and Learning (ICDL).
contains continuous features, measuring features such as elevation or slope, and binary values indicating whether a pattern is a particular soil type. However, the Mushroom data set consists entirely of nominal values encoded as alpha characters converted to their ASCII equivalent for processing by MESO. In contrast, the ATT Faces data set comprises image pixel values of human
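The conversion mentioned in the excerpt above, nominal values stored as letters and turned into their ASCII equivalents, can be illustrated with a short sketch. It assumes each attribute value is a single character; the helper name `encode_record` and the sample record are hypothetical and are not taken from MESO.

```python
def encode_record(record):
    """Map each single-letter nominal value to its ASCII code."""
    return [ord(value) for value in record]


# Hypothetical record made of one-letter attribute codes.
sample = ["x", "s", "n", "t", "p"]
print(encode_record(sample))  # [120, 115, 110, 116, 112]
```

The resulting integers are labels only: their numeric order carries no meaning for the underlying nominal values.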
Stefan Rüping. A Simple Method For Estimating Conditional Probabilities For SVMs. CS Department, AI Unit Dortmund University.
from the UCI Repository (covtype, diabetes, digits, digits, ionosphere, liver, mushroom, promoters) and 4 other real-world data sets: a business cycle analysis problem (business), an analysis of a direct mailing application (directmailing), a data set from a life insurance company (insurance) and intensive care patient
Josep Roure Alcobé. Incremental Hill-Climbing Search Applied to Bayesian Network Structure Learning. Escola Universitària Politècnica de Mataró.
approaches save a significant amount of CPU clock ticks while the quality of the final Bayesian networks is very close to the ones obtained with the batch approaches. See also that the Mushroom dataset is generally the most difficult to learn incrementally in the sense that incremental algorithms obtain the lowest time gain. This may be due to the fact that there are many arcs that bring similar
Włodzisław Duch and Rafał Adamczak and Krzysztof Grąbczewski and Grzegorz Żal. A hybrid method for extraction of logical rules from data. Department of Computer Methods, Nicholas Copernicus University.
be replaced by "population=clustered". This is the simplest systematic logical description (some of these rules have probably been also found by the RULEX and TREX algorithms) of the mushroom dataset that we know of and therefore should be used as benchmark for other rule extraction methods. We have also solved the three monk problems. For the Monks 1 problem one additional neuron handling
Jinyan Li and Kotagiri Ramamohanarao and Guozhu Dong. The Space of Jumping Emerging Patterns and Its Incremental Maintenance Algorithms. ICML 2000. Department of Computer Science and Software Engineering, The University of Melbourne, Parkville.
is as yet unsolved. A naive maintenance method is to take a border difference operation to discover the border of the JEP space with respect to D_p and (D_n − Δ_n).
Table 1. Properties of data sets.
DATA SETS    #INSTANCES         #ATTRI  #ITEMS
MUSHROOM     4208(+), 3916(-)   22      125
PIMA         268(+), 500(-)     8       17
TIC-TAC-TOE  626(+), 332(-)     9       27
NURSERY      4320(+), 8640(-)   8       27
5. Experimental Results. We choose four
Włodzisław Duch and Rafał Adamczak and Krzysztof Grąbczewski. Extraction of crisp logical rules using constrained backpropagation networks. Department of Computer Methods, Nicholas Copernicus University.
and four antecedents. We have also tried to derive rules using only 10% of cases for training, achieving identical results. This is the simplest systematic logical description of the mushroom dataset that we know of, although some of these rules have probably been also found by RULEX and TREX algorithms. Analysis of the graph representing possible contributions of the relevant attributes to
Włodzisław Duch and Rudy Setiono and Jacek M. Zurada. Computational intelligence methods for rule-based data understanding.
activation of the first neuron. Adding a second neuron and training it on the remaining cases generates two additional rules, R3 handling 40 cases and R4 handling only 8 cases. For the mushroom dataset, the SSV tree has found a 100% accurate solution that can be described as four logical rules using only five attributes. The first two rules are identical to the rules given above, but the remaining two
C. Titus Brown and Harry W. Bullen and Sean P. Kelly and Robert K. Xiao and Steven G. Satterfield and John G. Hagedorn and Judith E. Devaney. Visualization and Data Mining in an 3D Immersive Environment: Summer Project 2003.
whole museum at distance. Figure 4.26: Glass dataset: whole museum from above. 4.15 Mushroom. The mushroom data set was analysed by Sean Kelly. The mushroom dataset reflects all the problems found in the tic-tac-toe data, but on a larger scale. Once
Daniel J. Lizotte. Budgeted Learning of Naive Bayes Classifiers.
from the UCI Machine Learning Repository [BM98]. These plots show averaged validation error of the policies on a holdout set (20% of the dataset) on the mushroom, nursery, and votes datasets. Each point is an average of 50 trials where in each trial a random balanced partition of classes was made for training and validation. The five-fold
David R. Musicant. DATA MINING VIA MATHEMATICAL PROGRAMMING AND MACHINE LEARNING. Doctor of Philosophy (Computer Sciences) UNIVERSITY.
of 600 points with 6 features, where each class contained 300 points. The mushroom dataset is a two-class dataset which contains a number of categorical attributes. We transformed each categorical attribute into a series of binary attributes, one attribute for each distinct value. For
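The transformation described in the excerpt above, one binary attribute per distinct value of each categorical attribute, is commonly called one-hot encoding. A minimal sketch follows; it is not the thesis author's code, and the function name `one_hot_encode`, the generated column names, and the toy rows are assumptions made for illustration.

```python
def one_hot_encode(rows):
    """Turn each categorical column into one binary column per distinct value."""
    n_cols = len(rows[0])
    # Sorted distinct values per column give a stable output column order.
    values = [sorted({row[c] for row in rows}) for c in range(n_cols)]
    names = [f"col{c}={v}" for c in range(n_cols) for v in values[c]]
    encoded = [
        [1 if row[c] == v else 0 for c in range(n_cols) for v in values[c]]
        for row in rows
    ]
    return names, encoded


# Toy rows with two categorical attributes (hypothetical letter codes).
rows = [["x", "n"], ["b", "n"], ["x", "w"]]
names, encoded = one_hot_encode(rows)
print(names)    # ['col0=b', 'col0=x', 'col1=n', 'col1=w']
print(encoded)  # [[0, 1, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1]]
```

Each encoded row contains exactly one 1 per original attribute, so the binary representation grows with the total number of distinct values rather than with the number of attributes.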
Sherrie L. W and Zijian Zheng. A BENCHMARK FOR CLASSIFIER LEARNING. Basser Department of Computer Science The University of Sydney.
e.g. Promoter (106) and Lymphography (148) • Medium (between 210 and 3170), e.g. Diabetes (768) and Thyroid (3163) • Large (more than 3170), e.g. NetTalk (Phoneme) (5438) and Mushroom (8124) 8. Dataset density (3 values): Usually a classifier learning algorithm can learn a more accurate theory from a larger number of training examples than from fewer examples. However, because different domains
Anthony Robins and Marcus Frean. Learning and generalisation in a stable network. Computer Science, The University of Otago.
[Robins, 1996]; a classification task using the Mushroom data set [French, 1997]; and an alphanumeric character set using a Hopfield type network [Robins and McCallum, 1997]. Footnote 2: The Iris and Mushroom data sets are taken from [Murphy and Aha, 1994].
Rudy Setiono. Extracting M-of-N Rules from Trained Neural Networks. School of Computing National University of Singapore.
or (jacket-color is not blue and body-shape is not octagon). Each input attribute value is coded as -1 or 1. Hence, 17 input units are required. The total number of patterns in each of the three datasets is 432. 3. The mushroom classification dataset. The dataset consists of 8124 samples, each of which is described by 22 nominal attributes that are the characteristics of species of mushroom.
José L. Balcázar. Rules with Bounded Negations and the Coverage Inference Scheme. Dept. LSI, UPC.
being able to generate exactly the same set of rules that would be found at the chosen support and confidence. We describe next such a notion of cover, due to [CS], who applied it to a synthetic dataset and to the Mushroom database. We describe the results of employing this cover strategy on the databases Car (which is close to synthetic) and Contraceptive Method Choice, with real-world data coming
Mehmet Dalkilic and Arijit Sengupta. A Logic-theoretic classifier called Circle. School of Informatics Center for Genomics and BioInformatics Indiana University.
like contact-lenses, and weather, as well as large data sets like monks, mushroom and Zoo. As an example of the performance improvement, while full Circle took over an hour to terminate using the Zoo data set, the Randomized Circle with 8 attributes per
Daniel J. Lizotte and Omid Madani and Russell Greiner. Budgeted Learning, Part II: The Naïve-Bayes Case. Department of Computing Science University of Alberta.
from the UCI Machine Learning Repository [BM98]. These plots show cross validation error (20% of the dataset) on the mushroom and votes datasets of the different policies. Each point is an average of 50 trials where in each trial a random balanced partition of classes was made for training and validation.
Ron Kohavi and Barry G. Becker and Dan Sommerfield. Improving Simple Bayes. Data Mining and Visualization Group Silicon Graphics, Inc.
that had significant differences. We can see that frequency counts (No-matches-0) performs generally worse than No-matches-PC, except on the cars and mushroom datasets where it performs significantly better. Laplace-m seems to take the best of both worlds. It tracks No-matches-PC on most datasets, except cars and mushroom where it tracks No-matches-0 well.
Włodzisław Duch and Rafał Adamczak and Krzysztof Grąbczewski and Norbert Jankowski. Control and Cybernetics. Department of Computer Methods, Nicholas Copernicus University.
classifiers used in such problems, but they have a reputation for being opaque black boxes. Several neural methods have been compared experimentally on the mushroom and the 3 Monk problems benchmark datasets (Andrews et al. 1995), and recently comparison with some machine learning methods has been given (Duch et al. 2000). There is no reason why a simple classification model based on logical rules
Huan Liu. A Family of Efficient Rule Generators. Department of Information Systems and Computer Science National University of Singapore.
are chosen for further experiments from the machine learning databases at University of California, Irvine. They are: • Mushroom: The training and testing datasets contain 7124 and 1000 instances respectively. The 1000 instances in the testing set are randomly selected. The rest are used for training. The data has 22 discrete attributes. Each attribute can
Shi Zhong and Weiyu Tang and Taghi M. Khoshgoftaar. Boosted Noise Filters for Identifying Mislabeled Data. Department of Computer Science and Engineering Florida Atlantic University.
[Plots: precision versus recall for the BF and BBF noise filters, one panel per noise level.] Figure 8. Noise detection results on the mushroom dataset with six different noise levels: (a) 5%; (b) 10%; (c) 15%; (d) 25%; (e) 35%; and (f) 40%.