Center for Machine Learning and Intelligent Systems

Heart Disease Data Set

Below are papers that cite this data set, with context shown. Papers were automatically harvested and associated with this data set, in collaboration with


Zhi-Hua Zhou and Yuan Jiang. NeC4.5: Neural Ensemble Based C4.5. IEEE Trans. Knowl. Data Eng, 16. 2004.

are tabulated in Table III. Table III shows that the generalization ability of NeC4.5 with µ = 100% is better than that of C4.5. In detail, pairwise two-tailed t-tests indicate that there are ten data sets (balance, breast, cleveland, credit, heart, iris, vehicle, waveform21, waveform40, and wine) where NeC4.5 with µ = 100% is significantly more accurate than C4.5, while there is no significant
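A minimal sketch of a paired two-tailed t-test of the kind this excerpt mentions. The accuracy values are invented for illustration and are not taken from the paper, the function name `paired_t_statistic` is hypothetical, and 2.262 is the standard two-tailed critical value of Student's t for 9 degrees of freedom at the 0.05 level.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical per-fold accuracies of two learners on the same 10 folds.
acc_a = [0.80, 0.82, 0.79, 0.85, 0.81, 0.83, 0.80, 0.84, 0.82, 0.81]
acc_b = [0.78, 0.80, 0.79, 0.82, 0.80, 0.80, 0.79, 0.81, 0.80, 0.80]

t, df = paired_t_statistic(acc_a, acc_b)
# Two-tailed test at the 0.05 level with df = 9: critical value 2.262.
significant = abs(t) > 2.262
print(f"t = {t:.2f}, df = {df}, significant = {significant}")
```

The test is paired because both learners are scored on the same folds, so only the per-fold differences enter the statistic.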

Remco R. Bouckaert and Eibe Frank. Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms. PAKDD. 2004.

perform differently in 19 out of 27 cases. For some rows, the test consistently indicates no difference between any two of the three schemes, in particular for the iris and Hungarian heart disease datasets. However, most rows contain at least one cell where the outcomes of the test are not consistent. The row labeled "consistent" at the bottom of the table lists the number of datasets for which all

Xiaoyong Chai and Li Deng and Qiang Yang and Charles X. Ling. Test-Cost Sensitive Naive Bayes Classification. ICDM. 2004.

Dataset     Attributes
Ecoli       6
Breast      9
Heart       8
Thyroid     24
Australia   15
Cars        6
Voting      16
Mushroom    22
Table 2. Datasets used in the experiments

We ran a 3-fold cross validation on these data sets. In the

Gavin Brown. Diversity in Neural Network Ensembles. The University of Birmingham. 2004.

from the UCI repository (699 patterns), and the Heart disease dataset from Statlog (270 patterns). An ensemble consisting of two networks, each with five hidden nodes, was trained using NC. We use 5-fold cross-validation, and 40 trials from uniform random weights in

Kaizhu Huang and Haiqin Yang and Irwin King and Michael R. Lyu and Laiwan Chan. Biased Minimax Probability Machine for Medical Diagnosis. AMAI. 2004.

Then we apply it to two real-world medical diagnosis datasets, the breast-cancer dataset and the heart disease dataset. 4.1. A Synthetic Dataset A two-variable synthetic dataset is generated by the two-dimensional gamma distribution. Two classes of data are

Jeroen Eggermont and Joost N. Kok and Walter A. Kosters. Genetic Programming for data classification: partitioning the search space. SAC. 2004.

using the different sets of internal nodes. The same behavior is seen for k = 4 and k = 5. In all cases the discovered decision trees differ syntactically per fold and random seed.

The Heart Disease Data Set. The results on the Heart disease data set are displayed in Table 6. All our gp algorithms show a large improvement in misclassification performance over our simple gp algorithm. In all but two cases

David Page and Soumya Ray. Skewing: An Efficient Alternative to Lookahead for Decision Tree Induction. IJCAI. 2003.

[Figure 8: Five-Variable Hard Targets. Accuracy (%) versus sample size for ID3 with skewing and ID3 without skewing.]
[Figure 9: Six-Variable Hard Targets. Accuracy (%) versus sample size for ID3 with skewing and ID3 without skewing.]

Data Set   Standard ID3   ID3 with Skewing
Heart      71.9           74.5
Voting     94.0           94.2
Voting-2   87.4           88.6
Contra     60.4           61.5
Monks-1    92.6           100.0
Monks-2    86.5           89.3
Monks-3    89.8           91.7
Table 4: Accuracies of ID3 and ID3 with

Jinyan Li and Limsoon Wong. Using Rules to Analyse Bio-medical Data: A Comparison between C4.5 and PCL. WAIM. 2003.

For a simple comparison, we give the following statistics: -- Comparing PCL, C4.5, Bagging and Boosting, PCL won the best accuracy on 5 data sets (i.e., breast-w, cleve, heart, HIV, and promoter); Bagging won on 1 data set (hypothyroid); and Boosting won the best accuracy on 4 data sets (i.e., hepatitis, lymph, sick and splice). -- Comparing

Yuan Jiang and Zhi-Hua Zhou and Zhaoqian Chen. Rule Learning based on Neural Network Ensemble. Proceedings of the International Joint Conference on Neural Networks. 2002.

greatly offsets its weakness in the conciseness of the generated rule sets. A typical rule set generated by the proposed algorithm is shown in Table 3, which is obtained from one run on the data set Heart disease. IV. CONCLUSIONS In this paper, we propose a novel rule learning algorithm that employs neural network ensemble as front-end process. The algorithm trains a neural network ensemble at

Baback Moghaddam and Gregory Shakhnarovich. Boosted Dyadic Kernel Discriminants. NIPS. 2002.

and k-Nearest Neighbor (k-NN). We chose sets large enough for reasonable training/validation/test partitioning, and that represent binary (or easily converted to binary) classification problems.

Dataset      N     d    k-NN          SVM           #SV       Hypercuts     #k.ev.
Heart        90    13   .196 ± .042   .202 ± .038   62 ± 10   .202 ± .030   50 ± 12
Ionosphere   120   34   .168 ± .024   .064 ± .018   73 ± 7    .083 ± .022   63 ± 7
WBC          200   9    .034 ± .011   .032 ± .008   50 ± 26

Thomas Melluish and Craig Saunders and Ilia Nouretdinov and Volodya Vovk and Carol S. Saunders and I. Nouretdinov V.. The typicalness framework: a comparison with the Bayesian approach. Department of Computer Science. 2001.

Recognition Experiments. In this section we compare the Bayesian-Transduction (BT) algorithm and the kernel perceptron when used within the typicalness framework. We ran experiments on two toy datasets, and the well-known heart benchmark data set. For the artificial data, one dataset was created using a uniform prior over w such that ||w|| = 1 (this is the correct prior for Bayesian

Robert Burbidge and Matthew Trotter and Bernard F. Buxton and Sean B. Holden. STAR - Sparsity through Automated Rejection. IWANN (1). 2001.

has 351 examples in 33 dimensions and is slightly noisy. The heart data set has 270 examples in 13 dimensions. The Pima Indians diabetes data set has 768 examples in eight dimensions. These last two data sets have a high degree of overlap which leads to a dense model for

Peter L. Hammer and Alexander Kogan and Bruno Simeone and Sandor Szedmák. RUTCOR Research Report. Rutgers Center for Operations Research, Rutgers University. 2001.

are obtained by lexicographically
[Figure 1: Cost of Classification Inaccuracy for # = 0. Mean cost on the Credit, Breast Cancer, Boston Housing, Diabetes, Heart Disease, Oil, and Voting datasets for LAD, StrongSpanned, StrongPrime, and Prime.]
[Figure 2: Cost of Classification Inaccuracy for # = 0.5. Mean cost on the same datasets.]

Kristin P. Bennett and Ayhan Demiriz and John Shawe-Taylor. A Column Generation Algorithm For Boosting. ICML. 2000.

where the base learner solves (6) exactly, then to examine LPBoost in a more realistic environment. 5.1 Boosting Decision Tree Stumps We used decision tree stumps as a base learner on six UCI datasets: Cancer (9,699), Heart (13,297), Sonar (60,208), Ionosphere (34,351), Diagnostic (30,569), and Musk (166,476). The number of features and number of points in each dataset are shown in parentheses

Thomas G. Dietterich. An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization. Machine Learning, 40. 2000.

attribute values rather than as missing values. In auto, the class variable was the make of the automobile. In the breast cancer domains, all features were treated as continuous. The heart disease data sets were recoded to use discrete values where appropriate. All attributes were treated as continuous in the kingrook-vs-king (krk) data set. In lymphography, the lymph-nodes-dimin, lymph-nodes-enlar,

Lorne Mason and Peter L. Bartlett and Jonathan Baxter. Improved Generalization Through Explicit Optimization of Margins. Machine Learning, 38. 2000.

used in the experiments.

Data Set                    Training   Test   Attributes
Cleveland Heart Disease     103        100    14
Credit Application          100        295    15
German                      300        350    24
Glass                       70         72     10
Ionosphere                  101        125    34
King Rook vs King Pawn      100        1000   36
Pima Indians

Endre Boros and Peter Hammer and Toshihide Ibaraki and Alexander Kogan and Eddy Mayoraz and Ilya B. Muchnik. An Implementation of Logical Analysis of Data. IEEE Trans. Knowl. Data Eng, 12. 2000.

rates ranging from 71.4% to 74.4%. Further, [19] reports a 76% correct prediction rate using 75% of the data for training.

Heart Disease (Cleveland). The Cleveland Clinic Foundation heart disease dataset, contributed to the repository by R. Detrano, contains 303 observations, 165 of which describe healthy people and 138 sick ones; 7 observations are incomplete, and 2 of the observations of healthy

Petri Kontkanen and Petri Myllym and Tomi Silander and Henry Tirri and Peter Gr. On predictive distributions and Bayesian networks. Department of Computer Science, Stanford University. 2000.

Dataset              Data vectors   Attributes   Classes   CV folds
Heart Disease (HD)   270            14           2         9
Iris (IR)            150            5            3         5
Lymphography (LY)    148            19           4         5
Australian (AU)      690            15           2         10
Breast Cancer (BC)   286            10           2         11
Diabetes (DB)        768            9

Rudy Setiono and Wee Kheng Leow. FERNN: An Algorithm for Fast Extraction of Rules from Neural Networks. Appl. Intell, 12. 2000.

decision nodes may also improve the accuracy of the tree because samples from real world problems may be better separated by oblique hyperplanes. This is the case with the heart disease data set (HeartD in Table 2) where significant improvement is achieved by the neural network methods over C4.5. There is no significant difference in the accuracy and size of the decision trees generated by

Iñaki Inza and Pedro Larrañaga and Basilio Sierra and Ramon Etxeberria and Jose Antonio Lozano and José Manuel Peña. Representing the behaviour of supervised classification learning algorithms by Bayesian networks. Pattern Recognition Letters, 20. 1999.

treatment is done for unknown values, with each algorithm exploiting its own characteristics. The PEBLS and HOODG algorithms are not able to handle unknown values: thus, they are only used in the four datasets without unknown values (diabetes, heart, liver and lymphography). For each database and algorithm, a classification model is induced using the specified training set: when run with fixed default

Yoav Freund and Lorne Mason. The Alternating Decision Tree Learning Algorithm. ICML. 1999.

representation. To demonstrate our interpretation, we consider the alternating tree presented in Figure 4. This tree is the result of running our learning algorithm for six iterations on the cleve data set from Irvine. This is a data set of heart disease diagnostics for which the goal is to discriminate between sick and healthy people. Footnote 3: In our mapping positive classification corresponds to healthy and

Jinyan Li and Xiuzhen Zhang and Guozhu Dong and Kotagiri Ramamohanarao and Qun Sun. Efficient Mining of High Confidence Association Rules without Support Thresholds. PKDD. 1999.

rules and some very high (say 90%) confidence rules using approaches similar to mining top rules. Experimental results using the Mushroom, the Cleveland heart disease, and the Boston housing datasets are reported to evaluate the efficiency of the proposed approach. 1 Introduction Association rules [1] were proposed to capture significant dependence between items in transactional datasets. For

Chun-Nan Hsu and Hilmar Schuschel and Ya-Ting Yang. The ANNIGMA-Wrapper Approach to Neural Nets Feature Selection for Knowledge Discovery and Data Mining. Institute of Information Science. 1999.

starting with 0. Unknown values are set to 0.5.

Heart Disease. This dataset concerning heart disease diagnosis contains 4 sub-databases collected from 4 locations. Each database has the same instance format. We used two of them in our experiment: one from Cleveland

Kai Ming Ting and Ian H. Witten. Issues in Stacked Generalization. J. Artif. Intell. Res. (JAIR), 10. 1999.

situation. The performance variation among the member models in bagging is rather small because they are derived from the same learning algorithm using bootstrap samples. Section 3.3 shows that a small performance variation among member models

Footnote 4: The heart dataset used by Breiman (1996b; 1996c) is omitted because it was very much modified from the original one.

Jan C. Bioch and D. Meer and Rob Potharst. Bivariate Decision Trees. PKDD. 1997.

with the standard error. From these tables we can conclude

name             cases   attr   classes
glass            214     9      6
diabetes(pima)   768     8      2
breast cancer    699     9      2
heart            270     13     2
wave             300     21     3
Table 1: Summary of the Datasets

method   glass      diabetes   cancer     heart      wave
BIT1     65.3±1.1   74.3±0.7   95.4±0.3   78.5±0.3   76.1±1.3
         6.2±2.1    5.2±2.5    2.8±0.2    4.1±0.5    5.0±1.6
BIT2

D. Randall Wilson and Roel Martinez. Machine Learning: Proceedings of the Fourteenth International Conference, Morgan. In Fisher. 1997.

seem to be especially well suited for these reduction techniques. For example, RT3 required less than 2% storage for the Heart Swiss dataset, yet it achieved even higher generalization accuracy than the kNN algorithm. On the other hand, some datasets were not so appropriate. On the Vowel dataset, for example, RT3 required over 45% of the

Pedro Domingos. Context-Sensitive Feature Selection for Lazy Learners. Artif. Intell. Rev, 11. 1997.

and Robert Detrano, of the V.A. Medical Center, Long Beach and Cleveland Clinic Foundation, for supplying the heart disease dataset. Please see the documentation in the UCI Repository for detailed information on all datasets. Appendix A This appendix describes how, for each one of P prototypes, the relevant features are chosen

Floriana Esposito and Donato Malerba and Giovanni Semeraro. A Comparative Analysis of Methods for Pruning Decision Trees. IEEE Trans. Pattern Anal. Mach. Intell, 19. 1997.

available in the UCI Machine Learning Repository [21], and some of them have even been used to compare different pruning methods [25], [20], [3]. The database heart is actually the union of four data sets on heart diseases, with the same number of attributes but collected in four distinct places (Hungary, Switzerland, Cleveland, and Long Beach). Of the 76 original attributes, only 14 have been

Rudy Setiono and Huan Liu. NeuroLinear: From neural networks to oblique decision rules. Neurocomputing, 17. 1997.

only 2 attributes and achieve higher accuracy rate on the testing data. The separator generated from the pruned network is depicted in Fig. 5. B. Detailed analysis 2: Cleveland Heart Disease Dataset. The dataset consists of 303 patterns. We discarded patterns with missing attribute values and used only the remaining 297 patterns. The patterns were divided randomly into training and testing set.
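A minimal sketch of the preprocessing this excerpt describes: discard incomplete patterns, then split the rest at random. The toy rows, the "?" missing-value marker, the 2/3 training fraction, and the helper names `drop_incomplete` and `random_split` are all assumptions for illustration; the excerpt does not state the split proportions.

```python
import random

def drop_incomplete(rows, missing="?"):
    """Keep only rows in which no attribute equals the missing marker."""
    return [row for row in rows if missing not in row]

def random_split(rows, train_fraction, seed=0):
    """Shuffle a copy of rows and cut it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Toy stand-in for the data: 303 rows, 6 of which contain a missing value.
rows = [[str(i), "1"] for i in range(297)] + [[str(i), "?"] for i in range(6)]

complete = drop_incomplete(rows)  # 303 patterns minus the 6 incomplete ones
train, test = random_split(complete, train_fraction=2 / 3)
print(len(complete), len(train), len(test))
```

Fixing the seed makes the random partition repeatable, which helps when comparing methods on the same split.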

Prototype Selection for Composite Nearest Neighbor Classifiers. Department of Computer Science University of Massachusetts. 1997.

resulted in a fairly small number of prototypes that can achieve a very good level of classification accuracy. For example, Aha's IB3 algorithm achieves 79% accuracy on the Cleveland heart disease data set [Murphy and Aha, 1994] while retaining only approximately 4% of the 303 instances [Aha, 1990]. Results such as this hint that a small number of prototypes will suffice on some data. In Table

Igor Kononenko and Edvard Simec and Marko Robnik-Sikonja. Overcoming the Myopia of Inductive Learning Algorithms with RELIEFF. Appl. Intell, 7. 1997.

obtained from the StatLog database [18]: diagnosis of diabetes (DIAB) and diagnosis of heart diseases (HEART). For the DIAB data set, Ragavan & Rendell [27] report 78.8% classification accuracy with their LFC algorithm. They also report poor performance of

Table 8: Basic description of the medical data sets
domain

Kamal Ali and Michael J. Pazzani. Error Reduction through Learning Multiple Descriptions. Machine Learning, 24. 1996.

to test learned models on noise-free examples (including noisy variants of the KRK and LED domains) but for the natural domains we tested on possibly noisy examples. The large variant of the Soybean data set was used and the 5-class variant of the Heart data set was used. 5.1. Does using multiple rule sets lead to lower error? In this section we present results of an experiment designed to answer the

Ron Kohavi. The Power of Decision Tables. ECML. 1995.

we expected IDTM to fail miserably, given that the chances of matching continuous features in the table are slim without preprocessing the data. Although C4.5 clearly outperforms IDTM on most datasets, IDTM outperforms C4.5 on the heart dataset and achieves similar performance on nine out of the 22 datasets (australian, cleve, crx, german, hepatitis, horse-colic, iris, lymphography, and

Ron Kohavi and Dan Sommerfield. Feature Subset Selection Using the Wrapper Method: Overfitting and Dynamic Search Space Topology. KDD. 1995.

in error. The execution time on a Sparc20 for feature subset selection using ID3 ranged from under five minutes for breast-cancer (Wisconsin), cleve, heart and vote to about an hour for most datasets. DNA took 29 hours, followed by chess at four hours. The DNA run took so long because of ever increasing estimates that did not really improve the test-set accuracy. 7 Conclusions We reviewed the

Peter D. Turney. Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm. CoRR, csAI/9503102. 1995.

together, since their test costs have different scales (see Appendix A). The test costs in the Heart Disease dataset, for example, are substantially larger than the test costs in the other four datasets. Third, it is difficult to combine average costs for different values of k in a fair manner, since more weight

Gabor Melli. A Lazy Model-Based Approach to On-Line Classification. University of British Columbia. 1989.

The five selected datasets were: echocardiogram, hayes-roth, heart, horse-colic, and iris. These datasets (marked in Table 7.1 with a * symbol beside their name) contain a sampling of attribute types and domains. For

Elena Smirnova and Ida G. Sprinkhuizen-Kuyper and I. Nalbantis and b. ERIM and Universiteit Rotterdam. Unanimous Voting using Support Vector Machines. IKAT, Universiteit Maastricht.

the hypothesis space H contains the target hyperplane, the hyperplane is consistent with the training data; i.e., it belongs to the version space [7, 11]. Thus, the unanimous-voting classification

Data Set        Parameters            Cvssvm   Avssvm   Asvm    I
Heart-Statlog   P, E=2.0, C=1730      56.3%    100%     73.0%   0.42
Heart-Statlog   RBF, G=0.2, C=2182    40.7%    100%     73.7%   0.24
Hepatitis       P, E=1.4, C=11.7      80.0%    100%     80.0%   0.72

Krista Lagus and Esa Alhoniemi and Jeremias Seppa and Antti Honkela and Arno Wagner. INDEPENDENT VARIABLE GROUP ANALYSIS IN LEARNING COMPACT REPRESENTATIONS FOR DATA. Neural Networks Research Centre, Helsinki University of Technology.

models optimized carefully using the IVGA implementation. The model search of our IVGA implementation was able to discover the best grouping, i.e., the one with the smallest cost. 3.2. Arrhythmia data set The identification of different types of heart problems, namely cardiac arrhythmias, is carried out based on electrocardiography measurings from a large number of electrodes. We used a freely

Chiranjib Bhattacharyya and Pannagadatta K. S and Alexander J. Smola. A Second order Cone Programming Formulation for Classifying Missing Data. Department of Computer Science and Automation Indian Institute of Science.

of the UCI database. From left to right: Pima, Ionosphere, and Heart dataset. Top: small fraction of data with missing variables (50%), Bottom: large number of observations with missing variables (90%) The experimental results are summarized by the graphs(1). The robust

Ayhan Demiriz and Kristin P. Bennett. Chapter 1: Optimization Approaches to Semi-Supervised Learning. Department of Decision Sciences and Engineering Systems & Department of Mathematical Sciences, Rensselaer Polytechnic Institute.

an action reduces the overall error. Like S³VM-IQP, SVM-Light alternates the labels to avoid local minima. The primary difference

Table 1.3 Average Error Results for Transductive and Inductive Methods

Data Set     SVM-QP   SVM-Light   S³VM-IQP
Heart        0.16     0.163       0.1966
Housing      0.1804   0.1608      0.1647
Ionosphere   0.0857   0.1572      0.0943
Sonar        0.1762   0.2524      0.1572

Adil M. Bagirov and John Yearwood. A new nonsmooth optimization algorithm for clustering. Centre for Informatics and Applied Optimization, School of Information Technology and Mathematical Sciences, University of Ballarat.

small clusters. We can also see that for c ∈ [1.5, 4] the number of instances and CPU time reduce significantly. The results presented in Table 2 show that appropriate values for the heart disease data set are c ∈ [0, 1.5], because further decrease in c leads to changes in the cluster structure of the data set. We can again see that these values of c allow significant reduction in the number of

Adil M. Bagirov and Alex Rubinov and A. N. Soukhojak and John Yearwood. Unsupervised and supervised data classification via nonsmooth and global optimization. School of Information Technology and Mathematical Sciences, The University of Ballarat.

1.0   7   152/297   26.73
1.5   6   122/297   14.43
2.0   5   107/297   8.25
4.0   5   65/297    5.05
6.0   5   41/297    3.34
8.0   5   28/297    3.22

The results presented in Table 9 show that appropriate values for the heart disease dataset are c ∈ [0, 1.5], because further decrease in c leads to changes in the cluster structure of the dataset. We can again see that these values of c allow significant reduction in the number of

Bruce H. Edmonds. Using Localised `Gossip' to Structure Distributed Learning. Centre for Policy Modelling.

in which they are found to work. This approach is compared to the equivalent global evolutionary computation approach with respect to predicting the occurrence of heart disease in the Cleveland data set. It outperforms a global approach, but the space of attributes within which this evolutionary process occurs can greatly affect the efficiency of the technique. 1. Introduction The idea here is to

Kristin P. Bennett and Erin J. Bredensteiner. Geometry in Learning. Department of Mathematical Sciences Rensselaer Polytechnic Institute.

Heart disease status is known. By evaluating a new patient's attributes with respect to the separating plane a diagnosis is made. The Cleveland Heart Disease Database (Heart) is a publicly available dataset that contains information on 297 patients using 13 attributes [6]. A second application, as discussed previously, is the diagnosis of breast cancer. To evaluate whether a tumor is benign or

Rafael S. Parpinelli and Heitor S. Lopes and Alex Alves Freitas. An Ant Colony Algorithm for Classification Rule Discovery. Chapter X in Part Four: Ant Colony Optimization and Immune Systems. CEFET-PR, Curitiba.

namely Ljubljana breast cancer, Wisconsin breast cancer, Hepatitis and Heart disease. In two data sets, Ljubljana breast cancer and Heart disease, the difference was quite small. In the other two data sets, Wisconsin breast cancer and Hepatitis, the difference was more relevant. Note that although

Włodzisław Duch and Karol Grudzinski and Geerd H. F. Diercksen. Minimal distance neural methods. Department of Computer Methods, Nicholas Copernicus University.

of the number of neighbors and for vowel the r-NN method gives 57.8% accuracy, but in both

TABLE I. The appendicitis, Wisconsin breast cancer data, hepatitis and the Cleveland heart data.

Dataset and method              Leave-one-out %
The appendicitis data
  Bayes rule (statistical)      83.0
  CART, C4.5 (dec. trees)       84.9
  MLP+backpropagation           85.8
  RIAC (prob. inductive)        86.9
  9-NN                          89.6
  PVM, C-MLP2LN (logical

John G. Cleary and Leonard E. Trigg. Experiences with OB1, An Optimal Bayes Decision Tree Learner. Department of Computer Science University of Waikato.

however, naive Bayes performs very well, and on some datasets (such as heart-c and labor) it performs considerably better than the OB1 results shown (presumably because its attribute independence assumption isn't violated). The next section investigates

Glenn Fung and Sathyakama Sandilya and R. Bharat Rao. Rule extraction from Linear Support Vector Machines. Computer-Aided Diagnosis & Therapy, Siemens Medical Solutions, Inc.

from the UCI Machine Learning Repository [13]: Wisconsin Diagnosis Breast Cancer (WDBC), Ionosphere, and Cleveland heart. The fourth dataset is related to the nontraditional authorship attribution problem for the Federalist Papers [7], and the fifth dataset is used for training in a computer aided detection

Ayhan Demiriz and Kristin P. Bennett and John Shawe-Taylor. Linear Programming Boosting via Column Generation. Dept. of Decision Sciences and Eng. Systems, Rensselaer Polytechnic Institute.

and in the stopping criteria. Both methods were allowed the same maximum number of iterations. 8.1. Boosting Decision Tree Stumps We used decision tree stumps as base hypotheses on the following six datasets: Cancer (9,699), Diagnostic (30,569), Heart (13,297), Ionosphere (34,351), Musk (166,476), and Sonar (60,208). The number of features and number of points in each dataset are shown, respectively,

Zhi-Hua Zhou and Xu-Ying Liu. Training Cost-Sensitive Neural Networks with Methods Addressing the Class Imbalance Problem.

0.85. On euthyroid, threshold-moving is the best, under-sampling is the worst in the effective range, while the ensemble methods become poor when PCF(+) is bigger than 0.8. On the remaining nine data sets all the methods work well. On heart-s the ensemble methods are slightly better than others. On heart the ensemble methods are apparently better than over-sampling, under-sampling, and

Liping Wei and Russ B. Altman. An Automated System for Generating Comparative Disease Profiles and Making Diagnoses. Section on Medical Informatics Stanford University School of Medicine, MSOB X215.

profile instead of using all attributes in the original clinical data. The results remain the same. RESULTS We evaluated the system by applying it to heart disease, diabetes, and breast cancer. All data sets were obtained from the UCI Repository of Machine Learning databases and domain theories [7].

Heart Disease. Four clinical data sets were used. These sets consist of patients who had been referred for

Federico Divina and Elena Marchiori. Handling Continuous Attributes in an Evolutionary Inductive Learner. Department of Computer Science Vrije Universiteit.

The other datasets (Echocardiogram, Glass 2, Heart and Hepatitis) are small, and the results of the experiments are not normally distributed, so the t-test cannot be applied. Dataset ECL-LSDc ECL-LSDf ECL-LUD

Ron Kohavi and George H. John. Automatic Parameter Selection by Minimizing Estimated Error. Computer Science Dept. Stanford University.

by replacing a node's test with the test at one of its children, so perhaps m=1 gives more latitude in the pruning phase. Information-gain (turning the g parameter on) was a big winner on several datasets: vehicle, segment, hypothyroid, heart and cleve. Turning on the s parameter helped in tic-tac-toe and monk1. Table 5: Experimental results: Accuracies for C4.5, C4.5-AP, and C4.5* from running on

H. -T Lin and C. -J Lin. A Study on Sigmoid Kernels for SVM and the Training of non-PSD Kernels by SMO-type Methods. Department of Computer Science and Information Engineering National Taiwan University.

with ~ C = ¯ C 2a = decision value at x using the linear kernel with ¯ C. We can observe the result of Theorems 8 and 9 from Figure 1. The contours show five-fold cross-validation accuracy of the data set heart in different r and C. The contours with a = 1 are on the left-hand side, while those with a = 0.01 are on the right-hand side. Other parameters considered here are log 2 C fromlog 2 (- r) from

Alexander K. Seewald. Towards Understanding Stacking: Studies of a General Ensemble Learning Scheme. Dissertation submitted for the academic degree of Doctor of Technical Sciences.

[Learning curve for dataset #18: hold-out accuracy versus training set (8-9=CV, 7=75%, 6=62%, ..., 1=25%).]
Figure 6.3: Learning curves for dataset heart-c to lymph.
[Learning curve for dataset #19: hold-out accuracy versus training set (8-9=CV, 7=75%, 6=62%, ..., 1=25%).]

Włodzisław Duch and Rafal Adamczak and Krzysztof Grabczewski and Grzegorz Zal. A hybrid method for extraction of logical rules from data. Department of Computer Methods, Nicholas Copernicus University.

2 ! 9 ^ f5 ! 6 ^ f7 ! 9 ^ f8 ! 5
R7) f2 ! 9 ^ f4 ! 6 ^ f5 ! 8 ^ f7 ! 9
R8) f2 = 6 ^ f4 ! 10 ^ f5 ! 10 ^ f7 ! 2 ^ f8 ! 9

B. The Cleveland heart disease data. The Cleveland heart disease dataset [14] (collected at V.A. Medical Center, Long Beach and Cleveland Clinic Foundation by R. Detrano) contains 303 instances, with 164 healthy (54.1%) instances, the rest are heart disease instances of

Włodzisław Duch and Karol Grudzinski. Search and global minimization in similarity-based methods. Department of Computer Methods, Nicholas Copernicus University.

features, the same as we have selected before using our logical rule extraction methods. Using all features the accuracy of 87.8±1.1% was achieved,

TABLE I. RESULTS FOR THE CLEVELAND HEART DISEASE DATASET.

Method                       Accuracy %   Reference
IncNet                       90.0         [17]
k-NN, k=28, 7 features       85.1±0.5     this paper
Linear Discriminant Anal.    84.5         [16]
Fisher LDA                   84.2         [16]
k-NN, k=16                   84.0±0.6     this paper
FSM, Feature Space

Rudy Setiono and Wee Kheng Leow. Generating rules from trained network using fast pruning. School of Computing National University of Singapore.

decision nodes may also improve the accuracy of the tree because samples from real world problems may be better separated by oblique hyperplanes. This is the case with the heart disease data set (HeartD in Table 2) where significant improvement is achieved by the neural network methods over C4.5. There is no significant difference in the accuracy and size of the decision trees generated by
