Center for Machine Learning and Intelligent Systems

Statlog (German Credit Data) Data Set

Below are papers that cite this data set, with context shown. Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.

Return to Statlog (German Credit Data) data set page.


Jeroen Eggermont and Joost N. Kok and Walter A. Kosters. Genetic Programming for data classification: partitioning the search space. SAC. 2004.

C4.5 or our simple gp. A positive aspect of the refined gp algorithms using the gain ratio criterion is that the standard deviations are lower than for our other algorithms.

Table 4: German credit data set results

algorithm      k  average  s.d.  best  worst  rank
clustering gp  2  27.8     0.7   26.3  28.8   4
clustering gp  3  28.0     0.8   27.0  29.8   6
clustering gp  4  27.9     0.9   26.7  29.4   5
clustering gp  5  28.4     0.8   26.9  29.5   11


Ke Wang and Shiyu Zhou and Ada Wai-Chee Fu and Jeffrey Xu Yu. Mining Changes of Classification by Correspondence Tracing. SDM. 2003.

for each small interval to have separate classification characteristics, either having a different class or having higher accuracy. 4 Experiments We evaluated the proposed method on two real-life data sets, German Credit Data from the UCI Repository of Machine Learning Databases [14], and IPUMS Census Data from [1]. These data sets were chosen because no special knowledge is required to understand


Avelino J. Gonzalez and Lawrence B. Holder and Diane J. Cook. Graph-Based Concept Learning. FLAIRS Conference. 2001.

Voting Records Database available from the UCI machine learning repository (Keogh et al. 1998). The diabetes domain is the Pima Indians Diabetes Database, and the credit domain is the German Credit Dataset from the Statlog Project Databases (Keogh et al. 1998). The Tic-Tac-Toe domain consists of 958 exhaustively generated examples. Positive examples are those where "X" starts moving and wins the game


Oya Ekin and Peter L. Hammer and Alexander Kogan and Pawel Winter. Distance-Based Classification Methods. Report, RUTCOR (Rutgers Center for Operations Research), Rutgers University. 1996.

653 instances with 15 attributes each. Carter and Catlett [3] reported an 85.5% correct prediction rate, when using 71% of all 690 instances as the training set. 4.6 German Credit (Statlog) This data set contains data used to evaluate credit applications in Germany. It has 1000 instances. We used a version of this data set that was produced by Strathclyde University. In this version each case is


Chotirat Ann and Dimitrios Gunopulos. Scaling up the Naive Bayesian Classifier: Using Decision Trees for Feature Selection. Computer Science Department University of California.

336 instances, 8 attributes, 8 classes. Attributes selected by SBC = 4.

[Figure 3 plot: accuracy (%) versus training data (%) for NBC, SBC, and C4.5 on German Credit.]

Figure 3. German Credit dataset. 1,000 instances, 20 attributes, 2 classes. Attributes selected by SBC = 6.

[Plot: accuracy (%) versus training data (%) for NBC, SBC, and C4.5 on Chess Endgames (kr-vs-kp).]


Paul O'Dea and David Griffith and Colm O'Riordan. DEPARTMENT OF INFORMATION TECHNOLOGY. P. O'Dea (NUI.

network (as described earlier), using back-propagation as a learning algorithm, is used to classify the tuples t_1, ..., t_n based on the attributes s_1, ..., s_n. 5 Results 5.1 The German Credit Data Set In order to facilitate testing of the developed approach, experiments were conducted using the German credit data set (footnote 1). The German credit data set contains information on 1000 loan applicants.


Paul O'Dea and Josephine Griffith and Colm O'Riordan. Combining Feature Selection and Neural Networks for Solving Classification Problems. Information Technology Department, National University of Ireland.

network (as described earlier), using back-propagation as a learning algorithm, is used to classify the tuples t_1, ..., t_n based on the attributes s_1, ..., s_n. 5 Results 5.1 The German Credit Data Set In order to facilitate testing of the developed approach, experiments were conducted using the German credit data set (footnote 1). The German credit data set contains information on 1000 loan applicants.
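The excerpt above mentions a network trained with back-propagation to classify tuples from their attributes. As a rough illustration only, the following minimal sketch trains a one-hidden-layer sigmoid network with full-batch back-propagation. It is not the authors' architecture or code: the layer size, learning rate, epoch count, and the six two-attribute toy tuples (`TUPLES`, `LABELS`) are all assumptions made up for this example and are not drawn from the German credit data.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(tuples, labels, hidden=3, epochs=3000, lr=0.5, seed=0):
    """Train a one-hidden-layer network with full-batch back-propagation.

    Returns a prediction function and the mean squared error recorded
    before training and after every epoch.
    """
    rng = random.Random(seed)
    n_in = len(tuples[0])
    n = len(tuples)
    # w1[j]: input weights of hidden unit j; the last entry is its bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    # w2: output unit weights over the hidden units; the last entry is its bias.
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        h = [sigmoid(sum(w * v for w, v in zip(wj[:-1], x)) + wj[-1]) for wj in w1]
        y = sigmoid(sum(w * v for w, v in zip(w2[:-1], h)) + w2[-1])
        return h, y

    def loss():
        return sum((forward(x)[1] - t) ** 2 for x, t in zip(tuples, labels)) / n

    history = [loss()]
    for _ in range(epochs):
        g1 = [[0.0] * (n_in + 1) for _ in range(hidden)]
        g2 = [0.0] * (hidden + 1)
        for x, t in zip(tuples, labels):
            h, y = forward(x)
            # Output delta: squared-error gradient through the sigmoid
            # (the constant factor 2 is folded into the learning rate).
            d_out = (y - t) * y * (1.0 - y)
            for j in range(hidden):
                g2[j] += d_out * h[j]
                # Hidden delta: output error propagated back through w2[j].
                d_hid = d_out * w2[j] * h[j] * (1.0 - h[j])
                for i in range(n_in):
                    g1[j][i] += d_hid * x[i]
                g1[j][-1] += d_hid
            g2[-1] += d_out
        # Gradient-descent step on the averaged gradients (updates in place).
        for j in range(hidden):
            for i in range(n_in + 1):
                w1[j][i] -= lr * g1[j][i] / n
        for j in range(hidden + 1):
            w2[j] -= lr * g2[j] / n
        history.append(loss())

    return (lambda x: forward(x)[1]), history


# Hypothetical toy tuples: two attributes scaled to [0, 1]; label 1 or 0.
TUPLES = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.8, 0.2), (0.9, 0.1), (0.7, 0.3)]
LABELS = [1, 1, 1, 0, 0, 0]

predict, history = train(TUPLES, LABELS)
print("error before: %.4f, after: %.4f" % (history[0], history[-1]))
```

On the real data set, the 1000 applicants' attributes would first need to be encoded numerically and scaled, and performance would be judged on held-out cases rather than on the training error printed here.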

