Flags Data Set

The C4.5 program is the one that comes with Quinlan's book (Quinlan 1992); the ID3 results were obtained by running C4.5 and using the unpruned trees. On the artificial datasets, we used the C4.5 flags ``-s -m1'', which indicate that subset splits may be used and that splitting should continue until purity. To estimate the accuracy for feature subsets, we used 25-fold

Figure 5: The Monk3 dataset (top), the concept predicted by ID3 (center), and the errors in black (bottom).

Figure 6 (Monk1; x-axis: TS size, 0 to 400; y-axis: Accuracy, 0.7 to 1; curves: HOODG, ID3): Learning curves generated

Both the FSM network and the k-NN algorithm achieved similar results (within statistical accuracy) on the raw data and on the converted data (the number of attributes did not change here). On the Flags dataset, which contains 194 samples, 3 continuous and 25 symbolic attributes, and 8 classes (majority rate 30.8%) with no missing values, 10-fold cross-validation tests were performed. Large improvement is observed
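
The cross-validation protocol and the majority-rate baseline mentioned in the excerpts above can be illustrated with a minimal Python sketch. It is not the code used in any of the cited papers. The function names (k_fold_indices, majority_class, cross_validate_majority) and the synthetic labels are assumptions made purely for illustration, and the classifier is a trivial majority-class predictor rather than FSM, k-NN, or C4.5.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k yields folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def majority_class(labels):
    """Return the most frequent label in the given list."""
    return Counter(labels).most_common(1)[0][0]


def cross_validate_majority(labels, k=10, seed=0):
    """Estimate the accuracy of a majority-class predictor by k-fold cross-validation."""
    folds = k_fold_indices(len(labels), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        # Train on everything outside the current fold.
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        prediction = majority_class(train_labels)
        correct += sum(1 for i in test_idx if labels[i] == prediction)
    return correct / len(labels)


if __name__ == "__main__":
    # Synthetic placeholder labels (NOT the real Flags data): 194 samples, 8 classes.
    rng = random.Random(1)
    synthetic_labels = [rng.randrange(8) for _ in range(194)]
    print(cross_validate_majority(synthetic_labels, k=10))
```

Any real classifier should be compared against this kind of baseline: an accuracy close to the majority rate means the model has learned little beyond the class distribution.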

