Tic-Tac-Toe Endgame Data Set
Below are papers that cite this data set, with context shown.
Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.
Return to Tic-Tac-Toe Endgame data set page.
Saher Esmeir and Shaul Markovitch. Lookahead-based algorithms for anytime induction of decision trees. ICML. 2004.
that when additional time is available LSID3 should be preferred over ID3-k for the task of learning this concept. Figure 6 shows the performance of both algorithms when applied on the Tic-tac-toe dataset. In this case, the average run time of ID3 is 0.004 second, while C4.5 finished in 0.008 second. LSID3 dominates ID3-k at any point of time, both in terms of accuracy and size. ID3-k is clearly not
Bart Hamers and J. A. K Suykens. Coupled Transductive Ensemble Learning of Kernel Models. Bart De Moor. 2003.
second layer we will always use the sparse formulation (7), with number of support models SM · q. This sparse solution also improves the performance as observed in the experiments. 5.2.1 Tic-Tac-Toe Data Set The Tic-Tac-Toe Endgame database is a UCI (Blake and Merz (1998)) data set contributed by Aha, encoding the complete set of possible board configurations at the end of the tictac-toe games. The
Michael Bain. Structured Features from Concept Lattices for Unsupervised Learning and Classification. Australian Joint Conference on Artificial Intelligence. 2002.
for problems of constructive induction (e.g. ), the tic-tac-toe data set from the UCI repository. The problem is to predict whether each of 958 legal endgame boards for tic-tac-toe is won for `x'. The results are shown in Table 3. Accuracy (%) Tree size Attributes
Jinyan Li and Kotagiri Ramamohanarao and Guozhu Dong. Combining the Strength of Pattern Frequency and Distance for Classification. PAKDD. 2001.
containing pure categorical attributes. The accuracy is sometimes better (e.g., on the tic-tac-toe data set), but sometimes worse (e.g., on the splice data set). 4.2 Accuracy Variation among Folds The next set of experimental results are used to demonstrate the accuracy variations among the ten folds. We
Jochen Garcke and Michael Griebel and Michael Thess. Data Mining with Sparse Grids. Computing, 67. 2001.
17572 0.1 94.4 % 94.8 % 4019 324 0.01 98.0 % 98.0 % 7491 692 43500 0.001 99.5 % 99.5 % 32587 2137 0.0001 99.7 % 99.6 % 74642 6776 0.00001 99.8 % 99.8 % 203227 19930 Table 8: Results for the Shuttle data set, only Level 1 3.4.2 Tic-Tac-Toe This data set comes from the UCI Machine Learning Repository . The 958 instances encode the complete set of possible board configurations at the end of tic-tac-toe
Alexey Tsymbal and Seppo Puuronen and Vagan Y. Terziyan. Arbiter Meta-Learning with Dynamic Selection of Classifiers and Its Experimental Investigation. ADBIS. 1999.
from the UCI machine learning repository: three MONK's problem datasets donated by Sebastian Thrun and the Tic-Tac-Toe Endgame dataset donated by David W. Aha . The MONK's problems are a collection of three artificial binary classification problems over the same
Stephen D. Bay. Nearest neighbor classification from multiple feature subsets. Intell. Data Anal, 3. 1999.
is a board game playedona3by 3 grid. There are two players X and O, who alternate putting X's and O's on the board. In order to win a player must get three X's or O's in a row. The tic-tac-toe dataset contains the 958 legal tic-tac-toe endgame boards (assuming X moves first). The classification task is to determine if X won the game. 11 Table 2: Classifier error rates. Domain NN kNN FSS BSS MFS1
Stephen D. Bay. Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets. ICML. 1998.
comparison, we used the Wilcoxon signed rank test and found that MFS1 and MFS2 were significantly better than all others with a confidence level greater than 99%. MFS only performed poorly on two datasets: Iris and Tic-Tac-Toe For Iris, both MFS1 and MFS2 gave the lowest accuracy out of all the classifiers. This can possibly be explained by the small number of features in the Iris dataset. With
Ron Kohavi. The Power of Decision Tables. ECML. 1995.
to concepts where some features are globally relevant; the feature subset selection algorithm used here is conducting a best-first search and is thus able to capture interactions. The tic-tac-toe dataset is an example where features are locally relevant; the Monk1, Monk2, and parity5+5 datasets have feature interactions. 5 Related Work Because they permit one to display succinctly the conditions
Jerome H. Friedman and Ron Kohavi and Youngkeol Yun. To appear in AAAI-96 Lazy Decision Trees. Statistics Department and Stanford Linear Accelerator Center Stanford University.
monk1, monk2, and monk3; and pseudo-artificial datasets: tic-tac-toe and chess. Hayes-roth and glass2 also have large differences probably because they have many strongly relevant features and few weakly relevant features (John, Kohavi, & Pfleger
Christophe G. Giraud-Carrier and Tony Martinez. AN INCREMENTAL LEARNING MODEL FOR COMMONSENSE REASONING. Department of Computer Science Brigham Young University.
1. If representative voted 'no' on the 'physician-fee-freeze' issue, then representative is a democrat tic-tac-toe dataset: 1. If top row of grid is 'XXX', then 'X' wins 2. If middle row of grid is 'XXX', then 'X' wins 3. If bottom row of grid is 'XXX', then 'X' wins 4. If top row of grid is 'OOO', then 'O' wins The
Rafael S. Parpinelli and Heitor S. Lopes and Alex Alves Freitas. PART FOUR: ANT COLONY OPTIMIZATION AND IMMUNE SYSTEMS Chapter X An Ant Colony Algorithm for Classification Rule Discovery. CEFET-PR, Curitiba.
the difference between the number of rules discovered by Ant-Miner and C4.5 is quite large, as follows. In the Tic-tac-toe and Dermatology data sets Ant-Miner discovered 8.5 and 7.0 rules, respectively, whereas C4.5 discovered 83 and 23.2 rules, respectively. In both data sets C4.5 achieved a better accuracy rate. So, in these two data sets
Ron Kohavi and George H. John. Automatic Parameter Selection by Minimizing Estimated Error. Computer Science Dept. Stanford University.
Significantly, the value 25% (the default value in C4.5) was chosen for all datasets except tic-tac-toe (where C4.5-AP halved the error rate of C4.5) and glass2 (where C4.5-AP improved absolute accuracy by over 4%). For the m parameter, 1 was often the best choice, or a tie for
Jinyan Li and Kotagiri Ramamohanarao and Guozhu Dong. ICML2000 The Space of Jumping Emerging Patterns and Its Incremental Maintenance Algorithms. Department of Computer Science and Software Engineering, The University of Melbourne, Parkville.
in UCI repository (Blake & Murphy, 1998) to experimentally examine the maintenance algorithms, especially their efficiency. These data sets are mushroom, pima, tic-tac-toe and nursery. More details can be seen in Table 1. Note that the continuous attributes in the pima data set are discretized by MLC++ techniques (Kohavi et al, 1994).
Masahiro Terabe and Takashi Washio and Hiroshi Motoda. The Effect of Subsampling Rate on S 3 Bagging Performance. Mitsubishi Research Institute.
training data even when the subsampling rate is set up to 30%. In the case of Tic-tac-toe the prediction accuracy becomes better when the subsampling rate set large. Tic-tac-toe is an artifical data set which does not include any redundant instances. By this feature of data set, the experimental result of tictac-toe is characteristic. The effect of preventing the deterioration of prediction
David R. Musicant. DATA MINING VIA MATHEMATICAL PROGRAMMING AND MACHINE LEARNING. Doctor of Philosophy (Computer Sciences) UNIVERSITY.
we used contained 22 features with 200 points in class 1 and 300 points in class -1. . The tic-tac-toe dataset is a two class dataset that contains legal complete tic-tac-toe games. All those games where "X" has a "win" (three in a row) end up in category 1, and all other games end up in category -1. We have
C. Titus Brown and Harry W. Bullen and Sean P. Kelly and Robert K. Xiao and Steven G. Satterfield and John G. Hagedorn and Judith E. Devaney. Visualization and Data Mining in an 3D Immersive Environment: Summer Project 2003.
the Museum Environment . . . . . . . . . . 7 3.4 Visual Component Tools . . . . . . . . . . . . . . . . . . . . . . 8 3.5 Dataflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 4 Data Sets Analysed 17 4.1 Tic-Tac-Toe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 4.2 Miles Per Gallon . . . . . . . . . . . . . . . . . . . . . . . . . . 19 4.3 Housing . . . . . . . . . .
Ron Kohavi and Brian Frasca. Useful Feature Subsets and Rough Set Reducts. the Third International Workshop on Rough Sets and Soft Computing.
increasing the running time considerably is to dynamically decide on the number of bootstrap samples needed. An abstract description of this problem is described in (Kohavi 1994). For the artificial datasets: Monk 1-3, parity, and tic-tac-toe the test sets include the space of all possible instances, and therefore the test set accuracy is the actual real accuracy. For the real datasets, our accuracy
Shi Zhong and Weiyu Tang and Taghi M. Khoshgoftaar. Boosted Noise Filters for Identifying Mislabeled Data. Department of Computer Science and Engineering Florida Atlantic University.
Recall Precision BF BBF-1 BBF-12 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 40% noise Recall Precision BFBBF-3 BBF-36 (e) (f) Figure 13. Noise detection results on the tic-tac-toe dataset with six different noise levels: (a) 5%; (b) 10%; (c) 15%; (d) 25%; (e) 35%; and (f) 40%. 0 0.2 0.4 0.6 0.8 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 5% noise Recall Precision BF BBF-2 BBF-21 0 0.2