Center for Machine Learning and Intelligent Systems



Census-Income (KDD) Data Set

Below are papers that cite this data set, with context shown. Papers were automatically harvested and associated with this data set, in collaboration with Rexa.info.

Return to Census-Income (KDD) data set page.


Eibe Frank and Geoffrey Holmes and Richard Kirkby and Mark A. Hall. Racing Committees for Large Datasets. Discovery Science. 2002.

Dataset         LogitBoost   #Iterations   Racing w/o pruning   Racing w pruning
anonymous       27.00%       60            28.24%               27.56%
adult           13.51%       67            14.58%               14.72%
shuttle          0.01%       86             0.08%                0.07%
census income    4.43%       448            4.90%                4.93%

The next dataset we consider is census-income. The first row of Figure 4 shows the results. The most striking aspect is the effect of pruning with small chunk sizes. In this domain the fluctuation in error is


Nikunj C. Oza and Stuart J. Russell. Experimental comparisons of online and batch versions of bagging and boosting. KDD. 2001.

used in our experiments. For the Soybean and Census Income datasets, we have given the sizes of the supplied training and test sets. For the remaining datasets, we have given the sizes of the training and test sets in our five-fold cross-validation runs. Data Set


Stephen D. Bay. Multivariate Discretization for Set Mining. Knowl. Inf. Syst., 3. 2001.

for Census Income We required differences between adjacent cells to be at least as large as 1% of N. ME-MDL requires a class variable and for the Adult, Census-Income, SatImage, and Shuttle datasets we used the class variable that had been used in previous analyses. For UCI Admissions we used Admit = {yes, no} (i.e. was the student admitted to UCI) as the class variable. 6.1. Execution Time


Masahiro Terabe and Takashi Washio and Hiroshi Motoda. The Effect of Subsampling Rate on S3 Bagging Performance. Mitsubishi Research Institute.

each member classifier induction. A personal computer with the following specification is used in this experiment: OS: Linux, CPU: Pentium III 700 MHz, main memory: 256 MB. For the large data sets, census income (abbreviated here as census), led(10%) and waveform are selected. Census is selected from UCI KDD. Table 2. The specification of data sets for experiment 2. Data set # of Attribute


