Predicting the Cellular Localization Sites of Proteins

Predicted Attribute: Localization site of protein (non-numeric).

The references below describe a predecessor to this dataset and its development. They also give results (not cross-validated) for classification by a rule-based expert system with that version of the dataset.

Reference: "Expert System for Predicting Protein Localization Sites in Gram-Negative Bacteria", Kenta Nakai & Minoru Kanehisa, PROTEINS: Structure, Function, and Genetics 11:95-110, 1991.

Reference: "A Knowledge Base for Predicting Protein Localization Sites in Eukaryotic Cells", Kenta Nakai & Minoru Kanehisa, Genomics 14:897-911, 1992.

| Variable Name | Role | Type | Demographic | Description | Units | Missing Values |
| --- | --- | --- | --- | --- | --- | --- |
| Sequence_Name | ID | Categorical | | Accession number for the SWISS-PROT database | | no |
| mcg | Feature | Continuous | | McGeoch's method for signal sequence | | |
| gvh | Feature | Continuous | | von Heijne's method for signal sequence | | |
| alm | Feature | Continuous | | Score of the ALOM membrane spanning region prediction | | |
| mit | Feature | Continuous | | Score of discriminant analysis of the amino acid content of the N-terminal region (20 residues long) of mitochondrial and non-mitochondrial | | |
| erl | Feature | Continuous | | Presence of HDEL substring (thought to act as a signal for retention in the endoplasmic reticulum lumen). Binary | | |
| pox | Feature | Continuous | | Peroxisomal targeting signal in the | | |
| vac | Feature | Continuous | | Score of discriminant analysis of the amino acid content of vacuolar and extracellular | | |
| nuc | Feature | Continuous | | Score of discriminant analysis of nuclear localization signals of nuclear and non-nuclear | | |
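The variables listed above could be read into a typed record roughly as follows. This is a minimal sketch, not the dataset's official loader: it assumes each record is a single whitespace-separated line holding the sequence name, the eight feature scores in the order listed, and the localization site label last. The names `ProteinRecord` and `parse_record`, the sample line, and its label `XYZ` are hypothetical and invented for illustration only.

```python
from dataclasses import dataclass
from typing import Dict

# Feature names in the order they appear in the variable listing.
FEATURE_NAMES = ["mcg", "gvh", "alm", "mit", "erl", "pox", "vac", "nuc"]


@dataclass
class ProteinRecord:
    """One protein: its ID, its eight scores, and its class label."""
    sequence_name: str
    features: Dict[str, float]
    localization_site: str


def parse_record(line: str) -> ProteinRecord:
    """Parse one whitespace-separated line (assumed layout, see above)."""
    fields = line.split()
    expected = 1 + len(FEATURE_NAMES) + 1  # ID + features + label
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    values = [float(v) for v in fields[1:-1]]
    return ProteinRecord(
        sequence_name=fields[0],
        features=dict(zip(FEATURE_NAMES, values)),
        localization_site=fields[-1],
    )


# Made-up sample line; the values and the label are not taken from the dataset.
sample = "EXAMPLE_YEAST 0.50 0.60 0.40 0.10 0.50 0.00 0.50 0.20 XYZ"
record = parse_record(sample)
```

Keeping the features in a dictionary keyed by variable name makes it harder to mix up columns when passing the scores to a classifier; if the actual file uses a different delimiter or column order, only `parse_record` needs to change.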

Papers Citing this Dataset

On Possibility and Impossibility of Multiclass Classification with Rejection

By Chenri Ni, Nontawat Charoenphakdee, Junya Honda, Masashi Sugiyama. 2019

Published in ArXiv.

Incremental kernel PCA and the Nyström method

By Fredrik Hallgren, Paul Northrop. 2018

Published in ArXiv.

Degrees of Freedom and Model Selection for k-means Clustering

By David Hofmeyr. 2018

Published in ArXiv.

Multi-Resolution Dual-Tree Wavelet Scattering Network for Signal Classification

By Amarjot Singh, Nick Kingsbury. 2017

Published in ArXiv.

A Siamese Deep Forest

By Lev Utkin, Mikhail Ryabinin. 2017

Published in ArXiv.

19 citations


Kenta Nakai


