Ecoli

Donated on 9/1/1996

This dataset contains protein localization sites.

Dataset Characteristics

Multivariate

Subject Area

Life

Associated Tasks

Classification

Attribute Type

Real

# Instances

336

# Attributes

8

Information

Additional Information

The references below describe a predecessor to this dataset and its development. They also give results (not cross-validated) for classification by a rule-based expert system with that version of the dataset.

Reference: "Expert System for Predicting Protein Localization Sites in Gram-Negative Bacteria", Kenta Nakai & Minoru Kanehisa, PROTEINS: Structure, Function, and Genetics 11:95-110, 1991.

Reference: "A Knowledge Base for Predicting Protein Localization Sites in Eukaryotic Cells", Kenta Nakai & Minoru Kanehisa, Genomics 14:897-911, 1992.

Has Missing Values

Symbol: 0

Attribute Information

Additional Information

1. Sequence Name: Accession number for the SWISS-PROT database.
2. mcg: McGeoch's method for signal sequence recognition.
3. gvh: von Heijne's method for signal sequence recognition.
4. lip: von Heijne's Signal Peptidase II consensus sequence score. Binary attribute.
5. chg: Presence of charge on N-terminus of predicted lipoproteins. Binary attribute.
6. aac: Score of discriminant analysis of the amino acid content of outer membrane and periplasmic proteins.
7. alm1: Score of the ALOM membrane spanning region prediction program.
8. alm2: Score of the ALOM program after excluding putative cleavable signal regions from the sequence.
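The attribute list above can be turned into a small parsing sketch. The following is a minimal illustration, not the repository's own loader. It assumes each record is one whitespace-separated line holding the sequence name, the seven scores in the order listed, and a trailing class label (the label column is an assumption, since the list above does not describe it). The names EcoliRecord, parse_ecoli_line, and parse_ecoli are hypothetical, and the two sample rows are made up for illustration rather than taken from the dataset.

```python
from dataclasses import dataclass
from typing import Dict, List

# Score columns in the order given by the attribute list (items 2 to 8).
FEATURE_NAMES = ["mcg", "gvh", "lip", "chg", "aac", "alm1", "alm2"]


@dataclass
class EcoliRecord:
    sequence_name: str          # item 1: SWISS-PROT accession name
    features: Dict[str, float]  # items 2 to 8, keyed by attribute name
    label: str                  # assumed trailing localization-site label


def parse_ecoli_line(line: str) -> EcoliRecord:
    """Parse one whitespace-separated record (assumed layout)."""
    fields = line.split()
    expected = len(FEATURE_NAMES) + 2  # name + scores + assumed label
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    name, *values, label = fields
    features = {n: float(v) for n, v in zip(FEATURE_NAMES, values)}
    return EcoliRecord(name, features, label)


def parse_ecoli(text: str) -> List[EcoliRecord]:
    """Parse every non-blank line of the given text."""
    return [parse_ecoli_line(ln) for ln in text.splitlines() if ln.strip()]


# Made-up rows for illustration only; not real dataset entries.
SAMPLE = """\
EXAMPLE_A  0.49  0.29  0.48  0.50  0.56  0.24  0.35  cp
EXAMPLE_B  0.07  0.40  1.00  0.50  0.54  0.35  0.44  im
"""

records = parse_ecoli(SAMPLE)
```

Keeping the scores in a dictionary keyed by attribute name makes it easy to drop the sequence name, which identifies a record rather than describing it, before passing the remaining values to a classifier.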

Features

Attribute Name | Role | Type | Description | Units | Missing Values
false
false
false
false
false
false
false
false


Papers Citing this Dataset

INCREMENTAL SEMI-SUPERVISED CLUSTERING METHOD USING NEIGHBOURHOOD ASSIGNMENT

By P GaneshKumar, Siva A.P. 2016

Published in International Journal of Computer Science, Engineering and Applications.

Spectral M-estimation with Applications to Hidden Markov Models

By Dustin Tran, Minjae Kim, Finale Doshi-Velez. 2016

Published in ArXiv.

Improving Classification Accuracy Using Gene Ontology Information

By Ying Shen, Lin Zhang. 2013

Published in ICIC.

Decision Trees Using the Minimum Entropy-of-Error Principle

By Joaquim Sá, João Gama, Raquel Sebastião, Luís Alexandre. 2009

Published in CAIP.

Researching on Multi-net Systems Based on Stacked Generalization

By Carlos Hernández-Espinosa, Joaquín Torres-Sospedra, Mercedes Fernández-Redondo. 2008

Published in ANNPR.



Creators

Kenta Nakai

License
