Center for Machine Learning and Intelligent Systems

Anticancer peptides Data Set

Abstract: Peptides with experimental annotations on their anticancer action on breast and lung cancer cells.

Data Set Characteristics: Sequential
Number of Instances: 1850
Area: Life
Attribute Characteristics: N/A
Number of Attributes: 2
Date Donated: 2019-10-11
Associated Tasks: Classification
Missing Values?: N/A
Number of Web Hits: 11834


Source:

Francesca Grisoni, Claudia S. Neuhaus, Miyabi Hishinuma, Gisela Gabernet, Jan A. Hiss, Masaaki Kotera, Gisbert Schneider
contact: Francesca Grisoni, ETH Zurich, francesca.grisoni '@' pharma.ethz.ch


Data Set Information:

Membranolytic anticancer peptides (ACPs) are drawing increasing attention as potential future therapeutics against cancer, due to their ability to hinder the development of cellular resistance and their potential to overcome common hurdles of chemotherapy, e.g., side effects and cytotoxicity.
This dataset contains peptides (given as one-letter amino acid sequences) and their anticancer activity on breast and lung cancer cell lines.
Two peptide datasets targeting breast and lung cancer cells were assembled and curated manually from CancerPPD. EC50, IC50, LD50 and LC50 annotations on breast and lung cancer cells were retained (breast cell lines: MCF7 = 57%, MDA-MB-361 = 11%, MT-1 = 9%; lung cell lines: H-1299 = 45%, A-549 = 17.7%); mg ml−1 values were converted to μM units. Linear and l-chiral peptides were retained, while cyclic, mixed or d-chiral peptides were discarded. When both amidated and non-amidated data were available for the same sequence, only the value referring to the amidated peptide was retained.
Peptides were split into three classes for model training:
(1) very active (EC/IC/LD/LC50 ≤ 5 μM)
(2) moderately active (5 μM < EC/IC/LD/LC50 ≤ 50 μM)
(3) inactive (EC/IC/LD/LC50 > 50 μM)
Duplicates with conflicting class annotations were compared manually to the original sources and, if necessary, corrected. If multiple class annotations were present for the same sequence, the most frequently represented class was chosen; in case of ties, the less active class was chosen.
Since CancerPPD is biased towards the annotation of active peptides, we built a set of presumably inactive peptides by randomly extracting 750 alpha-helical sequences (7–30 amino acids) from crystal structures deposited in the Protein Data Bank. The final training sets contained 949 peptides for breast cancer and 901 peptides for lung cancer.
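The unit conversion, class thresholds and tie-breaking rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' curation code: the function names and label strings are hypothetical, and the molecular weight needed for the mg ml−1 to μM conversion is assumed to be supplied by the caller.

```python
from collections import Counter

# Class labels ordered from most to least active (label strings are illustrative).
CLASSES = ["very active", "moderately active", "inactive"]


def mg_per_ml_to_micromolar(conc_mg_per_ml, mol_weight_g_per_mol):
    """Convert mg/mL to uM: (g/L) / (g/mol) gives mol/L, scaled by 1e6 to uM."""
    return conc_mg_per_ml / mol_weight_g_per_mol * 1e6


def activity_class(value_um):
    """Assign a class from an EC/IC/LD/LC50 value in uM using the 5 and 50 uM cut-offs."""
    if value_um <= 5:
        return "very active"
    if value_um <= 50:
        return "moderately active"
    return "inactive"


def consensus_class(labels):
    """Pick the most frequent label; on a tie, pick the less active class."""
    counts = Counter(labels)
    top = max(counts.values())
    tied = [label for label in counts if counts[label] == top]
    # A higher index in CLASSES means a less active class.
    return max(tied, key=CLASSES.index)
```

For example, a sequence annotated once as very active and once as inactive would be resolved to inactive by `consensus_class`, since a tie falls to the less active class.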
The datasets were used to develop neural network models for anticancer peptide design and are provided as .csv files in a .zip archive.

Additional details can be found in: Grisoni, F., Neuhaus, C.S., Hishinuma, M., Gabernet, G., Hiss, J.A., Kotera, M. and Schneider, G., 2019. De novo design of anticancer peptides by ensemble artificial neural networks. Journal of Molecular Modeling, 25(5), 112.


Attribute Information:

The dataset contains three attributes:
1. Peptide ID
2. One-letter amino-acid sequence
3. Class (active, moderately active, experimental inactive, virtual inactive)
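A file with these three attributes could be summarised per class roughly as shown below. This is only a sketch: the column names (`ID`, `sequence`, `class`), the exact label strings and the sample rows are assumptions made for illustration and are not taken from the distributed .csv files, so the header should be checked before reuse.

```python
import csv
import io
from collections import Counter


def count_classes(csv_file, class_column="class"):
    """Count peptides per class label in an open CSV file that has a header row."""
    reader = csv.DictReader(csv_file)
    return Counter(row[class_column].strip() for row in reader)


# Made-up rows for illustration only; these are not records from the dataset.
sample = io.StringIO(
    "ID,sequence,class\n"
    "1,AAKKWAKAKW,moderately active\n"
    "2,GLFDIVKKVV,active\n"
    "3,AEELLKKAEE,virtual inactive\n"
    "4,KWKLFKKIGA,virtual inactive\n"
)
counts = count_classes(sample)
print(counts.most_common())
```

To use it on a real file, pass a handle opened with `open(path, newline="")` in place of the in-memory sample, and set `class_column` to whatever the header actually uses.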


Citation Request:

Please cite the following paper when publishing results obtained with all or part of the provided data:
Grisoni, F., Neuhaus, C.S., Hishinuma, M., Gabernet, G., Hiss, J.A., Kotera, M. and Schneider, G., 2019. De novo design of anticancer peptides by ensemble artificial neural networks. Journal of Molecular Modeling, 25(5), 112.

BibTeX format:
@Article{Grisoni2019,
  author  = {Grisoni, Francesca
             and Neuhaus, Claudia S.
             and Hishinuma, Miyabi
             and Gabernet, Gisela
             and Hiss, Jan A.
             and Kotera, Masaaki
             and Schneider, Gisbert},
  title   = {De novo design of anticancer peptides by ensemble artificial neural networks},
  journal = {Journal of Molecular Modeling},
  year    = {2019},
  month   = {Apr},
  day     = {05},
  volume  = {25},
  number  = {5},
  pages   = {112},
  issn    = {0948-5023},
  doi     = {10.1007/s00894-019-4007-6}
}

