gene expression cancer RNA-Seq
Donated on 6/8/2016
This collection of data is part of the RNA-Seq (HiSeq) PANCAN data set. It is a random extraction of gene expression levels from patients with different types of tumor: BRCA, KIRC, COAD, LUAD and PRAD.
Dataset Characteristics
Multivariate
Subject Area
Biology
Associated Tasks
Classification, Clustering
Feature Type
Real
# Instances
801
# Features
20531
Dataset Information
Additional Information
Samples (instances) are stored row-wise. The variables (attributes) of each sample are RNA-Seq gene expression levels measured on the Illumina HiSeq platform.
Has Missing Values?
No
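The layout described above (801 samples stored row-wise, 20531 features each, no missing values) can be sanity-checked after download. The following is a minimal sketch in plain Python, not part of the dataset's tooling; the helper name check_matrix and the toy matrix are hypothetical and used only for illustration.

```python
import math

# Sizes stated on this page.
EXPECTED_SAMPLES = 801
EXPECTED_FEATURES = 20531


def check_matrix(rows, n_samples=EXPECTED_SAMPLES, n_features=EXPECTED_FEATURES):
    """Return a list of problems found; an empty list means the layout matches.

    `rows` is a list of rows, one row per sample, one value per gene.
    """
    problems = []
    if len(rows) != n_samples:
        problems.append(f"expected {n_samples} samples, found {len(rows)}")
    for i, row in enumerate(rows):
        if len(row) != n_features:
            problems.append(f"row {i}: expected {n_features} values, found {len(row)}")
        elif any(v is None or (isinstance(v, float) and math.isnan(v)) for v in row):
            problems.append(f"row {i}: contains a missing value")
    return problems


# Toy 2 x 3 matrix (made-up values) with one deliberately missing entry.
toy = [[0.0, 1.5, 2.25], [3.0, float("nan"), 0.5]]
print(check_matrix(toy, n_samples=2, n_features=3))
```

If the features are loaded as a pandas DataFrame X, something like check_matrix(X.values.tolist()) should apply the same check with the default sizes, assuming every column is numeric.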
Variable Information
A dummy name (gene_XX) is given to each attribute. Check the original submission (https://www.synapse.org/#!Synapse:syn4301332) or the platform specs for the complete list of probe names. The attributes are ordered consistently with the original submission.
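Because the attributes keep the order of the original submission, the dummy names can be paired with real probe names by position. The sketch below assumes you have already obtained the ordered probe list yourself; the helper build_probe_map, the probe names PROBE_A to PROBE_C, and the choice to start numbering at gene_0 are all hypothetical placeholders, not values taken from the dataset.

```python
def build_probe_map(dummy_names, probe_names):
    """Pair each dummy column name with the probe name at the same position."""
    if len(dummy_names) != len(probe_names):
        raise ValueError(
            f"length mismatch: {len(dummy_names)} dummy names "
            f"vs {len(probe_names)} probe names"
        )
    return dict(zip(dummy_names, probe_names))


# Placeholder inputs: three dummy names and three made-up probe names.
# The starting index (gene_0) is an assumption; check the actual column names.
dummy_names = [f"gene_{i}" for i in range(3)]
probe_names = ["PROBE_A", "PROBE_B", "PROBE_C"]

probe_map = build_probe_map(dummy_names, probe_names)
print(probe_map["gene_1"])
```

The length check is there because a positional mapping silently mislabels every column if the two lists differ in size, so failing early seems the safer choice.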
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset
gene_expression_cancer_rna_seq = fetch_ucirepo(id=401)

# data (as pandas dataframes)
X = gene_expression_cancer_rna_seq.data.features
y = gene_expression_cancer_rna_seq.data.targets

# metadata
print(gene_expression_cancer_rna_seq.metadata)

# variable information
print(gene_expression_cancer_rna_seq.variables)
Fiorini, Samuele. (2016). gene expression cancer RNA-Seq. UCI Machine Learning Repository. https://doi.org/10.24432/C5R88H.
@misc{misc_gene_expression_cancer_rna-seq_401,
  author       = {Fiorini, Samuele},
  title        = {{gene expression cancer RNA-Seq}},
  year         = {2016},
  howpublished = {UCI Machine Learning Repository},
  note         = {{DOI}: https://doi.org/10.24432/C5R88H}
}
Creators
Samuele Fiorini
DOI
10.24432/C5R88H
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.