TCGA Kidney Cancers

External

Linked on 9/25/2023

The TCGA Kidney Cancers Dataset is a bulk RNA-seq dataset containing transcriptome profiles of patients diagnosed with one of three subtypes of kidney cancer. It can be used to predict the kidney cancer subtype from normalized transcriptome profiles, and it offers hands-on experience with large, sparse genomic data.
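
One possible way to handle a large, sparse expression matrix like this is to drop genes that are zero in most samples and then apply a log2(x + 1) transform. The sketch below illustrates that idea on a toy matrix. It is a common preprocessing choice for FPKM values, not a step documented for this dataset; the function name `filter_and_log`, the `min_nonzero_fraction` threshold, and the toy values are all illustrative assumptions.

```python
import math


def filter_and_log(matrix, min_nonzero_fraction=0.2):
    """Drop mostly-zero genes, then log-transform the rest.

    matrix: list of samples, each a list of FPKM values in the same gene order.
    Returns (kept_gene_indices, transformed_matrix).
    """
    n_samples = len(matrix)
    n_genes = len(matrix[0])
    kept = []
    for g in range(n_genes):
        # Count the samples in which this gene has non-zero expression.
        nonzero = sum(1 for row in matrix if row[g] > 0)
        if nonzero / n_samples >= min_nonzero_fraction:
            kept.append(g)
    # log2(x + 1) keeps zeros at zero and compresses large values.
    transformed = [[math.log2(row[g] + 1.0) for g in kept] for row in matrix]
    return kept, transformed


# Toy data (hypothetical): 4 samples x 4 genes.
toy = [
    [0.0, 3.0, 0.0, 15.0],
    [0.0, 1.0, 0.0, 7.0],
    [0.0, 0.0, 1.0, 31.0],
    [0.0, 7.0, 0.0, 63.0],
]
kept, logged = filter_and_log(toy, min_nonzero_fraction=0.5)
print(kept)
print(logged[0])
```

The threshold is a judgment call: a stricter value removes more features and speeds up training, at the risk of discarding genes that are expressed in only one subtype.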

Dataset Characteristics

Tabular, Multivariate

Subject Area

Health and Medicine

Associated Tasks

Classification, Clustering

Feature Type

Real

# Instances

1024

# Features

60660

Dataset Information

For what purpose was the dataset created?

To better understand the relationship between the human genome and cancers.

Who funded the creation of the dataset?

The NIH.

What do the instances in this dataset represent?

- Bulk transcriptome profiles
- Kidney cancer patients
- Worldwide population

Are there recommended data splits?

Cross-validation or a fixed train-test split could be used.
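
As a rough illustration of the cross-validation option, the sketch below builds stratified folds so that each fold keeps approximately the same mix of the three class labels. Stratification is an assumption here, chosen because subtype classes may differ in size; the page does not prescribe it. The function name `stratified_folds` and the toy label counts are hypothetical and do not reflect the real class distribution.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds so each fold has a similar class mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Toy labels only (hypothetical counts, not the dataset's real ones).
toy_labels = ["TCGA-KICH"] * 10 + ["TCGA-KIRC"] * 50 + ["TCGA-KIRP"] * 30
folds = stratified_folds(toy_labels, n_folds=5)

for k, test_idx in enumerate(folds):
    # Every fold serves once as the test set; the rest form the training set.
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    print(k, len(train_idx), len(test_idx))
```

A fixed train-test split can be obtained from the same function by holding out a single fold and training on the others.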

Does the dataset contain data that might be considered sensitive in any way?

This dataset contains the variables age, race, and ethnicity.

Was there any data preprocessing performed?

Fragments Per Kilobase of transcript per Million mapped fragments (FPKM) normalization.
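
For reference, FPKM stands for fragments per kilobase of transcript per million mapped fragments. In its textbook form it divides a gene's fragment count by the gene length in kilobases and by the total mapped fragments in millions. The sketch below implements that textbook definition on toy numbers. The exact gene lengths and denominator used by the pipeline that produced this dataset are not stated on this page, so the function name `fpkm` and all inputs are illustrative assumptions.

```python
def fpkm(fragment_counts, gene_lengths_bp, total_mapped_fragments=None):
    """Textbook FPKM: count * 1e9 / (gene_length_bp * total_mapped_fragments).

    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    If no total is given, the sum of the supplied counts is used (an assumption).
    """
    if total_mapped_fragments is None:
        total_mapped_fragments = sum(fragment_counts)
    if total_mapped_fragments <= 0:
        raise ValueError("total_mapped_fragments must be positive")
    return [
        count * 1e9 / (length * total_mapped_fragments)
        for count, length in zip(fragment_counts, gene_lengths_bp)
    ]


# Toy sample (hypothetical): three genes with raw fragment counts and lengths.
counts = [500, 1500, 0]
lengths_bp = [1000, 3000, 2000]
values = fpkm(counts, lengths_bp)
print(values)
```

In this toy sample the second gene has three times the fragments of the first but is also three times as long, so both end up with the same FPKM value. That length correction is the point of the normalization.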

Has Missing Values?

No

Introductory Paper

The Cancer Genome Atlas Pan-Cancer analysis project

By J. Weinstein, E. Collisson, G. Mills, K. Shaw, B. Ozenberger, K. Ellrott, I. Shmulevich, C. Sander, Joshua M. Stuart. 2013

Published in Nature Genetics

Variable Information

Bulk RNA-Seq expression values normalized using the FPKM (fragments per kilobase of transcript per million mapped fragments) method.

Class Labels

- TCGA-KICH (kidney chromophobe)
- TCGA-KIRC (kidney renal clear cell carcinoma)
- TCGA-KIRP (kidney renal papillary cell carcinoma)

Citations/Acknowledgements

If you use this dataset, please follow the acknowledgment policy on the original dataset website.

Keywords

Genomics, Kidney cancer, RNA-Seq

Creators

J. Weinstein

E. Collisson

G. Mills

K. Shaw

B. Ozenberger

K. Ellrott

I. Shmulevich

C. Sander

Joshua M. Stuart
