Center for Machine Learning and Intelligent Systems

Simulated data for survival modelling Data Set
Download: Data Folder, Data Set Description

Abstract: A variety of survival data, with carefully controlled event and censor rates, is available to allow people to develop and test new approaches to survival modelling.

Data Set Characteristics: Multivariate, Time-Series
Number of Instances: 120000
Area: Life
Attribute Characteristics: Integer, Real
Number of Attributes: 25
Date Donated: 2018-12-04
Associated Tasks: Regression
Missing Values? Yes
Number of Web Hits: 8805


Source:

Creators:
Ruikin Cao, Fei Gao, Dimitris Kontogouris, Pavel Kroupa, Zongchun Li, Yuxiang Wu
Department of Computing, Imperial College London, London, UK

Matt Williams & Kerlann Le Calvez
Computational Oncology Laboratory, Imperial College London, London, UK
Imperial College Healthcare NHS Trust, London, UK

Donors:
Kerlann Le Calvez & Matt Williams
Computational Oncology Laboratory, Imperial College London, London, UK
Imperial College Healthcare NHS Trust, London, UK


Data Set Information:

We generated two batches of data, each consisting of 20 datasets.
For the low dimensional batch, we used 5 parameters, of which 2 were dummy parameters (i.e. had no impact on the outcome) and 3 were predictive.
For the medium dimension batch, we used 25 parameters, of which 2 were dummy and 23 were predictive.
In each batch, we varied the event rate from 10% to 70% and the censor rate from 0% to 70% in 20% steps, and used a fixed population size of 3000.
This led to two batches, each of 20 datasets of 3000 subjects, for a total of 120000 instances.
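
As a quick sanity check on the design described above, the short sketch below recomputes the total number of instances from the batch layout. The constant names are illustrative and do not come from the dataset. Reading "in 20% steps" as applying to the event rate is an assumption. The censor-rate grid is not enumerated here because the description does not list its exact values.

```python
# Layout taken from the description above; constant names are illustrative.
N_BATCHES = 2                # low dimensional and medium dimension
DATASETS_PER_BATCH = 20
SUBJECTS_PER_DATASET = 3000

total_instances = N_BATCHES * DATASETS_PER_BATCH * SUBJECTS_PER_DATASET
print(total_instances)  # 120000, matching "Number of Instances"

# Assumption: event rates run from 10% to 70% in 20% steps.
event_rates_percent = list(range(10, 71, 20))
print(event_rates_percent)  # [10, 30, 50, 70]

# Expected number of events per dataset at each assumed event rate.
events_per_dataset = {r: SUBJECTS_PER_DATASET * r // 100 for r in event_rates_percent}
print(events_per_dataset)  # {10: 300, 30: 900, 50: 1500, 70: 2100}
```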


Attribute Information:

For the low dimensional batch:
x.0   Binary: with equal probabilities
x.1   Gaussian: μ = 50, σ = 15
x.2   Uniform: [1,2,3,4]
x.3   Binary: 0.6 chance of 0
x.4   Uniform: [1,2,3]
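
To make the covariate distributions listed above concrete, here is a minimal sketch that draws covariates with the same marginals using only the Python standard library. It is not the creators' generation code. The function name, the seed, and the reading of "Uniform: [1,2,3,4]" as a discrete uniform over those values are assumptions. No survival times, events, or censoring are simulated.

```python
import random

def sample_low_dim_row(rng):
    """Draw one subject's covariates from the listed marginals (illustrative)."""
    return {
        "x.0": rng.randint(0, 1),                # binary, equal probabilities
        "x.1": rng.gauss(50, 15),                # Gaussian, mu = 50, sigma = 15
        "x.2": rng.choice([1, 2, 3, 4]),         # assumed discrete uniform
        "x.3": 0 if rng.random() < 0.6 else 1,   # binary, 0.6 chance of 0
        "x.4": rng.choice([1, 2, 3]),            # assumed discrete uniform
    }

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
rows = [sample_low_dim_row(rng) for _ in range(3000)]

mean_x1 = sum(r["x.1"] for r in rows) / len(rows)
frac_x3_zero = sum(r["x.3"] == 0 for r in rows) / len(rows)
print(round(mean_x1, 1), round(frac_x3_zero, 2))
```

With 3000 subjects, the sample mean of x.1 should sit close to 50 and the share of zeros in x.3 close to 0.6, which gives a simple check when comparing against the published files.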

For the medium dimension batch:
x.0   Binary: with equal probabilities
x.1   Gaussian: μ = 50, σ = 15
x.2   Uniform: [1,2,3,4]
x.3   Binary: 0.6 chance of 0
x.4   Uniform: [1,2,3]
x.5   Binary: 0.95 chance of 0
x.6   Binary: 0.9 chance of 0
x.7   Binary: 0.85 chance of 0
x.8   Binary: 0.8 chance of 0
x.9   Binary: 0.75 chance of 0
x.10  Binary: 0.7 chance of 0
x.11  Binary: 0.65 chance of 0
x.12  Binary: 0.6 chance of 0
x.13  Binary: 0.55 chance of 0
x.14  Binary: 0.5 chance of 0
x.15  Binary: 0.5 chance of 0
x.16  Binary: 0.45 chance of 0
x.17  Binary: 0.4 chance of 0
x.18  Binary: 0.35 chance of 0
x.19  Binary: 0.3 chance of 0
x.20  Binary: 0.25 chance of 0
x.21  Binary: 0.2 chance of 0
x.22  Binary: 0.15 chance of 0
x.23  Binary: 0.1 chance of 0
x.24  Binary: 0.05 chance of 0
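
The binary covariates x.5 to x.24 listed above follow a regular pattern: the chance of 0 falls from 0.95 to 0.5 across x.5 to x.14, then from 0.5 to 0.05 across x.15 to x.24, so 0.5 appears twice. The sketch below rebuilds that table and samples one subject's binary block. The names `P_ZERO` and `sample_binary_block` are hypothetical, and this is an illustration of the listed marginals rather than the creators' code.

```python
import random

# P(x.i = 0) for i = 5..24, reconstructed from the list above.
P_ZERO = {f"x.{i}": round(0.95 - 0.05 * (i - 5), 2) for i in range(5, 15)}
P_ZERO.update({f"x.{i}": round(0.50 - 0.05 * (i - 15), 2) for i in range(15, 25)})

def sample_binary_block(rng):
    """Draw x.5..x.24 independently; independence is an assumption."""
    return {name: 0 if rng.random() < p0 else 1 for name, p0 in P_ZERO.items()}

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
block = sample_binary_block(rng)
print(len(block), P_ZERO["x.5"], P_ZERO["x.14"], P_ZERO["x.15"], P_ZERO["x.24"])
```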


Relevant Papers:

A submission to Nature is in progress



Citation Request:

Please cite the associated paper when using this dataset.

