Center for Machine Learning and Intelligent Systems



3W dataset Data Set

Abstract: The first realistic and public dataset with rare undesirable real events in oil wells.

Data Set Characteristics: Multivariate, Time-Series
Number of Instances: 1984
Area: Computer
Attribute Characteristics: Integer, Real
Number of Attributes: 8
Date Donated: 2019-08-15
Associated Tasks: Classification, Clustering
Missing Values? Yes
Number of Web Hits: 518


Source:

Ricardo Emanuel Vaz Vargas (ricardo.vargas '@' petrobras.com.br) a, b
Celso José Munaro (munaro '@' ele.ufes.br) a
Patrick Marques Ciarelli (patrick.ciarelli '@' ufes.br) a
André Gonçalves Medeiros (andremedeiros '@' petrobras.com.br) b
Bruno Guberfain do Amaral (bruno.do.amaral '@' petrobras.com.br) c
Daniel Centurion Barrionuevo (dcbarrionuevo '@' petrobras.com.br) d
Jean Carlos Dias de Araújo (jeanaraujo '@' petrobras.com.br) b
Jorge Lins Ribeiro (jorge_lribeiro '@' petrobras.com.br) b
Lucas Pierezan Magalhães (lucas.magalhaes '@' petrobras.com.br) c
---
a Departamento de Engenharia Elétrica, Universidade Federal do Espírito Santo, Av. Fernando Ferrari, 514, Goiabeiras, Vitória - ES - Brasil, CEP: 29060-370.
b Petróleo Brasileiro S.A., Av. Nossa Sra. da Penha, 1688, Barro Vermelho, Vitória - ES - Brasil, CEP: 29057-570.
c Petróleo Brasileiro S.A., Rua Ulysses Guimarães, 565, Cidade Nova, Rio de Janeiro – RJ - Brasil, CEP: 20211-160.
d Petróleo Brasileiro S.A., Cidade Universitária, Rio de Janeiro - RJ - Brasil, CEP: 21941-970.


Data Set Information:

To the best of its authors' knowledge, this is the first realistic and public dataset with rare undesirable real events in oil wells that can be readily used as a benchmark for developing machine learning techniques that address the difficulties inherent in real data.

More information about the theory behind this dataset is available in the paper 'A realistic and public dataset with rare undesirable real events in oil wells' published in the Journal of Petroleum Science and Engineering ([Web Link]). Specific challenges (benchmarks) that practitioners and researchers can use together with the 3W dataset are defined and proposed in this paper.

The 3W dataset consists of 1,984 CSV files structured as follows. Because of GitHub's limitations, the dataset is stored as automatically split 7z files in the data directory, and these files must be decompressed before the dataset is used. After decompression, each subdirectory name is the label of the instances it contains. Each file represents one instance, and its filename reveals its source. All files follow the same format: one observation per line and one series per column, with columns separated by commas and decimals marked by periods. The first column contains timestamps, the last column contains the observations' labels, and the remaining columns are the Multivariate Time Series (MTS), i.e. the instance itself.
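
The file layout described above can be read with a few lines of Python. The following is a minimal sketch using only the standard library, not the authors' own loader. The function names parse_instance and iter_instances are hypothetical, and the has_header flag is an assumption: the description does not say whether each file starts with a line of column names. Empty cells are treated as missing values and returned as None.

```python
import csv
import io
from pathlib import Path
from typing import Iterator, List, Optional, Tuple


def _to_float(cell: str) -> Optional[float]:
    """Convert a CSV cell to float; an empty cell (missing value) becomes None."""
    cell = cell.strip()
    return float(cell) if cell else None


def parse_instance(
    text: str, has_header: bool = True
) -> Tuple[List[str], List[List[Optional[float]]], List[Optional[str]]]:
    """Split the text of one instance file into timestamps, MTS rows, and labels."""
    reader = csv.reader(io.StringIO(text))
    if has_header:
        next(reader, None)  # assumption: the first line names the columns
    timestamps: List[str] = []
    series: List[List[Optional[float]]] = []
    labels: List[Optional[str]] = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        timestamps.append(row[0])                         # first column: timestamp
        series.append([_to_float(c) for c in row[1:-1]])  # middle columns: the MTS
        labels.append(row[-1].strip() or None)            # last column: observation label
    return timestamps, series, labels


def iter_instances(root: Path) -> Iterator[Tuple[str, Path]]:
    """Yield (instance label, file path) pairs; the label is the subdirectory name."""
    for path in sorted(root.glob("*/*.csv")):
        yield path.parent.name, path
```

Here root would be the directory that holds the decompressed label subdirectories. Each yielded path can be read with path.read_text() and passed to parse_instance. Timestamps and labels are kept as strings so that no assumption is made about their exact encoding.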

The 3W dataset's files are in [Web Link], but we believe that the 3W dataset's publication in the UCI Machine Learning Repository benefits the machine learning community.


Attribute Information:

Pressure at the Permanent Downhole Gauge (PDG);
Pressure at the Temperature and Pressure Transducer (TPT);
Temperature at the TPT;
Pressure upstream of the Production Choke (PCK);
Temperature downstream of the PCK;
Pressure downstream of the Gas Lift Choke (GLCK);
Temperature downstream of the GLCK;
Gas Lift flow.


Relevant Papers:

'A realistic and public dataset with rare undesirable real events in oil wells' published in the Journal of Petroleum Science and Engineering ([Web Link]).




