3W dataset

Donated on 8/14/2019

The first realistic and public dataset with rare undesirable real events in oil wells.

Dataset Characteristics

Multivariate, Time-Series

Subject Area

Computer Science

Associated Tasks

Classification, Clustering

Feature Type

Integer, Real

# Instances

1984

# Features

8

Dataset Information

Additional Information

To the best of its authors' knowledge, this is the first realistic and public dataset with rare undesirable real events in oil wells that can be readily used as a benchmark for developing machine learning techniques that address the inherent difficulties of actual data. More information about the theory behind this dataset is available in the paper 'A realistic and public dataset with rare undesirable real events in oil wells', published in the Journal of Petroleum Science and Engineering (https://doi.org/10.1016/j.petrol.2019.106223). That paper also defines and proposes specific challenges (benchmarks) that practitioners and researchers can use together with the 3W dataset.

The 3W dataset consists of 1,984 CSV files. Due to GitHub's limitations, the dataset is kept as automatically split 7z files saved in the data directory, and these must be decompressed before the dataset is used. After decompression, each subdirectory name is the label of the instances it contains. Each file represents one instance, and its filename reveals its source.

All files are standardized as follows. There is one observation per line and one series per column. Columns are separated by commas, and decimals are separated by periods. The first column contains timestamps, the last column contains the observations' labels, and the remaining columns are the Multivariate Time Series (MTS), i.e. the instance itself.

The 3W dataset's files are available at https://github.com/ricardovvargas/3w_dataset, but we believe that publishing the 3W dataset in the UCI Machine Learning Repository benefits the machine learning community.
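The file layout described above can be read with the Python standard library alone. The following is a minimal sketch, not the authors' own loader. It assumes that each CSV file starts with a header row and that empty cells mark missing values; neither point is stated in this description. The function names `load_instance` and `to_float`, the column names `timestamp`, `a`, `b` and `class`, and the file `example.csv` are hypothetical, and the two-row file is synthetic, not real 3W data.

```python
import csv
import tempfile
from pathlib import Path


def to_float(cell):
    """Convert a cell to float; an empty cell becomes None (assumed missing)."""
    return float(cell) if cell.strip() else None


def load_instance(csv_path):
    """Read one instance file: first column timestamps, last column labels.

    Returns (instance_label, timestamps, series, observation_labels).
    """
    csv_path = Path(csv_path)
    # The subdirectory name is the instance's label.
    instance_label = csv_path.parent.name
    with csv_path.open(newline="") as handle:
        rows = list(csv.reader(handle))
    # Assumption: the first row is a header with one name per column.
    header, body = rows[0], rows[1:]
    timestamps = [row[0] for row in body]
    observation_labels = [row[-1] for row in body]
    # Every column between the first and the last is one series of the MTS.
    series = {
        name: [to_float(row[i]) for row in body]
        for i, name in enumerate(header[1:-1], start=1)
    }
    return instance_label, timestamps, series, observation_labels


# Synthetic file, only to exercise the function (not real 3W data).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "0"  # hypothetical label directory
    folder.mkdir()
    sample = folder / "example.csv"
    sample.write_text(
        "timestamp,a,b,class\n"
        "2019-01-01 00:00:00,1.5,,0\n"
        "2019-01-01 00:00:01,2.5,3.0,0\n"
    )
    label, timestamps, series, obs_labels = load_instance(sample)
```

Timestamps and labels are kept as strings here so that the sketch makes no claim about their exact format; convert them once the actual files have been inspected.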

Has Missing Values?


Variable Information

Pressure at the Permanent Downhole Gauge (PDG); Pressure at the Temperature and Pressure Transducer (TPT); Temperature at the TPT; Pressure upstream of the Production Choke (PCK); Temperature downstream of the PCK; Pressure downstream of the Gas Lift Choke (GLCK); Temperature downstream of the GLCK; Gas Lift flow.
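When exploring a file, it can help to pair each series column with its description from the list above. The sketch below assumes that the series columns appear in the files in the same order as the variables are listed here, which this description does not confirm. The names `VARIABLES` and `describe_columns` and the header names `v1` to `v8` are hypothetical.

```python
# Descriptions copied from the variable list above, in the listed order.
VARIABLES = [
    "Pressure at the Permanent Downhole Gauge (PDG)",
    "Pressure at the Temperature and Pressure Transducer (TPT)",
    "Temperature at the TPT",
    "Pressure upstream of the Production Choke (PCK)",
    "Temperature downstream of the PCK",
    "Pressure downstream of the Gas Lift Choke (GLCK)",
    "Temperature downstream of the GLCK",
    "Gas Lift flow",
]


def describe_columns(header):
    """Map each series column name to a description.

    The first column (timestamps) and the last column (observation labels)
    are skipped. Assumption: the column order matches the order of VARIABLES.
    """
    series_columns = header[1:-1]
    if len(series_columns) != len(VARIABLES):
        raise ValueError(
            f"expected {len(VARIABLES)} series columns, got {len(series_columns)}"
        )
    return dict(zip(series_columns, VARIABLES))


# Hypothetical header, used only to show the call.
example_header = ["timestamp"] + [f"v{i}" for i in range(1, 9)] + ["class"]
descriptions = describe_columns(example_header)
```

The length check is deliberate: if a file has a different number of series columns, the order assumption cannot hold and the mapping should not be trusted.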

