3W dataset
Donated on 8/14/2019
The first realistic and public dataset with rare undesirable real events in oil wells.
Dataset Characteristics
Multivariate, Time-Series
Subject Area
Computer Science
Associated Tasks
Classification, Clustering
Feature Type
Integer, Real
# Instances
1984
# Features
8
Dataset Information
Additional Information
To the best of its authors' knowledge, this is the first realistic and public dataset with rare undesirable real events in oil wells that can be readily used as a benchmark for developing machine learning techniques that deal with the inherent difficulties of actual data. More information about the theory behind this dataset is available in the paper 'A realistic and public dataset with rare undesirable real events in oil wells', published in the Journal of Petroleum Science and Engineering (https://doi.org/10.1016/j.petrol.2019.106223). That paper also defines and proposes specific challenges (benchmarks) that practitioners and researchers can use together with the 3W dataset.

The 3W dataset consists of 1,984 CSV files. Because of GitHub limitations, the dataset is stored as automatically split 7z files in the data directory, and these files must be decompressed before the dataset can be used. After decompression, the subdirectory names are the instances' labels. Each file represents one instance, and the filename reveals its source. All files follow the same format: there is one observation per line and one series per column. Columns are separated by commas and decimals are separated by periods. The first column contains timestamps, the last column contains the observations' labels, and the remaining columns are the Multivariate Time Series (MTS), i.e. the instance itself.

The 3W dataset's files are available at https://github.com/ricardovvargas/3w_dataset, but we believe that publishing the 3W dataset in the UCI Machine Learning Repository benefits the machine learning community.
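The file layout described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' own loader. It assumes each CSV file starts with a header row, which the description does not state. The helper names `load_instance` and `list_instances`, the file name `EXAMPLE_0001.csv`, the column names, and all values are hypothetical stand-ins for the decompressed data directory.

```python
import csv
import tempfile
from pathlib import Path


def load_instance(csv_path):
    """Split one instance file into header, timestamps, series rows, and labels."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)  # comma-separated; decimals use periods
        header = next(reader)        # assumption: the first line is a header row
        rows = list(reader)
    timestamps = [row[0] for row in rows]    # first column: timestamps
    series = [row[1:-1] for row in rows]     # middle columns: the MTS
    obs_labels = [row[-1] for row in rows]   # last column: observation labels
    return header, timestamps, series, obs_labels


def list_instances(data_dir):
    """Return (instance_label, path) pairs; the label is the subdirectory name."""
    return [(p.parent.name, p) for p in sorted(Path(data_dir).glob("*/*.csv"))]


# Tiny synthetic stand-in for the decompressed data directory (made-up values).
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "0").mkdir()
(demo_dir / "0" / "EXAMPLE_0001.csv").write_text(
    "timestamp,var_a,var_b,class\n"
    "2019-01-01 00:00:00,1.5,20.25,0\n"
    "2019-01-01 00:00:01,1.6,,0\n"
)

instances = list_instances(demo_dir)
label, path = instances[0]
header, timestamps, series, obs_labels = load_instance(path)
print(label, header[1:-1], series, obs_labels)
```

Fields are kept as strings here so that empty entries stay visible; converting them to numbers is a separate step that depends on how missing values should be treated.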
Has Missing Values?
Yes
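Since the dataset contains missing values, it may help to measure how much of each series is missing before modelling. The sketch below is one possible way to do that, assuming missing observations appear as empty fields (or the text `NaN`) in the CSV files; the page does not say how they are encoded. The function names, column names, and values are hypothetical.

```python
import math


def to_float(field):
    """Convert a CSV field to float; empty fields become math.nan."""
    text = field.strip()
    return float(text) if text else math.nan


def missing_fraction(columns):
    """Return the fraction of missing (NaN) values for each named column."""
    return {
        name: sum(math.isnan(v) for v in values) / len(values)
        for name, values in columns.items()
    }


# Made-up example values; the column names are hypothetical.
raw = {"var_a": ["1.5", "1.6", "", "1.8"], "var_b": ["", "", "", ""]}
columns = {name: [to_float(f) for f in fields] for name, fields in raw.items()}
fractions = missing_fraction(columns)
print(fractions)
```

A series that is entirely missing in an instance, like `var_b` in this toy input, would typically need different handling from one with occasional gaps.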
Variable Information
Pressure at the Permanent Downhole Gauge (PDG); Pressure at the Temperature and Pressure Transducer (TPT); Temperature at the TPT; Pressure upstream of the Production Choke (PCK); Temperature downstream of the PCK; Pressure downstream of the Gas Lift Choke (GLCK); Temperature downstream of the GLCK; Gas Lift flow.
Dataset Files
File | Size |
---|---|
readme.txt | 127 Bytes |
pip install ucimlrepo
from ucimlrepo import fetch_ucirepo

# fetch dataset (Python identifiers cannot start with a digit, hence dataset_3w)
dataset_3w = fetch_ucirepo(id=540)

# data (as pandas dataframes)
X = dataset_3w.data.features
y = dataset_3w.data.targets

# metadata
print(dataset_3w.metadata)

# variable information
print(dataset_3w.variables)
3W dataset [Dataset]. (2019). UCI Machine Learning Repository. https://doi.org/10.24432/C54W4M.
DOI
10.24432/C54W4M
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows the dataset to be shared and adapted for any purpose, provided that appropriate credit is given.