Water Quality Prediction
Donated on 5/14/2022
Here we want to forecast the spatio-temporal water quality in terms of the "power of hydrogen (pH)" value for the next day based on the input data, which is the historical data of other water measurement indices. The input data consist of daily samples for 36 sites, providing measurements related to pH values in Georgia, USA. The input features consist of 11 common indices including volume of dissolved oxygen, temperature, and specific conductance (see details in dataset). The output to predict is the measurement of 'pH, water, unfiltered, field, standard units (Median)'. There are two major water systems to consider: one is centered on the city of Atlanta while the other is centered on the eastern coast of Georgia. This grouping indicates a spatial dependency among locations, which is important to the forecast. For details of the data description, please refer to the file named README.docx.
The 11 input features are:
'Specific conductance, water, unfiltered, microsiemens per centimeter at 25 degrees Celsius (Maximum)'
'pH, water, unfiltered, field, standard units (Maximum)'
'pH, water, unfiltered, field, standard units (Minimum)'
'Specific conductance, water, unfiltered, microsiemens per centimeter at 25 degrees Celsius (Minimum)'
'Specific conductance, water, unfiltered, microsiemens per centimeter at 25 degrees Celsius (Mean)'
'Dissolved oxygen, water, unfiltered, milligrams per liter (Maximum)'
'Dissolved oxygen, water, unfiltered, milligrams per liter (Mean)'
'Dissolved oxygen, water, unfiltered, milligrams per liter (Minimum)'
'Temperature, water, degrees Celsius (Mean)'
'Temperature, water, degrees Celsius (Minimum)'
'Temperature, water, degrees Celsius (Maximum)'
Dataset Characteristics
Other
Subject Area
Computer Science
Associated Tasks
Regression
Feature Type
-
# Instances
705
# Features
-
Dataset Information
For what purpose was the dataset created?
The goal is to predict the spatio-temporal water quality in terms of the power of hydrogen (pH) value for the next day based on the historical data of water measurement indices.
Who funded the creation of the dataset?
National Science Foundation
What do the instances in this dataset represent?
For each instance, the input features consist of 11 common indices including volume of dissolved oxygen, temperature, and specific conductance (see details in dataset). The output to predict is the measurement of 'pH, water, unfiltered, field, standard units (Median)'.
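The next-day setup described above can be sketched for a single site as follows. This is a minimal illustration under stated assumptions, not the dataset's documented layout: the function name make_next_day_pairs, the per-site list structure, and the toy numbers are all hypothetical, and the .mat file may already ship with inputs and targets aligned (README.docx is the authority on that).

```python
def make_next_day_pairs(features, ph_median):
    """Pair each day's feature vector with the following day's median pH.

    features:  list of per-day feature vectors for ONE site, in date order
    ph_median: list of that site's median pH values, same length and order
    (Hypothetical layout, assumed for illustration only.)
    """
    if len(features) != len(ph_median):
        raise ValueError("features and ph_median must cover the same days")
    # Day t's features predict day t+1's pH, so the last day has no target.
    return [(features[t], ph_median[t + 1]) for t in range(len(features) - 1)]


# Toy values for illustration only (not taken from the dataset).
toy_features = [[8.1, 20.5, 110.0], [7.9, 21.0, 112.0], [8.3, 19.8, 108.0]]
toy_ph = [7.1, 7.2, 7.0]
pairs = make_next_day_pairs(toy_features, toy_ph)
```

Three days of observations yield two training pairs, because the final day has no following-day pH to serve as a target. Repeating this per site keeps the temporal order within each location intact.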
Has Missing Values?
No
Introductory Paper
By Liang Zhao, Olga Gkountouna, D. Pfoser. 2019
Published in ACM Trans. Spatial Algorithms Syst.
Dataset Files
File | Size |
---|---|
water_dataset.mat | 984.4 KB |
README.docx | 16.1 KB |
__MACOSX/._water_dataset.mat | 588 Bytes |
__MACOSX/._README.docx | 548 Bytes |
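Because the data ship as a MATLAB file, it can help to inspect water_dataset.mat directly before modelling. The sketch below uses SciPy's loadmat and whosmat; the helper names summarize_mat and load_mat_variables are hypothetical, and no variable names inside the file are assumed, since this page does not list them. It also assumes the file is in a pre-v7.3 MAT format, which loadmat can read; a v7.3 (HDF5-based) file would need a different reader.

```python
from scipy.io import loadmat, whosmat


def summarize_mat(path):
    """Return {variable_name: (shape, matlab_class)} without loading the data."""
    return {name: (shape, cls) for name, shape, cls in whosmat(path)}


def load_mat_variables(path):
    """Load all variables, skipping loadmat's '__header__'-style metadata keys."""
    contents = loadmat(path)
    return {key: value for key, value in contents.items() if not key.startswith("__")}
```

Calling summarize_mat("water_dataset.mat") after downloading the file lists each stored array with its shape, which can then be matched against the description in README.docx.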
pip install ucimlrepo

from ucimlrepo import fetch_ucirepo

# fetch dataset
water_quality_prediction = fetch_ucirepo(id=733)

# data (as pandas dataframes)
X = water_quality_prediction.data.features
y = water_quality_prediction.data.targets

# metadata
print(water_quality_prediction.metadata)

# variable information
print(water_quality_prediction.variables)
Zhao, L. (2019). Water Quality Prediction [Dataset]. UCI Machine Learning Repository. https://doi.org/10.1145/3339823.
Creators
Liang Zhao
liang.zhao@emory.edu
DOI
10.1145/3339823
License
This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
This allows for the sharing and adaptation of the datasets for any purpose, provided that the appropriate credit is given.