Paddy Dataset

Donated on 7/14/2025

Agriculture occupies roughly a third of Earth's land surface and is vital for food production. Rice, grown from paddy seeds, feeds nearly half the global population. To help meet rising food demand, this study uses Machine Learning (ML) to predict the factors that affect paddy growth, with the aim of improving rice production. A Hybrid ML Model with Combined Wrapper Feature Selection (HMLCWFS) was developed to address challenges such as overfitting and computational cost. Five Feature Selection (FS) methods were applied: Backward Elimination, Stepwise Forward Selection, Feature Importance, Exhaustive FS, and Gradient Boosting. The features selected by each method were merged using Poincaré’s formula to form a refined dataset, on which ML models such as Decision Tree, Random Forest, SVM, KNN, and Naive Bayes were trained and tested. The model not only forecasts yield but also recommends paddy varieties based on farmers' preferences. The results show that combined FS techniques effectively identify key factors for improving paddy productivity.
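Poincaré's formula is the inclusion-exclusion principle for the size of a union of sets, so the merging step described above can be illustrated with plain set operations. The following is a minimal sketch, not the authors' implementation: it assumes each FS method returns a set of column names, the subsets shown are hypothetical placeholders rather than results from the paper, and `merge_selected` and `union_size_inclusion_exclusion` are illustrative names.

```python
from itertools import combinations


def merge_selected(subsets):
    """Return the union of the feature subsets chosen by each FS method."""
    merged = set()
    for subset in subsets:
        merged |= set(subset)
    return merged


def union_size_inclusion_exclusion(subsets):
    """Size of the union via Poincaré's (inclusion-exclusion) formula.

    |A1 ∪ ... ∪ An| = Σ|Ai| - Σ|Ai ∩ Aj| + Σ|Ai ∩ Aj ∩ Ak| - ...
    """
    sets = [set(s) for s in subsets]
    total = 0
    for k in range(1, len(sets) + 1):
        # Odd-sized intersections are added, even-sized ones subtracted.
        sign = 1 if k % 2 == 1 else -1
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total


# Hypothetical subsets: which columns each method picks is an assumption
# made for illustration; only the column names come from the dataset page.
selected = {
    "backward_elimination": {"Hectares", "Seedrate(in Kg)", "DAP_20days"},
    "forward_selection": {"Hectares", "Variety", "DAP_20days"},
    "feature_importance": {"Soil Types", "Variety", "Hectares"},
}

merged = merge_selected(selected.values())
count = union_size_inclusion_exclusion(list(selected.values()))
```

Here `count` equals `len(merged)`: the formula gives the number of distinct features in the refined dataset without double-counting features picked by more than one method.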

Dataset Characteristics

Tabular

Subject Area

Computer Science

Associated Tasks

Classification, Regression, Clustering

Feature Type

Categorical

# Instances

2790

# Features

45

Dataset Information

Has Missing Values?

No

Introductory Paper

A Hybrid Machine Learning Model with Combined Wrapper Feature Selection Techniques to Improve the Yield of Paddy

By Muthukumaran S, John Peter K, Dilipkumar E, Savithri S, Senbagam K. 2023

Published in International Journal of Electronics and Communication Engineering

Variables Table

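| Variable Name | Role | Type | Description | Units | Missing Values |
| --- | --- | --- | --- | --- | --- |
| Hectares | Feature | Integer | | | no |
| Agriblock | Feature | Categorical | | | no |
| Variety | Feature | Categorical | | | no |
| Soil Types | Feature | Categorical | | | no |
| Seedrate(in Kg) | Feature | Integer | | | no |
| LP_Mainfield(in Tonnes) | Feature | Continuous | | | no |
| Nursery | Feature | Categorical | | | no |
| Nursery area (Cents) | Feature | Integer | | | no |
| LP_nurseryarea(in Tonnes) | Feature | Integer | | | no |
| DAP_20days | Feature | Integer | | | no |

Showing the first 10 of 45 variables.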

Additional Variable Information

LP_nurseryarea(in Tonnes): manure used for land preparation.
DAP_20days: DAP applied during the first 20 days.

Class Labels

Agriblock, Variety of Paddy, Soil Types, Type of Nursery, LP_nurseryarea(in Tonnes), DAP_20days

Dataset Files

| File | Size |
| --- | --- |
| paddydataset.csv | 515.5 KB |
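A minimal sketch of reading `paddydataset.csv` with Python's standard `csv` module. It assumes the file is comma-delimited, has a header row whose names match the variables table, and stores Integer columns as plain integers; none of this is confirmed on this page. `load_paddy` and `COLUMN_TYPES` are illustrative names, and the type map covers only the ten variables listed above.

```python
import csv

# Types for the ten variables shown in the variables table. The other
# 35 columns are not listed on this page, so they are left as strings.
COLUMN_TYPES = {
    "Hectares": int,
    "Agriblock": str,
    "Variety": str,
    "Soil Types": str,
    "Seedrate(in Kg)": int,
    "LP_Mainfield(in Tonnes)": float,
    "Nursery": str,
    "Nursery area (Cents)": int,
    "LP_nurseryarea(in Tonnes)": int,
    "DAP_20days": int,
}


def load_paddy(lines):
    """Parse CSV lines into a list of dicts with typed values.

    Header names are stripped in case they carry stray spaces (an
    assumption); columns without a known type stay as strings.
    """
    reader = csv.reader(lines)
    header = [name.strip() for name in next(reader)]
    rows = []
    for record in reader:
        if not record:  # skip blank lines
            continue
        row = {}
        for name, value in zip(header, record):
            cast = COLUMN_TYPES.get(name, str)
            row[name] = cast(value.strip())
        rows.append(row)
    return rows
```

A typical call would be `with open("paddydataset.csv", newline="") as f: rows = load_paddy(f)`. If the header assumption holds, `len(rows)` should match the 2790 instances reported above.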

Creators

Muthukumaran Subramaniyan

muthumphil11@gmail.com

License
