Forty Soybean Cultivars from Subsequent Harvests

Donated on 10/28/2023

Soybean is one of the most important crops because it is used in several segments of the food industry. The evaluation of soybean cultivars subject to different planting and harvesting characteristics is an ongoing field of research. We present a dataset obtained from forty soybean cultivars planted in subsequent seasons. The experiment used randomized blocks, arranged in a split-plot scheme, with four replications. The following variables were collected: plant height, insertion of the first pod, number of stems, number of legumes per plant, number of grains per pod, thousand seed weight, and grain yield, resulting in 320 data samples. The dataset can be used by researchers from different fields.
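The sample count follows from the design described above: 40 cultivars, 2 seasons, and 4 replications give 40 × 2 × 4 = 320 plots. The sketch below simply enumerates that design as a sanity check. The cultivar labels are placeholders invented for illustration, not the names used in the dataset.

```python
from itertools import product

# Placeholder labels (assumption); the real dataset uses cultivar names.
cultivars = [f"cultivar_{i:02d}" for i in range(1, 41)]
seasons = [1, 2]
repetitions = [1, 2, 3, 4]

# One record per (season, cultivar, repetition) plot.
design = list(product(seasons, cultivars, repetitions))
print(len(design))
```

A loaded copy of the data can be compared against this count to confirm that no plot is missing or duplicated.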

Dataset Characteristics


Subject Area


Associated Tasks

Classification, Regression, Clustering, Other

Feature Type

Real, Categorical, Integer

# Instances

320

# Features

11

Dataset Information

For what purpose was the dataset created?

To study soybean cultivars harvested in subsequent seasons.

Who funded the creation of the dataset?

There was no cash funding, but Accert Pesquisa e Consultoria Agronomia, located in Balsas, Maranhão, Brazil, supported the experiments.

What do the instances in this dataset represent?

The average values of 10 plants per plot at harvest (phase R8).

Are there recommended data splits?

We recommend cross-validation stratified by cultivar, so that no cultivar appears in both the training and test sets of the same split.
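Keeping every row of a cultivar on one side of each split is commonly called grouped cross-validation. The following is a minimal standard-library sketch of that idea, not the authors' procedure: the function name `grouped_k_fold`, the round-robin fold assignment, and the placeholder cultivar labels are all assumptions made for illustration.

```python
from collections import defaultdict

def grouped_k_fold(groups, n_splits=5):
    """Yield (train_idx, test_idx) so that no group appears on both sides."""
    # Collect the row indices of each group, in first-seen order.
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    unique_groups = list(by_group)
    if n_splits > len(unique_groups):
        raise ValueError("n_splits cannot exceed the number of groups")
    # Assign whole groups to folds round-robin.
    folds = [unique_groups[i::n_splits] for i in range(n_splits)]
    for held_out in folds:
        held = set(held_out)
        test_idx = sorted(i for g in held_out for i in by_group[g])
        train_idx = [i for i, g in enumerate(groups) if g not in held]
        yield train_idx, test_idx

# Toy group labels (assumption): 2 seasons x 40 cultivars x 4 repetitions.
groups = [f"cultivar_{c:02d}"
          for _season in (1, 2)
          for c in range(1, 41)
          for _rep in range(4)]
splits = list(grouped_k_fold(groups, n_splits=5))
```

With 40 cultivars and 5 folds, each test fold holds 8 cultivars (64 rows) and each training fold the remaining 32 (256 rows). If scikit-learn is available, `sklearn.model_selection.GroupKFold` provides the same kind of split.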

Does the dataset contain data that might be considered sensitive in any way?

No data is confidential.

Was there any data preprocessing performed?

The data presented are raw.

Additional Information

More details about the dataset can be found in the published article listed under Introductory Paper.

Has Missing Values?


Introductory Paper

Forty soybean cultivars from subsequent harvests

By Bruno Rodrigues de Oliveira, Alan Mario Zuffo, Francisco Charles dos Santos Silva, Ricardo Mezzomo, Leandra Matos Barrozo, Tatiane Scilewski da Costa Zanatta, Joel Cabral dos Santos, Carlos Henrique Conceição Sousa, Yago Pinto Coelho. 2023

Published in Balsas, MA, Brazil

Variables Table

Variable Name | Role | Type | Description | Units | Missing Values
Season | Feature | Integer | 1 or 2 | | no
Cultivar | Feature | Categorical | Cultivar names | | no
Repetition | Feature | Integer | 1, 2, 3 or 4 | | no
PH | Feature | Continuous | Plant height, measured from the soil surface to the insertion of the last leaf using a millimeter ruler | cm | no
IFP | Feature | Continuous | Insertion of the first pod, measured from the soil surface to the insertion of the first pod | cm | no
NLP | Feature | Continuous | Number of legumes (pods) per plant, by manual counting | | no
NGP | Feature | Continuous | Number of grains per plant, by manual counting | | no
NGL | Feature | Continuous | Number of grains per pod, by manual counting | | no
NS | Feature | Continuous | Number of stems, by manual counting | | no
MHG | Feature | Continuous | Thousand seed weight, according to the methodology described in Brasil (2009) | g | no
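The types and value ranges in the table can be enforced when reading the data. The sketch below assumes a CSV export whose header names match the variable names above; the file layout, the `PlotRecord` class, the `parse_row` helper, and the sample values are illustrative assumptions, and the count columns are omitted for brevity.

```python
import csv
import io
from dataclasses import dataclass

@dataclass
class PlotRecord:
    season: int      # 1 or 2
    cultivar: str    # cultivar name
    repetition: int  # 1, 2, 3 or 4
    ph: float        # plant height, cm
    ifp: float       # insertion of the first pod, cm
    mhg: float       # thousand seed weight, g

def parse_row(row):
    """Convert one CSV row (a dict of strings) into a validated PlotRecord."""
    season = int(row["Season"])
    repetition = int(row["Repetition"])
    if season not in (1, 2):
        raise ValueError(f"unexpected Season: {season}")
    if repetition not in (1, 2, 3, 4):
        raise ValueError(f"unexpected Repetition: {repetition}")
    return PlotRecord(season, row["Cultivar"], repetition,
                      float(row["PH"]), float(row["IFP"]), float(row["MHG"]))

# Made-up sample row, for illustration only; not taken from the dataset.
sample = ("Season,Cultivar,Repetition,PH,IFP,MHG\n"
          "1,EXAMPLE-CULTIVAR,1,58.8,15.2,152.2\n")
records = [parse_row(r) for r in csv.DictReader(io.StringIO(sample))]
```

Rejecting out-of-range Season or Repetition values early makes a mislabeled or shifted column visible before any modeling step.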

Showing 10 of 11 variables.




Bruno Rodrigues de Oliveira

Editora Pantanal

Alan Mario Zuffo

State University of Maranhão

