Evaluation of resampling techniques for artificial neural network based identification of promising genotypes in sugarcane (Saccharum officinarum L.) varietal trials

Hasan Syed Sarfaraz, Baitha Arun1, Gangwar Lal Singh, Kumar Sanjeev2,*

Agricultural Knowledge Management Unit, ICAR-Indian Institute of Sugarcane Research, Raebareli Road, PO Dilkusha, Lucknow, 226 002, India
1Division of Crop Protection, ICAR-Indian Institute of Sugarcane Research, Raebareli Road, PO Dilkusha, Lucknow, 226 002, India
2Division of Crop Improvement, ICAR-Indian Institute of Sugarcane Research, Raebareli Road, PO Dilkusha, Lucknow, 226 002, India
*Corresponding Author: Sanjeev Kumar, Division of Crop Improvement, ICAR-Indian Institute of Sugarcane Research, Raebareli Road, PO Dilkusha, 226 002, Lucknow, India, E-Mail: skiisr@rediffmail.com
Published online on 14 August 2024.

Abstract

Identifying promising genotypes in varietal trials is one of many applications in the agriculture domain that call for an artificial neural network (ANN) to support intelligent decisions. However, the varietal trial data available for identification are usually imbalanced, which poses challenges for neural network classification. For example, only 33 genotypes were identified as promising in the zonal varietal trials of AICRP on Sugarcane during 2016-21, against 148 non-promising genotypes. A neural network trained on such a class-imbalanced dataset tends to favour the majority class in its predictions. Resampling techniques adjust the ratio between the classes, making the data more balanced. The study evaluated four resampling techniques, viz. random under-sampling, random over-sampling, ensemble, and SMOTE (synthetic minority over-sampling technique), for balancing a varietal trial dataset used to build an ANN that identifies promising genotypes in sugarcane. The paper describes the methodology used to build such a model with resampling techniques and then presents the comparative performance of these approaches in identifying promising genotypes. Results indicate that SMOTE and random over-sampling performed well for balancing datasets when developing a neural network model, compared with no resampling of the imbalanced dataset. SMOTE outperformed the other resampling techniques, achieving high precision, recall and F1 score values for both the positive and negative classes. However, the ensemble and random under-sampling methods did not perform as well as SMOTE and random over-sampling. The study will be useful in developing artificial intelligence-based tools to identify promising genotypes in varietal trials of sugarcane in particular and of other crops in general.

Keywords

Artificial neural network, Resampling, Sugarcane, Varietal trial, Machine learning.
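As a rough illustration of how the two random resampling strategies named in the abstract change the class ratio, the sketch below balances a placeholder dataset with the reported class sizes (33 promising, 148 non-promising). It is not the authors' implementation: the function names `random_oversample` and `random_undersample`, the integer sample identifiers, and the fixed seed are all assumptions made for this example, and SMOTE and the ensemble method are not shown.

```python
import random
from collections import Counter


def _group_by_class(samples, labels):
    """Collect samples into a dict keyed by class label."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return groups


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    target = max(len(items) for items in groups.values())
    out_samples, out_labels = [], []
    for label, items in groups.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


def random_undersample(samples, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    target = min(len(items) for items in groups.values())
    out_samples, out_labels = [], []
    for label, items in groups.items():
        # Sample without replacement so no genotype is kept twice.
        out_samples.extend(rng.sample(items, target))
        out_labels.extend([label] * target)
    return out_samples, out_labels


# Placeholder data: integer ids stand in for genotype feature vectors (hypothetical).
labels = ["promising"] * 33 + ["non-promising"] * 148
samples = list(range(len(labels)))

over_x, over_y = random_oversample(samples, labels)
under_x, under_y = random_undersample(samples, labels)

print("original:     ", dict(Counter(labels)))
print("over-sampled: ", dict(Counter(over_y)))
print("under-sampled:", dict(Counter(under_y)))
```

With these class sizes, over-sampling yields 148 samples per class and under-sampling 33 per class. The imbalance matters because a classifier that always predicts "non-promising" would already score 148/181, roughly 82%, on accuracy while recalling none of the promising genotypes, which is why per-class precision, recall and F1 are the more informative measures here.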