Multi-Source Data Fusion and Machine Learning for Soybean Crop Price Forecasting in India

Vilas Damodhar Ghonge; Yogesh Kulkarni

doi:10.47852/bonviewJCCE62027619

Authors

Vilas Damodhar Ghonge Department of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, India https://orcid.org/0009-0004-8168-4973
Yogesh Kulkarni Department of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, India

Keywords:

soybean price prediction, time-series forecasting, agricultural market, Indian agriculture, machine learning

Abstract

Soybean is a major oilseed crop in India, and its market prices have exhibited significant volatility in recent years. Such price fluctuations create serious challenges for small and medium-scale farmers, so accurate price forecasting is essential to support informed decision-making by farmers and agri-business stakeholders. This study forecasts soybean prices in the Indian market using data spanning January 2015 to June 2025. The dataset integrates multiple heterogeneous sources: daily market data from Agmarknet, weather information from the India Meteorological Department, regional production statistics, and trade statistics. Extensive exploratory data analysis is conducted to examine price distributions, temporal trends, regional variations, and inter-variable relationships. Several predictive approaches are evaluated, including a traditional time-series model (Autoregressive Integrated Moving Average, ARIMA), ensemble machine learning models, and deep learning models. In addition, a hybrid Long Short-Term Memory–Gated Recurrent Unit (LSTM–GRU) framework, AgroNET, is proposed to model complex temporal dependencies across the heterogeneous data sources. Model performance is assessed using k-fold cross-validation and measured by root mean squared error (RMSE), mean squared error (MSE), mean absolute percentage error (MAPE), and the coefficient of determination (R²) in a Python-based implementation. The results demonstrate that deep learning-based models outperform conventional approaches, with AgroNET achieving the highest R² and the lowest error values. To enhance model transparency, explainable artificial intelligence based on the local interpretable model-agnostic explanations (LIME) technique is incorporated to identify the key factors influencing individual price predictions. Overall, the proposed framework offers an effective and interpretable solution for soybean price forecasting in India and supports future research on multi-source integration and real-time agricultural price prediction.
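The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `regression_metrics` and the sample prices are hypothetical, chosen only to show how the metrics are commonly computed. It assumes that all actual prices are positive (so that MAPE is defined) and not all identical (so that R² is defined).

```python
import math

def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAPE (in percent), and R^2 for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # Mean squared error and its square root.
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # MAPE is undefined when an actual value is zero; prices are assumed positive.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot  # undefined if all actual values are identical

    return {"mse": mse, "rmse": rmse, "mape": mape, "r2": r2}

# Hypothetical prices for illustration only; not data from the study.
actual = [4000.0, 4200.0, 4400.0, 4600.0]
predicted = [4100.0, 4100.0, 4500.0, 4500.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE is expressed in the same unit as the price, which makes it easier to read than MSE, while MAPE and R² are scale-free and can be compared across markets with different price levels.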

Received: 9 September 2025 | Revised: 12 January 2026 | Accepted: 11 March 2026

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

Data Availability Statement

Data are available from the corresponding author upon reasonable request.

Author Contribution Statement

Vilas Damodhar Ghonge: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization. Yogesh Kulkarni: Conceptualization, Methodology, Investigation, Resources, Supervision.
