Multi-Source Data Fusion and Machine Learning for Soybean Crop Price Forecasting in India
DOI:
https://doi.org/10.47852/bonviewJCCE62027619
Keywords:
soybean price prediction, time-series forecasting, agricultural market, Indian agriculture, machine learning
Abstract
Soybean is a major oilseed crop in India, and its market prices have been highly volatile in recent years. These fluctuations create serious challenges for small and medium-scale farmers, so accurate price forecasting is essential for informed decision-making by farmers and agribusiness stakeholders. This study forecasts soybean prices in the Indian market using data spanning January 2015 to June 2025. The dataset integrates multiple heterogeneous sources, including daily market data from Agmarknet, weather information from the India Meteorological Department, regional production, and trade statistics. Extensive exploratory data analysis examines price distributions, temporal trends, regional variations, and inter-variable relationships. Several predictive approaches are evaluated, including traditional time-series models (Autoregressive Integrated Moving Average, ARIMA), ensemble machine learning models, and deep learning models. In addition, a hybrid Long Short-Term Memory–Gated Recurrent Unit (LSTM–GRU) framework, AgroNET, is proposed to model complex temporal dependencies across heterogeneous data sources. The models are implemented in Python, validated with k-fold cross-validation, and scored using root mean squared error (RMSE), mean squared error (MSE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). The results show that deep learning-based models outperform conventional approaches, with AgroNET achieving the highest R² and the lowest error values. To enhance model transparency, explainable artificial intelligence is incorporated through the Local Interpretable Model-agnostic Explanations (LIME) technique, which identifies the key factors influencing individual price predictions.
Overall, the proposed framework offers an effective and interpretable solution for soybean price forecasting in India and supports future research on multi-source integration and real-time agricultural price prediction.
Received: 9 September 2025 | Revised: 12 January 2026 | Accepted: 11 March 2026
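The four evaluation metrics named in the abstract have standard definitions. The following minimal sketch shows how they could be computed with NumPy. It is not the authors' implementation: the function name `regression_metrics` and the sample prices are hypothetical and chosen only for illustration, and the sketch assumes all actual prices are positive so that the percentage error is defined.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAPE (in percent), and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; assumes every value in y_true is non-zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")

    residuals = y_true - y_pred

    # Mean squared error and its square root.
    mse = float(np.mean(residuals ** 2))
    rmse = float(np.sqrt(mse))

    # Mean absolute percentage error, expressed in percent.
    mape = float(np.mean(np.abs(residuals / y_true)) * 100.0)

    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mape": mape, "r2": r2}


# Made-up prices (not data from the study), used only to exercise the function.
actual = [4000.0, 4200.0, 4100.0, 4300.0]
predicted = [4100.0, 4100.0, 4150.0, 4250.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the two always rank models identically. MAPE and R² add scale-free views of the same residuals, which is useful when comparing markets with different price levels.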
Conflicts of Interest
The authors declare that they have no conflicts of interest to this work.
Data Availability Statement
Data are available from the corresponding author upon reasonable request.
Author Contribution Statement
Vilas Damodhar Ghonge: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization. Yogesh Kulkarni: Conceptualization, Methodology, Investigation, Resources, Supervision.
Published
2026-04-13
Issue
Section
Research Articles
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.
How to Cite
Ghonge, V. D., & Kulkarni, Y. (2026). Multi-Source Data Fusion and Machine Learning for Soybean Crop Price Forecasting in India. Journal of Computational and Cognitive Engineering. https://doi.org/10.47852/bonviewJCCE62027619