An Asynchronous Parallel Data Loading Optimization Algorithm for Deep Learning Applications

Authors

Xingjun Lin

DOI:

https://doi.org/10.47852/bonviewJDSIS62027528

Keywords:

asynchronous data loading, deep learning, data distribution invariance, optimization algorithm, concurrency

Abstract

This study addresses inefficient data loading, which makes deep learning workflows computationally wasteful and becomes a scalability bottleneck, especially under resource constraints. An asynchronous parallel data loading optimization algorithm is proposed that restructures the data-training pipeline so that multi-threaded data loading and model training run concurrently on multiple devices. A two-dimensional array structure and a specialized hash table ensure data distribution invariance and concurrency safety; these guarantees hold independently of the loading and training processes and are supported by a mathematical proof. Experiments on the CIFAR-10 dataset show that the method improves on state-of-the-art baselines, achieving a throughput of approximately 3,250 samples/second, 87% GPU utilization, and a 40% reduction in training time, while maintaining the statistical integrity of the original dataset. The proposed solution does not bind users to a specific framework and increases efficiency without requiring the purchase of expensive hardware. It makes deep learning more accessible and reproducible, while reducing the time cost of research and student training exercises.
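
The general idea of overlapping loading with training can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper: it uses a plain bounded queue rather than the two-dimensional array and hash table described above, and the names load_batch, async_batches, and run_epoch are hypothetical. Worker threads load batches in the background while the consumer iterates, and the final check only confirms that each sample index is delivered exactly once per epoch.

```python
import queue
import threading


def load_batch(indices):
    # Hypothetical stand-in for real I/O and decoding of samples.
    return [(i, i * i) for i in indices]


def async_batches(num_samples, batch_size, num_workers=4, prefetch=8):
    """Yield batches that background threads load into a bounded queue."""
    tasks = queue.Queue()
    for start in range(0, num_samples, batch_size):
        stop = min(start + batch_size, num_samples)
        tasks.put(list(range(start, stop)))

    ready = queue.Queue(maxsize=prefetch)  # bounded, so loaders cannot run far ahead
    done = object()  # sentinel: one per worker signals that it has finished

    def worker():
        while True:
            try:
                indices = tasks.get_nowait()
            except queue.Empty:
                break
            ready.put(load_batch(indices))  # blocks while the queue is full
        ready.put(done)

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()

    finished = 0
    while finished < num_workers:
        item = ready.get()
        if item is done:
            finished += 1
        else:
            yield item


def run_epoch(num_samples=1000, batch_size=32):
    seen = []
    for batch in async_batches(num_samples, batch_size):
        # A training step would run here while the workers keep loading.
        seen.extend(index for index, _ in batch)
    return seen
```

Batches may arrive in a different order on each run, so a check such as sorted(run_epoch()) == list(range(1000)) compares the set of delivered samples rather than their order. This is only a sanity check under the stated assumptions, not a substitute for the proof referred to in the abstract.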

Received: 31 August 2025 | Revised: 10 February 2026 | Accepted: 30 March 2026

Conflicts of Interest

The author declares that he has no conflicts of interest related to this work.

Data Availability Statement

The data that support the findings of this study are openly available at https://www.cs.toronto.edu/~kriz/cifar.html.

Author Contribution Statement

Xingjun Lin: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization, Supervision, Project administration, Funding acquisition.

Published

2026-05-29

Issue

Section

Research Articles

How to Cite

Lin, X. (2026). An Asynchronous Parallel Data Loading Optimization Algorithm for Deep Learning Applications. Journal of Data Science and Intelligent Systems. https://doi.org/10.47852/bonviewJDSIS62027528