A New Hybrid Wavelet Decomposition-based Networks for Script Identification in Scene Images
DOI:
https://doi.org/10.47852/bonviewAIA52023569
Keywords:
deep learning, dense network, inception architecture, attention model, scene text, script identification
Abstract
Script identification is challenging because of the unpredictable nature of scene text. This paper presents a new model for accurate script identification irrespective of intra- and inter-class variations. Distinct features that uniquely represent the scene text of different scripts are extracted by fusing an inception architecture, which captures multi-scale features, with a dense network, which captures fine-grained features. To strengthen feature extraction, the proposed work uses wavelet decomposition, which enhances fine details such as edges in the images. Furthermore, to extract text style, we propose a soft style attention module, which captures the unique style of scene text. These modules are integrated into a hybrid model for accurate script identification. To evaluate the proposed model, we conducted comprehensive experiments on the benchmark datasets CVSI2015, SIW-13, and MLe2e, and on combined datasets (formed by combining the distinct classes of all three benchmark datasets). The results on the different datasets show that the proposed model outperforms state-of-the-art methods in terms of accuracy.
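As an illustration of the wavelet decomposition step mentioned in the abstract, the following is a minimal sketch of a single-level 2D Haar decomposition in plain Python. The choice of the Haar wavelet, the function name `haar_dwt2`, and the subband names are assumptions made for illustration; the abstract does not state which wavelet family or implementation the authors use.

```python
def haar_dwt2(image):
    """Single-level 2D Haar decomposition of a grayscale image.

    `image` is a list of equal-length rows with even height and width.
    Returns four half-resolution subbands:
    (approximation, horizontal, vertical, diagonal).
    Subband naming conventions differ between libraries; these names
    are an illustrative assumption.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one non-overlapping 2x2 block.
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            # Orthonormal Haar: each coefficient is a signed sum scaled by 1/2.
            a_row.append((tl + tr + bl + br) / 2.0)  # smooth content
            h_row.append((tl + tr - bl - br) / 2.0)  # top vs. bottom row difference
            v_row.append((tl - tr + bl - br) / 2.0)  # left vs. right column difference
            d_row.append((tl - tr - bl + br) / 2.0)  # diagonal difference
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag
```

In this sketch a flat region produces zero detail coefficients, while a stroke boundary inside a 2x2 block produces a non-zero response in the matching detail subband, which is the sense in which the decomposition emphasizes edges.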
Received: 6 June 2024 | Revised: 12 November 2024 | Accepted: 31 December 2024
Conflicts of Interest
Palaiahnakote Shivakumara is the Editor-in-Chief and Umapada Pal is an Advisory Board Member for Artificial Intelligence and Applications; neither was involved in the editorial review or the decision to publish this article. The authors declare that they have no conflicts of interest related to this work.
Data Availability Statement
The data that support the findings of this study are openly available in CVSI 2015 at https://www.ict.griffith.edu.au/cvsi2015/Dataset.php, in GitHub at https://github.com/lluisgomez/script_identification, and in Kaggle at https://www.kaggle.com/datasets/ayush02102001/cvsi-script-identification-dataset.
Author Contribution Statement
Shivakumara Palaiahnakote: Methodology, Writing – original draft. Umapada Pal: Writing – review & editing, Visualization, Supervision. Taha Mansouri: Validation, Resources.
License
Copyright (c) 2025 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.