A New Hybrid Wavelet Decomposition-based Networks for Script Identification in Scene Images

Authors

  • Shivakumara Palaiahnakote School of Science, Engineering and Environment, University of Salford, UK https://orcid.org/0000-0001-9026-4613
  • Umapada Pal Computer Vision and Pattern Recognition, Indian Statistical Institute, India
  • Taha Mansouri School of Science, Engineering and Environment, University of Salford, UK

DOI:

https://doi.org/10.47852/bonviewAIA52023569

Keywords:

deep learning, dense network, inception architecture, attention model, scene text, script identification

Abstract

Script identification is challenging because of the unpredictable nature of scene text. This paper presents a new model for accurate script identification irrespective of intra- and inter-class variations. Distinct features that uniquely represent the scene text of different scripts are extracted by fusing an inception architecture, which captures multi-scale features, with a dense network, which captures fine-grained features. To strengthen feature extraction, the proposed work uses wavelet decomposition, which enhances fine details such as edges in the images. Furthermore, to extract text style, we propose a soft style attention module, which captures the unique style of scene text. These modules are integrated into a hybrid model for accurate script identification. To evaluate the proposed model, we conducted comprehensive experiments on the benchmark datasets CVSI2015, SIW-13, and MLe2e, and on a combined dataset formed from the distinct classes of all three benchmarks. The results on these datasets show that the proposed model achieves higher accuracy than state-of-the-art methods.
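The abstract does not specify which wavelet or how many decomposition levels the model uses. As a rough illustration of why wavelet decomposition emphasizes edges, the sketch below assumes a one-level 2D Haar transform on a grayscale image stored as a list of rows; the function name haar_dwt2 and the sub-band names are hypothetical and are not taken from the paper.

```python
def haar_dwt2(image):
    """One-level 2D Haar decomposition (illustrative sketch, not the paper's code).

    `image` is a list of equal-length rows of numbers with even height and width.
    Returns four half-resolution sub-bands: (approx, horiz, vert, diag).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")

    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            a, b = image[r][c], image[r][c + 1]
            p, q = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + p + q) / 2)  # smooth, low-pass content
            h_row.append((a + b - p - q) / 2)  # top vs bottom: horizontal edges
            v_row.append((a - b + p - q) / 2)  # left vs right: vertical edges
            d_row.append((a - b - p + q) / 2)  # diagonal detail
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag
```

In this sketch, a block with a dark left column and a bright right column yields a nonzero value only in the left-vs-right detail band, while flat regions yield zeros in all three detail bands. Detail bands of this kind carry the edge information that the abstract says is enhanced before feature extraction.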

 

Received: 6 June 2024 | Revised: 12 November 2024 | Accepted: 31 December 2024

 

Conflicts of Interest

Palaiahnakote Shivakumara is the Editor-in-Chief and Umapada Pal is an Advisory Board Member for Artificial Intelligence and Applications; neither was involved in the editorial review or the decision to publish this article. The authors declare that they have no conflicts of interest related to this work.

 

Data Availability Statement

The data that support the findings of this study are openly available in CVSI 2015 at https://www.ict.griffith.edu.au/cvsi2015/Dataset.php, in GitHub at https://github.com/lluisgomez/script_identification, and in Kaggle at https://www.kaggle.com/datasets/ayush02102001/cvsi-script-identification-dataset.

 

Author Contribution Statement

Shivakumara Palaiahnakote: Methodology, Writing – original draft. Umapada Pal: Writing – review & editing, Visualization, Supervision. Taha Mansouri: Validation, Resources.


Published

2025-01-24

Issue

Section

Online First Articles

How to Cite

Palaiahnakote, S., Pal, U., & Mansouri, T. (2025). A New Hybrid Wavelet Decomposition-based Networks for Script Identification in Scene Images. Artificial Intelligence and Applications. https://doi.org/10.47852/bonviewAIA52023569