Enhanced Multimodal Webpage Classification Using Deep Learning for Efficient Information Retrieval

Manjunath Pujar; Monica Mundada; Sowmya B. J.; Supreeth Shivashankar; Ganesh Dalappagari Ramanjinappa; Shambulingana Gouda

doi:10.47852/bonviewJCCE52025256

Authors

Manjunath Pujar Department of Computer Science and Engineering, M.S. Ramaiah Institute of Technology, India https://orcid.org/0000-0003-1437-7215
Monica Mundada Department of Computer Science and Engineering, M.S. Ramaiah Institute of Technology, India https://orcid.org/0000-0001-7262-8997
Sowmya B. J. Department of Artificial Intelligence and Data Science, M.S. Ramaiah Institute of Technology, India https://orcid.org/0000-0003-2080-8325
Supreeth Shivashankar School of Computer Science and Engineering, REVA University, India https://orcid.org/0000-0002-7097-6733
Ganesh Dalappagari Ramanjinappa School of Computing and Information Technology, REVA University, India https://orcid.org/0000-0003-2627-4918
Shambulingana Gouda Electrical and Electronics Engineering, Rao Bahadaur Y. Mahabaleshwarappa Engineering College, India https://orcid.org/0009-0006-6287-1532

Keywords:

classifying web content, deep learning, efficient information retrieval, extracting key frames, Logistic Sigmoid Long Short-Term Memory model

Abstract

Web data mining has become a crucial tool for efficiently retrieving valuable information, as users increasingly rely on the World Wide Web for data exchange. Traditional web classification methods often struggle with multimodal data, which makes it difficult to classify diverse web content accurately. Online classification plays a key role in facilitating efficient retrieval of information from multimedia content. This study presents a novel multimodal approach to webpage classification that integrates deep learning techniques for audio-visual analysis. The proposed Logistic Sigmoid Long Short-Term Memory ((LS)²TM) model, a customized variant of Long Short-Term Memory (LSTM), improves classification accuracy by combining deep audio and video features. Artificial Convolutional Neural Networks (A-CNNs) extract complex audio features, while transformer networks capture long-range dependencies in video data. The study proposes a log-sigmoid activation function that provides a more flexible thresholding method in logistic regression, which improves classification performance. Unlike prior work that focuses on single-modality classification, this study integrates deep learning-based multimodal fusion for web classification. Experimental results show that the (LS)²TM model achieves an accuracy of 88.09%, sensitivity of 89.14%, and specificity of 89.01%, outperforming state-of-the-art techniques such as LSTM, Deep Belief Network (DBN), and A-CNN. The model also attains a precision of 93.15%, recall of 92.84%, and F-measure of 93.25%, which are generally 5% higher than those of conventional baseline methods. These findings highlight the potential of (LS)²TM for improving web content mining through multimodal analysis. Future research should focus on real-world validation and scalability in dynamic web environments.
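The evaluation measures named in the abstract can be illustrated with a minimal sketch that computes them from binary confusion-matrix counts. The class `ConfusionCounts`, the function `classification_metrics`, and the example counts are hypothetical and are not taken from the study; the abstract does not state how the reported values were computed or averaged across classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical values, not from the study)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard measures from one set of confusion counts."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)  # identical to sensitivity here
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": recall,
        "specificity": safe_div(c.tn, c.tn + c.fp),
        "precision": precision,
        "recall": recall,
        # F-measure as the harmonic mean of precision and recall (F1).
        "f_measure": safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Illustrative counts only; they do not reproduce the reported results.
    counts = ConfusionCounts(tp=90, fp=10, tn=85, fn=15)
    for name, value in classification_metrics(counts).items():
        print(f"{name}: {value:.4f}")
```

In this binary formulation, sensitivity and recall are the same quantity; per-class averaging in a multi-class setting is one possible reason reported values for the two can differ.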

Received: 19 January 2025 | Revised: 18 March 2025 | Accepted: 7 May 2025

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

Data Availability Statement

The data that support the findings of this study are openly available at https://www.kaggle.com/datasets/shaurov/website-classification-using-url and at https://doi.org/10.1109/78.650093, reference number [12].

Author Contribution Statement

Manjunath Pujar: Methodology, Software, Validation, Resources, Writing – original draft. Monica Mundada: Conceptualization, Formal analysis, Writing – review & editing, Project administration. Sowmya B. J.: Methodology, Software, Validation, Investigation, Data curation. Supreeth Shivashankar: Writing – review & editing, Supervision, Project administration. Ganesh Dalappagari Ramanjinappa: Data curation, Visualization. Shambulingana Gouda: Visualization, Supervision.