Enhanced Multimodal Webpage Classification Using Deep Learning for Efficient Information Retrieval
DOI:
https://doi.org/10.47852/bonviewJCCE52025256
Keywords:
classifying web content, deep learning, efficient information retrieval, extracting key frames, Logistic Sigmoid Long Short-Term Memory model
Abstract
Web data mining has become a crucial tool for retrieving valuable information efficiently, as users increasingly rely on the World Wide Web for data exchange. Traditional web classification methods often struggle with multimodal data, which makes it difficult to classify diverse web content accurately. Online classification plays a key role in the efficient retrieval of information from multimedia content. This study presents a novel multimodal approach to webpage classification that integrates deep learning techniques for audio-visual analysis. The Logistic Sigmoid Long Short-Term Memory ((LS)²TM) model, a customized variant of Long Short-Term Memory (LSTM), improves classification accuracy by combining deep audio and video features. Artificial Convolutional Neural Networks (A-CNNs) extract complex audio features, while transformer networks capture long-range dependencies in video data. The study proposes a log-sigmoid activation function that provides a more flexible thresholding method in logistic regression, thereby improving classification performance. In contrast to approaches that focus on single-modality classification, this study presents a method built on deep learning-based multimodal fusion, setting a new standard for web classification. Experimental results show that the (LS)²TM model achieves an accuracy of 88.09%, a sensitivity of 89.14%, and a specificity of 89.01%, outperforming state-of-the-art techniques such as LSTM, Deep Belief Network (DBN), and A-CNN. The model also improves precision (93.15%), recall (92.84%), and F-measure (93.25%), which are generally 5% higher than those of classical control methods. These findings highlight the potential of (LS)²TM for improving web content mining through multimodal analysis. Future research should focus on real-world validation and scalability in dynamic web environments.
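The log-sigmoid function mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the `threshold` parameter are hypothetical, and the abstract does not specify how the function is wired into the (LS)²TM model. The sketch only shows the standard definition log σ(x) = −log(1 + e^(−x)) in a numerically stable form, together with an adjustable decision threshold.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def log_sigmoid(x: float) -> float:
    """log(sigmoid(x)) computed as min(x, 0) - log1p(exp(-|x|))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def predict(score: float, threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: positive when sigmoid(score) >= threshold.

    The comparison is done in log space, so the threshold (an assumed,
    tunable probability in (0, 1)) can be moved without ever computing
    a probability that underflows.
    """
    return log_sigmoid(score) >= math.log(threshold)
```

Raising `threshold` trades sensitivity for specificity, which is one plausible reading of the "more flexible thresholding" the abstract describes.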
Received: 19 January 2025 | Revised: 18 March 2025 | Accepted: 7 May 2025
Conflicts of Interest
The authors declare that they have no conflicts of interest to this work.
Data Availability Statement
The data that support the findings of this study are openly available at https://www.kaggle.com/datasets/shaurov/website-classification-using-url and at https://doi.org/10.1109/78.650093, reference number [12].
Author Contribution Statement
Manjunath Pujar: Methodology, Software, Validation, Resources, Writing – original draft. Monica Mundada: Conceptualization, Formal analysis, Writing – review & editing, Project administration. Sowmya B. J.: Methodology, Software, Validation, Investigation, Data curation. Supreeth Shivashankar: Writing – review & editing, Supervision, Project administration. Ganesh Dalappagari Ramanjinappa: Data curation, Visualization. Shambulingana Gouda: Visualization, Supervision.
License
Copyright (c) 2025 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.