A Comparative Study of Hybrid Text Summarization Techniques Using Traditional, Static, and Contextual Word Embeddings
DOI:
https://doi.org/10.47852/bonviewAIA62026330
Keywords:
text summarization, hybrid summarization, contextual embeddings, word embeddings, natural language processing
Abstract
Textual data is growing exponentially, driven by the rapid expansion of internet technology and of social and entertainment platforms. This growth has widespread negative effects on document and text management, text classification, and information retrieval, among other areas. Summarization research, which aims to condense large volumes of text into natural-sounding language that retains citations, quotes, and references, is therefore gaining significant international attention. This study focuses on developing an effective hybrid (extractive and abstractive) summarization system by implementing different word embedding and summarization techniques. It examines embedding methods ranging from traditional approaches (Count Vector, TF-IDF) to static word embeddings (Word2Vec, GloVe, FastText) and contextualized embeddings (ELMo, GPT, BERT). It employs efficient extractive methods (TF-IDF, TextRank, LSA, Word2Vec, ELMo with K-means) and abstractive methods (Bi-LSTM, GPT) to build the system. After integrating eight summarization techniques into Django-based (Python) web applications and comparing their ROUGE scores, the research finds that GPT performs particularly well, followed by the TextRank and ELMo-based systems. The study also discusses the challenges faced in developing such a system and outlines directions for future research.
Received: 1 June 2025 | Revised: 12 January 2026 | Accepted: 7 May 2026
Conflicts of Interest
The authors declare that they have no conflicts of interest to this work.
Data Availability Statement
The data that support the findings of this study are openly available in Kaggle at https://www.kaggle.com/datasets/gowrishankarp/newspaper-text-summarization-cnn-dailymail/data, reference number [56].
Author Contribution Statement
Lilakant Pokhrel: Conceptualization, Methodology, Software, Formal analysis, Investigation, Writing – original draft, Visualization. Sangita Pokhrel: Conceptualization, Methodology, Resources, Writing – review & editing, Supervision, Project administration. Swathi Ganesan: Validation, Resources, Supervision. Nalinda Somasiri: Validation, Resources, Data curation, Supervision.
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.