A Comparative Study of Hybrid Text Summarization Techniques Using Traditional, Static, and Contextual Word Embeddings

Lilakant Pokhrel; Sangita Pokhrel; Swathi Ganesan; Nalinda Somasiri

doi:10.47852/bonviewAIA62026330

Authors

Lilakant Pokhrel, Department of Computer Science and Data Science, York St John University, UK
Sangita Pokhrel, Department of Computer Science and Data Science, York St John University, UK, https://orcid.org/0009-0008-2092-7029
Swathi Ganesan, Department of Computer Science and Data Science, York St John University, UK
Nalinda Somasiri, Department of Computer Science and Data Science, York St John University, UK

DOI:

https://doi.org/10.47852/bonviewAIA62026330

Keywords:

text summarization, hybrid summarization, contextual embeddings, word embeddings, natural language processing

Abstract

Textual data is growing exponentially with the rapid expansion of internet technology and of social and entertainment platforms. This growth complicates document and text management, text classification, and information retrieval, among other areas. Summarization research, which aims to condense large volumes of text into natural-sounding language while retaining citations, quotes, and references, is therefore gaining significant international attention. This study focuses on developing an effective hybrid (extractive and abstractive) summarization system by implementing different word embedding and summarization techniques. It examines embedding methods ranging from traditional approaches (Count Vector, TF-IDF) to static word embeddings (Word2Vec, GloVe, FastText) and contextualized embeddings (ELMo, GPT, BERT). It employs extractive methods (TF-IDF, TextRank, LSA, Word2Vec, ELMo with K-means) and abstractive methods (Bi-LSTM, GPT) to build the system. With eight summarization techniques integrated into Django-based (Python) web applications, a comparison of ROUGE scores shows that GPT performs particularly well, followed by the TextRank and ELMo-based systems. The study also discusses the challenges faced in developing such a system and outlines directions for future research.
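To make two of the ideas named in the abstract concrete, the following is a minimal sketch of TF-IDF sentence scoring for extractive summarization, together with a unigram-overlap (ROUGE-1) F1 score for comparing a summary against a reference. It is an illustration under stated assumptions, not the authors' implementation: the function names, the regex-based sentence splitter and tokenizer, the smoothed IDF formula, and the default of two selected sentences are all choices made here for brevity.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter on ., !, ? followed by whitespace (assumption;
    # a production system would use a proper sentence tokenizer).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercased alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_extractive_summary(text, k=2):
    """Return the k highest-scoring sentences, kept in original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Length-normalized term frequency times a smoothed IDF (assumed form).
        scores.append(sum(
            (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        ))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

A typical use would be to call `tfidf_extractive_summary(article, k=3)`, join the returned sentences, and pass the result with a human-written summary to `rouge1_f1`. Published ROUGE figures are normally computed with an established ROUGE package, so scores from this sketch are only indicative.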

Received: 1 June 2025 | Revised: 12 January 2026 | Accepted: 7 May 2026

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

Data Availability Statement

The data that support the findings of this study are openly available in Kaggle at https://www.kaggle.com/datasets/gowrishankarp/newspaper-text-summarization-cnn-dailymail/data, reference number [56].

Author Contribution Statement

Lilakant Pokhrel: Conceptualization, Methodology, Software, Formal analysis, Investigation, Writing – original draft, Visualization. Sangita Pokhrel: Conceptualization, Methodology, Resources, Writing – review & editing, Supervision, Project administration. Swathi Ganesan: Validation, Resources, Supervision. Nalinda Somasiri: Validation, Resources, Data curation, Supervision.
