Transformer-Based Approaches in Paraphrasing Texts
DOI:
https://doi.org/10.47852/bonviewAIA62027294
Keywords:
paraphrase generation, large language models (LLMs), T5, PEGASUS, BART
Abstract
Large language models (LLMs) such as GPT have recently shown impressive capabilities across a variety of natural language processing tasks, including text generation, classification, sentiment analysis, and question answering. However, generating high-quality paraphrases remains difficult, especially for lengthy inputs, because the output must stay coherent while preserving the meaning of the source. This study addresses the problem by proposing a robust multi-model framework for automatic paraphrasing built on three well-known and extensively used LLM architectures: T5, Pre-training with Extracted Gap-sentences for Abstractive Summarization (PEGASUS), and Bidirectional and Auto-Regressive Transformers (BART). The main goal is to evaluate how well these models produce coherent and semantically correct paraphrases at the sentence and paragraph levels without requiring text segmentation. Output quality is assessed with common evaluation metrics, such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and Bilingual Evaluation Understudy (BLEU) scores. Empirically, all three models demonstrate strong paraphrasing abilities, but T5 consistently produces the best results, especially in terms of semantic fidelity and linguistic fluency. These results highlight T5’s efficacy in complex paraphrasing tasks and provide useful guidance for future research in data augmentation, summarization, and automatic content rewriting.
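As a rough illustration of the overlap-based metrics mentioned in the abstract, the sketch below computes clipped unigram precision (the quantity underlying BLEU-1, without the brevity penalty) and unigram recall (ROUGE-1 recall) between a candidate paraphrase and a reference. It is a simplified, assumed example and not the evaluation code used in the study: the function name `unigram_overlap`, the whitespace tokenization, and the sample sentences are all hypothetical, and full BLEU and ROUGE implementations additionally handle higher-order n-grams, brevity penalties, and other variants.

```python
from collections import Counter


def unigram_overlap(candidate: str, reference: str) -> dict:
    """Clipped unigram overlap between a candidate paraphrase and a reference.

    Hypothetical helper for illustration only; tokenization is a plain
    lowercase whitespace split.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()

    # Clipped matches: a token counts at most as often as it appears
    # in both the candidate and the reference.
    overlap = sum((Counter(cand) & Counter(ref)).values())

    # Unigram precision (BLEU-1 style, no brevity penalty).
    precision = overlap / len(cand) if cand else 0.0
    # Unigram recall (ROUGE-1 recall).
    recall = overlap / len(ref) if ref else 0.0
    # Harmonic mean of the two (ROUGE-1 F-measure).
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0

    return {"bleu1_precision": precision, "rouge1_recall": recall, "rouge1_f1": f1}


if __name__ == "__main__":
    # Made-up sentence pair, not taken from the study's dataset.
    scores = unigram_overlap("the cat sat on the mat", "the cat is on the mat")
    print(scores)
```

Because these scores only measure surface word overlap, a good paraphrase that uses different wording can score low, which is one reason such metrics are usually read alongside judgments of semantic fidelity and fluency.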
Received: 20 August 2025 | Revised: 6 March 2026 | Accepted: 1 April 2026
Conflicts of Interest
The authors declare that they have no conflicts of interest to this work.
Data Availability Statement
The data that support the findings of this study are openly available in GitHub at https://github.com/cherradii/Paraphrasage/blob/main/datasets_paraphrases.csv.zip.
Author Contribution Statement
Mohamed Cherradi: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Writing – original draft, Writing – review & editing, Visualization, Supervision. Hajar El Mahajer: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization, Project administration.
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.