Transformer-Based Approaches in Paraphrasing Texts

Mohamed Cherradi; Hajar El Mahajer

doi:10.47852/bonviewAIA62027294

Authors

Mohamed Cherradi, Computer Science Department, Abdelmalek Essaâdi University (UAE), Morocco
Hajar El Mahajer, Computer Science Department, Abdelmalek Essaâdi University (UAE), Morocco

DOI:

https://doi.org/10.47852/bonviewAIA62027294

Keywords:

paraphrase generation, large language models (LLMs), T5, PEGASUS, BART

Abstract

Large language models (LLMs) such as GPT have recently shown impressive capabilities across natural language processing tasks, including text generation, classification, sentiment analysis, and question answering. However, generating high-quality paraphrases remains difficult, especially for lengthy inputs, because the output must stay coherent while preserving the meaning of the source. This study addresses the problem with a robust multi-model framework for automatic paraphrasing built on three well-known and widely used LLM architectures: T5, PEGASUS (Pre-training with Extracted Gap-sentences for Abstractive Summarization), and BART (Bidirectional and Auto-Regressive Transformers). The main goal is to evaluate how well these models produce coherent and semantically correct paraphrases at the sentence and paragraph levels without requiring text segmentation. Output quality is assessed with common evaluation metrics, such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and Bilingual Evaluation Understudy (BLEU) scores. Empirically, all three models demonstrate strong paraphrasing abilities, but T5 consistently produces the best results, especially in semantic fidelity and linguistic fluency. These results highlight T5’s efficacy in complex paraphrasing tasks and inform future research in data augmentation, summarization, and automatic content rewriting.
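
As an illustration of the overlap idea behind the two metrics named in the abstract, the sketch below computes a clipped n-gram precision (the building block of BLEU) and an n-gram recall (the building block of ROUGE-N) for one reference and one candidate paraphrase. It is a minimal sketch under stated assumptions, not the evaluation code used in the study: the function names `ngrams` and `overlap_scores`, the example sentences, and the lowercase whitespace tokenization are all hypothetical choices made for illustration. Full BLEU additionally combines several n-gram orders and applies a brevity penalty, and ROUGE has further variants such as ROUGE-L, none of which are reproduced here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_scores(reference, candidate, n=1):
    """Clipped n-gram overlap between a reference and a candidate paraphrase.

    Returns (precision, recall): precision is the BLEU-style modified n-gram
    precision, recall is the ROUGE-N-style recall. Tokenization is a plain
    lowercase whitespace split, which is an assumption of this sketch.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    matched = sum((ref & cand).values())
    precision = matched / max(sum(cand.values()), 1)
    recall = matched / max(sum(ref.values()), 1)
    return precision, recall


# Hypothetical example pair, not taken from the study's dataset.
reference = "the cat sat on the mat"
candidate = "the cat is on the mat"
precision, recall = overlap_scores(reference, candidate)
print(f"unigram precision={precision:.3f}, recall={recall:.3f}")
```

For this pair, five of the six unigrams match, so both scores are about 0.833. A paraphrase that rewords heavily while keeping the meaning will score lower on such surface-overlap measures, which is why they are usually read alongside judgments of semantic fidelity and fluency.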

Received: 20 August 2025 | Revised: 6 March 2026 | Accepted: 1 April 2026

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

Data Availability Statement

The data that support the findings of this study are openly available on GitHub at https://github.com/cherradii/Paraphrasage/blob/main/datasets_paraphrases.csv.zip.

Author Contribution Statement

Mohamed Cherradi: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Writing – original draft, Writing – review & editing, Visualization, Supervision. Hajar El Mahajer: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization, Project administration.
