Metadata-Enhanced Hybrid Fusion Architecture: Commercial Fake Reviews Detection Model Using Transformer Embeddings
DOI:
https://doi.org/10.47852//bonviewFSI62028859
Keywords:
hybrid fusion model, feature-driven intelligent system, commercial fake reviews detection, DistilBERT embeddings, sentiment mismatch
Abstract
Commercial fake reviews have become a significant problem for online businesses and e-commerce platforms because they influence customer decisions and misrepresent product quality. To improve the identification of misleading commercial reviews, this study investigates a hybrid approach that combines machine learning and deep learning algorithms. Two distinct datasets are used: Amazon fake reviews and Yelp reviews. The preprocessing pipeline comprises text cleaning, information extraction, sentiment analysis, and a dedicated sentiment-rating mismatch feature. Traditional machine learning models (Random Forest, naive Bayes, logistic regression, and support vector machine) are trained on term frequency–inverse document frequency (TF-IDF) features, while DistilBERT is used to extract deeper contextual meaning from the text. In addition, a hybrid fusion model is developed by integrating DistilBERT embeddings with metadata variables such as sentiment, rating, and text length. The results show that DistilBERT outperforms the conventional models, achieving 93% accuracy on the Amazon dataset and 91% on the Yelp dataset. The sentiment-rating mismatch feature also contributes considerably by capturing behavioral anomalies commonly found in fraudulent reviews. Overall, the findings indicate that combining semantic understanding with behavioral indicators yields a more accurate and trustworthy architecture for detecting fake commercial reviews in real-time online business contexts.
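The abstract does not give a formula for the sentiment-rating mismatch feature or specify how the metadata are joined to the embeddings, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' implementation. It assumes a sentiment score in [-1, 1], a star rating from 1 to 5, a mismatch defined as the absolute gap between the two once the rating is rescaled, text length measured in words, and fusion by simple concatenation. All function names (`rating_to_polarity`, `sentiment_rating_mismatch`, `fuse_features`) are hypothetical.

```python
from typing import List, Sequence


def rating_to_polarity(rating: float, min_rating: float = 1.0,
                       max_rating: float = 5.0) -> float:
    """Rescale a star rating onto [-1, 1] so it is comparable to a sentiment score.

    Assumption: ratings lie on a 1-5 scale; this is not stated in the abstract.
    """
    midpoint = (max_rating + min_rating) / 2.0
    half_range = (max_rating - min_rating) / 2.0
    return (rating - midpoint) / half_range


def sentiment_rating_mismatch(sentiment: float, rating: float) -> float:
    """Absolute gap between text sentiment and rating polarity (range 0 to 2).

    Assumption: an absolute difference is one plausible mismatch definition.
    """
    return abs(sentiment - rating_to_polarity(rating))


def fuse_features(embedding: Sequence[float], sentiment: float,
                  rating: float, text: str) -> List[float]:
    """Concatenate a text embedding with metadata features.

    Assumption: fusion by concatenation, with text length counted in words.
    The embedding is passed in as a plain sequence of floats; in practice it
    would come from a model such as DistilBERT.
    """
    metadata = [
        sentiment,
        rating,
        float(len(text.split())),
        sentiment_rating_mismatch(sentiment, rating),
    ]
    return list(embedding) + metadata
```

Under these assumptions, a five-star review with strongly negative text receives the maximum mismatch, while a five-star review with strongly positive text receives a mismatch of zero. The fused vector could then be passed to any downstream classifier.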
Received: 22 December 2025 | Revised: 9 February 2026 | Accepted: 21 April 2026
Conflicts of Interest
The authors declare that they have no conflicts of interest to this work.
Data Availability Statement
The data that support the findings of this study are openly available in Kaggle at https://www.kaggle.com/datasets/mexwell/fake-reviews-dataset and https://www.kaggle.com/datasets/abidmeeraj/yelp-labelled-dataset.
Author Contribution Statement
Hisham AbouGrad: Conceptualization, Validation, Resources, Writing – review & editing, Visualization, Supervision, Project administration. Fiza Riaz: Conceptualization, Validation, Software, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Visualization.
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.