Exploring Digital Tourism Through Topic Models: A Review and Experimental Study
DOI:
https://doi.org/10.47852/bonviewJDSIS62024472
Keywords:
topic modeling, text-mining, digital tourism, comparative analysis
Abstract
The surge in the volume and complexity of user-generated content (UGC) and data on digital tourism platforms has raised both opportunities and challenges for its automated analysis. Advanced topic modeling techniques are now needed to handle the variety, dynamism, and multifaceted nature of this data, yet their application in digital tourism faces unique challenges. This study comprehensively reviews prominent and emerging topic models from three categories (probabilistic models, matrix factorization-based models, and neural embedding-based models), describing their system architectures and operational mechanisms. In the application context of digital tourism, the study then experimentally evaluates the models' performance on five datasets across multiple coherence and diversity metrics. The results show that no single model is universally optimal; rather, a model's effectiveness depends on the size and structural characteristics of the data, as analyzed extensively in this article. Additionally, the study presents quantitative and qualitative findings, the models' implicit shortcomings, concluding deductions, and open issues in applying topic models to digital tourism, followed by directions for future research.
Received: 1 October 2024 | Revised: 27 March 2025 | Accepted: 10 September 2025
Conflicts of Interest
The authors declare that they have no conflicts of interest to this work.
Data Availability Statement
The data used in this study include public and private datasets. The public datasets, Tourpedia and 20 Newsgroups, are available from the Tourpedia datasets page at http://tour-pedia.org/about/datasets.html and from the scikit-learn real-world datasets page at https://scikit-learn.org/stable/datasets/real_world.html, respectively. The other datasets used in this study cannot be shared publicly.
Author Contribution Statement
Maryam Kamal: Methodology, Software, Validation, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization. Gianfranco Romani: Methodology, Software, Validation, Investigation, Data curation, Writing – original draft, Visualization. Giuseppe Ricciuti: Funding acquisition. Aris Anagnostopoulos: Conceptualization, Methodology. Ioannis Chatzigiannakis: Conceptualization, Methodology, Writing – original draft, Visualization, Supervision, Project administration.
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.