ResProtoNet: A Skeleton-Aware Few-Shot Framework for Yoga Pose Classification

Authors

  • Chean Khim Toa, School of Computing and Data Science, Xiamen University Malaysia, Malaysia (ORCID: https://orcid.org/0000-0003-0879-4848)
  • Kai Liang Lew, Faculty of Engineering & Technology, Multimedia University, Malaysia (ORCID: https://orcid.org/0000-0002-0376-2970)
  • Sin Pei Ton, School of Computing and Data Science, Xiamen University Malaysia, Malaysia

DOI:

https://doi.org/10.47852/bonviewAIA62027514

Keywords:

few-shot learning, human pose estimation, RGB input, skeleton-based input, yoga pose classification

Abstract

Deep learning models for yoga pose classification traditionally require large, diverse, and carefully annotated datasets, which are costly and time-consuming to build. Although open-source yoga datasets are available, collecting and annotating new ones, especially for complex poses, remains a significant challenge. This limitation motivates the use of few-shot learning (FSL) to enable efficient pose classification under limited-data conditions. This study proposes ResProtoNet, which combines a ResNet-18 feature extractor with a Prototypical Network classifier, and evaluates it against a supervised baseline using RGB and skeleton-based images. The skeleton-based images contain joint-based pose encodings extracted with the MediaPipe model. Experiments on five fundamental yoga poses under 1-shot, 3-shot, and 5-shot configurations yielded several findings. First, ResProtoNet consistently outperformed the ResNet-18 baseline, achieving 98.2% accuracy in the 3-shot setting compared with 97.6%. Second, ResNet-18 consistently delivered the strongest baseline performance across both modalities, underscoring its robustness and justifying its role as the backbone. A key contribution is a comprehensive stress test comparing the RGB and skeleton modalities within an FSL context. The robustness analysis showed that skeleton-based input preserved up to 20% higher accuracy under heavy occlusion and reduced accuracy loss by 5.7% under resolution degradation, confirming its resilience when visual information degrades. In contrast, RGB input retained its advantages under light occlusion and at higher resolutions, where texture and background remained informative. Overall, applying FSL to a supervised baseline enables reliable pose classification under limited supervision and substantially reduces dependence on large-scale annotated datasets. ResProtoNet shows strong potential for real-world healthcare and exercise monitoring systems, where annotated resources and data quality are constrained.
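
As an illustration of the Prototypical Network step named in the abstract, the following minimal sketch classifies a query embedding by its nearest class prototype, where each prototype is the mean of that class's support embeddings. It is not the authors' implementation. The function names, the pose labels, and the 2-D vectors are hypothetical placeholders; in ResProtoNet the embeddings would come from the ResNet-18 feature extractor, and the squared Euclidean distance used here is the choice from the standard Prototypical Network formulation, assumed rather than confirmed by this page.

```python
def compute_prototypes(support):
    """Average the support embeddings of each class into one prototype.

    support: dict mapping a class label to a list of embedding vectors
             (each vector is a list of floats of equal length).
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is nearest to the query
    (squared Euclidean distance)."""
    def sq_dist(label):
        return sum((q - p) ** 2 for q, p in zip(query, prototypes[label]))
    return min(prototypes, key=sq_dist)


# Toy 2-way 3-shot episode. Labels and embeddings are made up for
# illustration; real embeddings would be high-dimensional backbone features.
support = {
    "tree": [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1]],
    "plank": [[0.0, 1.0], [0.1, 0.9], [-0.1, 1.1]],
}
prototypes = compute_prototypes(support)
prediction = classify([0.8, 0.2], prototypes)
print(prediction)
```

With K support images per class, the 1-shot, 3-shot, and 5-shot settings differ only in how many vectors are averaged into each prototype; in the 1-shot case the prototype is the single support embedding itself.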

 

Received: 31 August 2025 | Revised: 8 January 2026 | Accepted: 22 April 2026

 

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

 

Data Availability Statement

The data that support the findings of this study are openly available in Yoga Poses Dataset [Kaggle] at https://www.kaggle.com/datasets/niharika41298/yoga-poses-dataset, in EasyFSL (PyTorch library for few-shot learning) [GitHub] at https://github.com/sicara/easy-few-shot-learning, in Yoga-82 Dataset [Kaggle] at https://www.kaggle.com/datasets/akashrayhan/yoga-82, and in Blazepose_skeletons_Yoga_82 Dataset [Kaggle] at https://www.kaggle.com/datasets/rashiniyasp/blazepose-skeletons-yoga-82.

 

Author Contribution Statement

Chean Khim Toa: Conceptualization, Methodology, Validation, Resources, Writing – original draft, Writing – review & editing, Visualization, Supervision, Project administration, Funding acquisition. Kai Liang Lew: Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – review & editing, Supervision, Project administration. Sin Pei Ton: Conceptualization, Formal analysis, Resources.


Published

2026-05-08

Section

Research Article

How to Cite

Toa, C. K., Lew, K. L., & Ton, S. P. (2026). ResProtoNet: A Skeleton-Aware Few-Shot Framework for Yoga Pose Classification. Artificial Intelligence and Applications. https://doi.org/10.47852/bonviewAIA62027514
