A Multi-Part Attention-Guided Spatial-Temporal GCN Framework for Gait-Based Person Recognition

Md. Khaliluzzaman; Kaushik Deb

doi:10.47852/bonviewAIA62028402

Authors

Md. Khaliluzzaman, Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, and Department of Computer Science and Engineering, International Islamic University Chittagong, Bangladesh (ORCID: https://orcid.org/0000-0001-6846-1610)
Kaushik Deb, Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, Bangladesh

Keywords:

gait recognition, Spatial-Temporal Graph Convolutional Networks (ST-GCNs), multi-part attention-guided, CASIA-B, OUMVLP-Pose

Abstract

Gait recognition has emerged as an important biometric modality because it is nonintrusive and straightforward to deploy, enabling identification without physical contact. In contrast to systems that rely on silhouettes and other visual attributes, skeleton-based approaches extract gait data independently of appearance cues. However, most skeleton-based approaches use manually extracted features and adjacency matrices based on the physical connectivity of joints. This dependence makes it difficult to learn semantically rich representations of joint interactions and fundamental motion patterns, which are crucial for real-world gait understanding. To address these issues, this paper introduces MAST-GCN, a skeleton-based multi-part attention-guided spatial-temporal graph convolutional network (ST-GCN) approach to gait recognition, which enhances the modeling of spatial and temporal dependencies in skeletal data through a multi-part attention mechanism. Unlike conventional ST-GCNs, which rely on rigid graph structures and struggle to capture the long-range interactions essential for identifying subtle gait differences, our method divides the skeleton into distinct anatomical regions and applies a part-wise attention module. By integrating attention-weighted features through a hierarchical fusion process, the model captures both detailed and broad gait patterns across multiple temporal scales. The framework was evaluated on the CASIA-B and OUMVLP-Pose benchmark datasets. On CASIA-B, it achieves rank-1 accuracies of 95.8%, 91.8%, and 88.5% under the normal walking (NM), carrying bag (BG), and wearing coat (CL) conditions, respectively; on OUMVLP-Pose, it achieves 93.0%. The approach outperforms state-of-the-art methods, highlighting the benefits of part-based, attention-driven feature extraction for robust, precise gait recognition.
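
The part-wise attention idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' MAST-GCN implementation: the 17-joint COCO-style layout, the five-part grouping in `PARTS`, the mean pooling, and the single scoring vector `score_vector` are all assumptions chosen for illustration, and the graph convolutions and hierarchical fusion are omitted.

```python
import numpy as np

# Hypothetical grouping of a 17-joint COCO-style skeleton into body parts.
# The paper's actual anatomical partition may differ.
PARTS = {
    "head": [0, 1, 2, 3, 4],
    "left_arm": [5, 7, 9],
    "right_arm": [6, 8, 10],
    "left_leg": [11, 13, 15],
    "right_leg": [12, 14, 16],
}


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def part_attention_fuse(features, score_vector):
    """Fuse per-part descriptors with softmax attention weights.

    features: array of shape (T, V, C) = (frames, joints, channels).
    score_vector: array of shape (C,), an illustrative stand-in for a
        learned attention parameter.
    Returns the fused descriptor (C,) and the part weights (P,).
    """
    # Average each part's joint features over time and over its joints.
    part_descriptors = np.stack(
        [features[:, idx, :].mean(axis=(0, 1)) for idx in PARTS.values()]
    )  # shape (P, C)
    # One scalar score per part, normalized so the weights sum to 1.
    weights = softmax(part_descriptors @ score_vector)  # shape (P,)
    # Attention-weighted sum of the part descriptors.
    fused = weights @ part_descriptors  # shape (C,)
    return fused, weights


rng = np.random.default_rng(0)
features = rng.normal(size=(30, 17, 8))  # 30 frames, 17 joints, 8 channels
score_vector = rng.normal(size=8)
fused, weights = part_attention_fuse(features, score_vector)
```

In a trained model, the scoring parameters would be learned jointly with the network, so that parts carrying more identity-relevant motion receive larger weights.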

Received: 25 November 2025 | Revised: 12 March 2026 | Accepted: 20 May 2026

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

Data Availability Statement

The data that support the findings of this study are openly available in CASIA-B at http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp and in OU-MVLP-Pose at http://www.am.sanken.osaka-u.ac.jp/BiometricDB/GaitMVLP.html.

Author Contribution Statement

Md. Khaliluzzaman: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft, Writing – review & editing. Kaushik Deb: Validation, Resources, Writing – review & editing, Supervision, Project administration.
