UPM Institutional Repository

Motionscope AI: comprehensive human activity recognition through integrated pose analysis and temporal modeling


Citation

Vijay, Harshil and Mathur, Shikar and Agarwal, Shashwat and Perumal, Thinagaran and Sharma, Abhishek (2026) Motionscope AI: comprehensive human activity recognition through integrated pose analysis and temporal modeling. IEEE Sensors Journal, 26 (7). pp. 10872-10882. ISSN 1530-437X; eISSN: 1558-1748

Abstract

This article introduces a practical deep learning framework for recognizing human activities in indoor environments, combining MediaPipe pose estimation with a multibranch bidirectional LSTM (BiLSTM) architecture. We extract a comprehensive feature set (3-D pose landmarks, hand gestures, velocity, acceleration, and joint angles), yielding a 685-D vector per frame. Our multibranch design processes each feature type through a specialized BiLSTM pathway with an attention mechanism, enabling the model to learn distinct spatial, temporal, and structural patterns. To ensure robust performance in real-world scenarios, we incorporate label smoothing, gradient clipping, and adaptive learning strategies. Evaluated on the IndoorActionDataset with eight activity classes, our approach achieves 95.6% test accuracy, outperforming OpenPose-based methods (87.2%) and 3D-CNN with Transformer architectures (75.1%). With an inference latency of only 23 ms, the system demonstrates practical viability for real-time deployment on resource-constrained devices. The results confirm that thoughtful feature engineering combined with attention-driven temporal modeling can deliver both accuracy and efficiency for activity recognition tasks.
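The abstract names velocity, acceleration, and joint angles as features derived from 3-D pose landmarks. The following is a minimal sketch of how such features could be computed with NumPy, not the authors' implementation. The function names, the frame rate, the finite-difference scheme, and the choice of joint triplets are all assumptions made for illustration, and the sketch does not reproduce the paper's 685-D feature layout.

```python
import numpy as np


def temporal_derivatives(landmarks, fps=30.0):
    """Estimate per-joint velocity and acceleration by finite differences.

    landmarks: array of shape (T, J, 3) holding T frames of J 3-D landmarks.
    fps is an assumed frame rate; the paper's value is not given in the abstract.
    """
    dt = 1.0 / fps
    velocity = np.gradient(landmarks, dt, axis=0)      # first time derivative
    acceleration = np.gradient(velocity, dt, axis=0)   # second time derivative
    return velocity, acceleration


def joint_angle(a, b, c, eps=1e-8):
    """Angle in radians at point b, between the segments b->a and b->c.

    a, b, c: arrays of shape (..., 3).
    """
    u = a - b
    v = c - b
    norms = np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1)
    cosine = np.sum(u * v, axis=-1) / (norms + eps)    # eps guards zero-length segments
    return np.arccos(np.clip(cosine, -1.0, 1.0))


def frame_features(landmarks, angle_triplets, fps=30.0):
    """Concatenate positions, velocities, accelerations, and joint angles per frame.

    angle_triplets: hypothetical list of (i, j, k) landmark indices; the angle
    is measured at landmark j.
    """
    num_frames = landmarks.shape[0]
    velocity, acceleration = temporal_derivatives(landmarks, fps)
    angles = np.stack(
        [joint_angle(landmarks[:, i], landmarks[:, j], landmarks[:, k])
         for i, j, k in angle_triplets],
        axis=1,
    )
    return np.concatenate(
        [
            landmarks.reshape(num_frames, -1),
            velocity.reshape(num_frames, -1),
            acceleration.reshape(num_frames, -1),
            angles,
        ],
        axis=1,
    )
```

With the 33 landmarks that MediaPipe Pose produces and two illustrative elbow triplets such as (11, 13, 15) and (12, 14, 16), `frame_features` returns 33 × 3 × 3 + 2 = 299 values per frame. In a multibranch model of the kind the abstract describes, each of these feature groups would be routed to its own branch rather than concatenated.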


Download File

124766.pdf - Published Version
Restricted to Repository staff only

Official URL or Download Paper: https://ieeexplore.ieee.org/document/11391497/

Additional Metadata

Item Type: Article
Subject: Instrumentation
Subject: Electrical and Electronic Engineering
Divisions: Faculty of Computer Science and Information Technology
DOI Number: https://doi.org/10.1109/JSEN.2026.3658484
Publisher: Institute of Electrical and Electronics Engineers
Keywords: Artificial intelligence; Attention mechanism; Bidirectional LSTM (BiLSTM); Computer vision; Indoor activity classification; MediaPipe; Multibranch neural architecture; Real-time activity recognition; Temporal feature extraction
Sustainable Development Goals (SDGs): SDG 9: Industry, Innovation and Infrastructure, SDG 11: Sustainable Cities and Communities, SDG 3: Good Health and Well-being
Depositing User: Ms. Siti Radziah Mohamed@mahmod
Date Deposited: 18 Jun 2026 04:27
Last Modified: 18 Jun 2026 04:27
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1109/JSEN.2026.3658484
URI: http://psasir.upm.edu.my/id/eprint/124766