Citation
Abstract
Inner speech recognition using electroencephalogram (EEG) signals shows strong potential for developing assistive communication technologies. Existing methods often process spatial and temporal features separately, lack interpretability, and are usually tested on a single dataset, limiting their generalization. This study proposes a dual-branch deep learning framework that combines spatial features extracted through common spatial patterns (CSPs) with spectral-temporal features derived from multitaper spectrograms, using convolutional and long short-term memory networks. The model was evaluated on two public datasets, achieving classification accuracies of 89.99% and 92.47% in subject-dependent experiments. Subject-independent evaluation using leave-one-subject-out cross-validation yielded reduced accuracies of 26.20% and 20.47%, reflecting intersubject variability. Interpretability analyses using saliency maps, gradient-weighted class activation mapping, and feature contribution ratios highlighted physiologically meaningful patterns related to model decisions. The proposed method demonstrates strong performance and interpretability for subject-dependent inner speech recognition. Future work will focus on increasing data diversity and improving subject-independent generalization. This study contributes to the development of reliable and explainable EEG-based inner speech decoding for communication applications.
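The leave-one-subject-out protocol named in the abstract can be illustrated with a minimal sketch: each subject's trials are held out in turn while the model is trained on all remaining subjects. The function name `leave_one_subject_out` and the toy subject labels below are assumptions made for illustration; this is not the authors' code and says nothing about their datasets or model.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each fold.

    subject_ids: one subject label per trial, in trial order.
    """
    # Group trial indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    # One fold per subject: that subject's trials form the test set,
    # and every other subject's trials form the training set.
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i
            for sid, idxs in by_subject.items()
            if sid != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical toy example: five trials recorded from three subjects.
subject_ids = ["s1", "s1", "s2", "s2", "s3"]
folds = list(leave_one_subject_out(subject_ids))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Because no trial from the held-out subject appears in the training set, accuracy measured this way reflects generalization to an unseen person, which is consistent with the lower subject-independent figures reported above.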
Download File
Official URL or Download Paper: https://ieeexplore.ieee.org/document/11543475/
Additional Metadata
| Item Type: | Article |
|---|---|
| Subject: | Human Factors and Ergonomics |
| Subject: | Control and Systems Engineering |
| Subject: | Signal Processing |
| Divisions: | Faculty of Computer Science and Information Technology; Faculty of Engineering; Faculty of Medicine and Health Science; Institute for Mathematical Research |
| DOI Number: | https://doi.org/10.1109/THMS.2026.3690175 |
| Publisher: | Institute of Electrical and Electronics Engineers Inc. |
| Keywords: | Brain–computer interface (BCI); Common spatial pattern (CSP); Deep learning; Electroencephalogram; Fusion technique; Inner speech; Mental speech; Spectrogram features |
| Sustainable Development Goals (SDGs): | SDG 3: Good Health and Well-being, SDG 9: Industry, Innovation and Infrastructure, SDG 10: Reduced Inequalities |
| Depositing User: | Ms. Siti Radziah Mohamed@mahmod |
| Date Deposited: | 18 Jun 2026 07:37 |
| Last Modified: | 18 Jun 2026 07:37 |
| Altmetrics: | http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1109/THMS.2026.3690175 |
| URI: | http://psasir.upm.edu.my/id/eprint/126157 |