UPM Institutional Repository

TB-FusionNet: a multi-scale feature fusion algorithm with spatial and channel cross-attention for tuberculosis detection


Citation

Ding, Zeyu and Yaakob, Razali and Tieng Wei, Koh and Azman, Azreen and Mohd Rum, Siti Nurulain and Zakaria, Nor Fadhlina and Ahmad Nazri, Azree Shahril (2026) TB-FusionNet: a multi-scale feature fusion algorithm with spatial and channel cross-attention for tuberculosis detection. IEEE Access, 14. pp. 4675-4687. ISSN 2169-3536

Abstract

Tuberculosis (TB) lesions in X-ray images are highly complex, varying widely in size, shape, and structure. Single-scale features cannot fully represent this diversity and complexity, which limits the effectiveness of TB detection. As a result, multi-scale feature fusion has become a widely explored approach in TB detection. However, current multi-scale feature fusion methods still have two limitations. First, weight allocation in existing methods typically remains at the feature level without extending to local features and channels. The model therefore cannot precisely control the significance of individual local features and channels, which results in coarse feature representations. Second, current methods neglect the contextual information between features at different levels, which undermines the consistency of the fused features and leads to a lack of semantic coherence. To address these issues, this study proposes TB-FusionNet, a multi-scale feature fusion algorithm based on channel and spatial cross-attention mechanisms, for tuberculosis classification. The algorithm first calculates the similarity between local features at different levels to select the low-level detail features that are closely related to high-level semantic features, generating a more hierarchical feature representation. Next, the algorithm computes the dependencies between feature channels at different levels, so that low-level channel features are enhanced or suppressed under the guidance of high-level channels, improving the semantic consistency between cross-level features. Through these operations, the model can dynamically adjust the weight distribution of local features and channels according to task requirements, and so adapt more flexibly to complex tasks. Experiments were conducted on the Shenzhen, Montgomery and HSAAS datasets. The results demonstrate that the proposed method outperforms current state-of-the-art approaches, validating its effectiveness and robustness.
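
The two attention steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not give the architecture, so the function names, the absence of learned projections, the scaled dot-product similarity, the 2*sigmoid channel gate, and the additive fusion are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_cross_attention(high, low):
    """Hypothetical spatial step.

    high: (Nh, C) flattened high-level local features, used as queries.
    low:  (Nl, C) flattened low-level local features, used as keys and values.
    Returns the low-level details selected for each high-level position,
    shape (Nh, C), and the attention weights, shape (Nh, Nl).
    """
    scale = np.sqrt(high.shape[1])
    weights = softmax(high @ low.T / scale, axis=-1)  # similarity between levels
    return weights @ low, weights

def channel_cross_attention(high, low):
    """Hypothetical channel step.

    high: (N, Ch) and low: (N, Cl), sharing the same N positions.
    Builds a (Ch, Cl) channel affinity and turns it into one gate per
    low-level channel. The gate lies in (0, 2): values above 1 enhance
    a channel, values below 1 suppress it (an assumed design choice).
    """
    affinity = high.T @ low / high.shape[0]           # (Ch, Cl) dependencies
    gate = 2.0 / (1.0 + np.exp(-affinity.mean(axis=0)))  # (Cl,)
    return low * gate, gate

# Toy inputs: a 4x4 high-level map and an 8x8 low-level map, 8 channels each.
rng = np.random.default_rng(0)
high = rng.normal(size=(16, 8))
low = rng.normal(size=(64, 8))

selected, w = spatial_cross_attention(high, low)      # (16, 8), (16, 64)
gated, gate = channel_cross_attention(high, selected) # (16, 8), (8,)
fused = high + gated                                  # assumed additive fusion
```

In this sketch the spatial step aligns low-level details to the high-level grid first, so the channel step can compare the two levels position by position. A real model would presumably use learned query, key, and value projections and operate on convolutional feature maps.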


Download File

123444.pdf - Published Version
Available under License Creative Commons Attribution.

Official URL or Download Paper: https://ieeexplore.ieee.org/document/11328080/

Additional Metadata

Item Type: Article
Subject: Computer Science (all)
Subject: Materials Science (all)
Divisions: Faculty of Computer Science and Information Technology
Divisions: Faculty of Medicine and Health Science
DOI Number: https://doi.org/10.1109/ACCESS.2026.3650782
Publisher: Institute of Electrical and Electronics Engineers
Keywords: Deep learning; Multi-scale feature fusion; TB; Tuberculosis
Depositing User: MS. HADIZAH NORDIN
Date Deposited: 10 Mar 2026 02:05
Last Modified: 10 Mar 2026 02:05
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1109/ACCESS.2026.3650782
URI: http://psasir.upm.edu.my/id/eprint/123444