UPM Institutional Repository

Large-scale date palm tree segmentation from multiscale UAV-based and aerial images using deep vision transformers


Citation

Gibril, Mohamed Barakat A. and Mohd Shafri, Helmi Zulhaidi and Al-Ruzouq, Rami and Shanableh, Abdallah and Nahas, Faten and Al Mansoori, Saeed (2023) Large-scale date palm tree segmentation from multiscale UAV-based and aerial images using deep vision transformers. Drones, 7 (2). art. no. 93. ISSN 2504-446X

Abstract

The reliable and efficient large-scale mapping of date palm trees from remotely sensed data is crucial for developing palm tree inventories, continuous monitoring, vulnerability assessments, environmental control, and long-term management. Given the increasing availability of UAV images with limited spectral information, the high intra-class variance of date palm trees, the variations in the spatial resolutions of the data, and the differences in image contexts and backgrounds, accurate mapping of date palm trees from very-high spatial resolution (VHSR) images can be challenging. This study aimed to investigate the reliability and the efficiency of various deep vision transformers in extracting date palm trees from multiscale and multisource VHSR images. Numerous vision transformers, including the Segformer, the Segmenter, the UperNet-Swin transformer, and the dense prediction transformer, with various levels of model complexity, were evaluated. The models were developed and evaluated using a set of comprehensive UAV-based and aerial images. The generalizability and the transferability of the deep vision transformers were evaluated and compared with various convolutional neural network (CNN)-based semantic segmentation models (including DeepLabV3+, PSPNet, FCN-ResNet-50, and DANet). The results of the examined deep vision transformers were generally comparable to those of several CNN-based models. The investigated deep vision transformers achieved satisfactory results in mapping date palm trees from the UAV images, with an mIoU ranging from 85% to 86.3% and an mF-score ranging from 91.62% to 92.44%. Among the evaluated models, the Segformer generated the highest segmentation results on the UAV-based and the multiscale testing datasets. The Segformer model, followed by the UperNet-Swin transformer, outperformed all of the evaluated CNN-based models on the multiscale testing dataset and on the additional unseen UAV testing dataset.
In addition to delivering remarkable results in mapping date palm trees from versatile VHSR images, the Segformer model was among those with a small number of parameters and relatively low computing costs. Collectively, deep vision transformers could be used efficiently in developing and updating inventories of date palms and other tree species.
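The mIoU and mF-score figures quoted in the abstract are class-averaged segmentation metrics. A minimal sketch of how such scores are typically computed for a binary palm/background mask is shown below; this is a hypothetical helper for illustration, not the authors' evaluation code:

```python
import numpy as np

def segmentation_scores(y_true, y_pred, num_classes=2):
    """Per-class IoU and F-score from integer label arrays, averaged over classes.

    Hypothetical helper (not the paper's implementation): labels are
    e.g. 0 = background, 1 = date palm, with y_true and y_pred the
    same shape (flattened prediction and ground-truth masks).
    """
    ious, fscores = [], []
    for c in range(num_classes):
        tp = np.sum((y_true == c) & (y_pred == c))  # true positives for class c
        fp = np.sum((y_true != c) & (y_pred == c))  # false positives
        fn = np.sum((y_true == c) & (y_pred != c))  # false negatives
        ious.append(tp / (tp + fp + fn))            # intersection over union
        fscores.append(2 * tp / (2 * tp + fp + fn)) # F-score (Dice)
    return float(np.mean(ious)), float(np.mean(fscores))

# Toy 1-D example: 6 pixels with one false positive and one false negative.
gt   = np.array([0, 0, 1, 1, 1, 0])
pred = np.array([0, 1, 1, 1, 0, 0])
miou, mf = segmentation_scores(gt, pred)  # mIoU = 0.5, mF ≈ 0.667
```

In practice these metrics are accumulated over all test tiles before averaging; libraries such as MMSegmentation or TorchMetrics provide equivalent built-in implementations.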


Download File

drones-07-00093-v2.pdf - Published Version

Download (9MB)
Official URL: https://www.mdpi.com/2504-446X/7/2/93

Additional Metadata

Item Type: Article
Divisions: Faculty of Engineering
DOI Number: https://doi.org/10.3390/drones7020093
Publisher: MDPI
Keywords: Vision transformer; Semantic segmentation; Tree crown delineation; Segformer; Swin transformer; Segmenter; Dense prediction transformer; CNN; Life on land
Depositing User: Mohamad Jefri Mohamed Fauzi
Date Deposited: 05 Sep 2024 07:05
Last Modified: 05 Sep 2024 07:05
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.3390/drones7020093
URI: http://psasir.upm.edu.my/id/eprint/110312
