UPM Institutional Repository

Extended spatial decision tree algorithm for classifying hotspot occurrence


Citation

Sitanggang, Imas Sukaesih (2013) Extended spatial decision tree algorithm for classifying hotspot occurrence. PhD thesis, Universiti Putra Malaysia.

Abstract

Forest fires in Riau Province, Indonesia, are a yearly disaster, especially in the dry season. They cause many negative effects in various aspects of life for people in Indonesia and in neighboring countries, including Singapore and Malaysia. To minimize these negative effects, classifying the occurrence of hotspots (active fires) is an essential fire-prevention activity. Existing methods for classifying hotspot occurrence, including logistic regression and decision tree algorithms, do not include spatial objects in the forest fire dataset because they are designed for non-spatial datasets. However, the supporting factors for hotspot occurrence are mostly represented as spatial objects. Spatial objects should therefore be included in forest fire datasets in order to obtain classifiers with high accuracy. This work proposes a new spatial decision tree algorithm, the extended spatial ID3 decision tree algorithm, to classify hotspot occurrence from a forest fire dataset that contains point, line, and polygon features. The method extends an existing spatial decision tree algorithm that works on polygon features only. The proposed algorithm uses spatial information gain to choose the best splitting layer from a set of explanatory layers. A new formula for spatial information gain is proposed that uses spatial measures for point, line, and polygon features. The extended spatial ID3 algorithm was applied to a real forest fire dataset consisting of ten explanatory layers (river, road, city center, land cover, source of income, precipitation in mm/day, screen temperature in K, 10 m wind speed in m/s, peatland type, and peatland depth) and a target layer. The target layer consists of true alarm data (hotspots in 2008) and false alarm data. The result is a spatial decision tree with 134 leaves and an accuracy of 71.12%.
After pruning, the spatial decision tree is smaller, with 122 leaves, and its accuracy is 71.66%. For comparison, classifiers for hotspot occurrence were also developed using non-spatial methods, namely the ID3 algorithm, the C4.5 algorithm, and logistic regression. The accuracies of the decision trees generated by the ID3 and C4.5 algorithms are 49.02% and 65.24%, respectively, while the accuracy of the logistic regression model is 68.63%. Empirical results on the real spatial forest fire dataset demonstrate that the extended spatial ID3 algorithm performs better in terms of accuracy than the non-spatial methods. The spatial decision tree was also tested on a new forest fire dataset containing hotspots in 2010. The experimental results show that the accuracy of the tree without pruning is 60.06%, while the accuracy of the tree with pruning is 61.89%. The pruned tree is unable to classify about 4.24% of the objects in the new dataset. These unclassified objects are mostly located in non-peatland areas where the inhabitants' sources of income are forestry and agriculture. Moreover, most of the unclassified objects are located in plantation and dryland forest.
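
The abstract does not state the thesis's formula for spatial information gain, so the following is only an illustrative sketch of the general idea: an ID3-style information gain in which object counts are replaced by a non-negative spatial measure (assumed here to be a count for points, a length for lines, and an area for polygons). The function names, the record layout `(layer_value, target_class, measure)`, and the sample values are hypothetical and are not taken from the thesis.

```python
import math
from collections import defaultdict


def entropy(measure_by_class):
    """Shannon entropy (bits) of a class distribution weighted by spatial measure."""
    total = sum(measure_by_class.values())
    if total == 0:
        return 0.0
    h = 0.0
    for m in measure_by_class.values():
        if m > 0:
            p = m / total
            h -= p * math.log2(p)
    return h


def spatial_information_gain(records):
    """ID3-style gain for one explanatory layer.

    records: iterable of (layer_value, target_class, measure), where measure
    is an assumed spatial measure (count, length, or area) of the objects
    relating that layer value to that target class.
    """
    by_class = defaultdict(float)
    by_value = defaultdict(lambda: defaultdict(float))
    for value, cls, measure in records:
        by_class[cls] += measure
        by_value[value][cls] += measure
    total = sum(by_class.values())
    if total == 0:
        return 0.0
    # Expected entropy after splitting on the layer, weighted by measure.
    remainder = sum(
        sum(classes.values()) / total * entropy(classes)
        for classes in by_value.values()
    )
    return entropy(by_class) - remainder


def best_splitting_layer(layers):
    """Return the name of the layer with the highest gain."""
    return max(layers, key=lambda name: spatial_information_gain(layers[name]))


# Hypothetical toy data: "T" = true alarm, "F" = false alarm.
layers = {
    "peatland_type": [("peat", "T", 2.0), ("non_peat", "F", 2.0)],
    "road": [("near", "T", 1.0), ("near", "F", 1.0),
             ("far", "T", 3.0), ("far", "F", 3.0)],
}
best = best_splitting_layer(layers)
```

In this toy input the hypothetical `peatland_type` layer separates the two classes completely while `road` does not separate them at all, so the former is selected as the splitting layer. A full decision tree would repeat this selection recursively on each resulting subset.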


Additional Metadata

Item Type: Thesis (PhD)
Subject: Algorithms - Data processing
Subject: Geographic information systems
Subject: Forest fires - Database
Call Number: FSKTM 2013 6
Chairman Supervisor: Razali Yaakob, PhD
Divisions: Faculty of Computer Science and Information Technology
Depositing User: Haridan Mohd Jais
Date Deposited: 14 Jan 2016 03:08
Last Modified: 14 Jan 2016 03:08
URI: http://psasir.upm.edu.my/id/eprint/38640