UPM Institutional Repository

An improved machine learning model of massive Floating Car Data (FCD) based on Fuzzy-MDL and LSTM-C for traffic speed estimation and prediction


Citation

Ahanin, Fatemeh (2023) An improved machine learning model of massive Floating Car Data (FCD) based on Fuzzy-MDL and LSTM-C for traffic speed estimation and prediction. Doctoral thesis, Universiti Putra Malaysia.

Abstract

In today’s world, traffic congestion is a major problem in almost all metropolitan areas. Much previous research has developed methods to improve the accuracy of Traffic State Prediction (TSP), but these methods are designed around static sensors such as video cameras and inductive loop detectors. Static sensors cannot store long traffic flow patterns or capture the dynamics of traffic flow, and they are expensive to install. Floating Car Data (FCD) is a convenient and cost-effective way to gather traffic condition information. FCD comes from GPS sensors that can probe traffic flows on a large scale in real time. Although FCD can cover more road segments across the road network than static sensors, GPS data are prone to gaps caused by urban canyons and tall buildings, which affect traffic prediction accuracy. Missing data (known as data sparsity) makes traffic prediction even more complicated. Existing TSP methods follow one of two approaches: prediction with Traffic State Estimation (TSE) or prediction without TSE. TSE estimates missing values in traffic states such as speed and density to reduce data sparsity, whereas TSP uses the traffic data to forecast the traffic state within a certain future time period. When the dataset contains missing data, TSP may use TSE to estimate the missing values and then perform prediction. The aim of this thesis is to improve the accuracy of TSP, both with and without TSE, by improving LSTM. Three methods are proposed in this study. In the first method, a new algorithm called LSTM-C (Long Short-Term Memory (LSTM) with Contrast) is proposed to improve traffic speed prediction without TSE. Existing research in traffic speed prediction used LSTM with a single variable (traffic speed) and with multiple variables (traffic speed and vehicle headway).
However, multivariate LSTM does not predict traffic speed significantly better than single-variable LSTM. This indicates that the LSTM model needs improvement in identifying traffic speed changes within a certain time period. Thus, this study improves traffic speed prediction using LSTM with a Contrast Measure, which detects decreasing and increasing patterns in traffic speed. The proposed LSTM-C method and the previous LSTM work achieved speed prediction accuracies of 96.67% and 94.86% respectively. In the second method, a new traffic estimation method is proposed using Fuzzy C-Means (FCM) clustering and Minimum Description Length (MDL). MDL uses patterns to express the repeated presence of particular items or clusters in the data. Existing research has used spectral clustering and the Hidden Markov Model (HMM) to detect patterns for estimating traffic speed. HMMs are well suited to capturing first-order dependencies, also known as Markov dependencies. In an HMM, the future state (or observation) depends only on the current state and is independent of past states. This behaviour makes HMMs less effective for estimating traffic data, because it may be necessary to consider several previous states when estimating a missing state. This thesis uses Fuzzy C-Means and the concept of MDL to form patterns and estimate the missing traffic state based on n previous states. The implementation results show that the proposed Fuzzy-MDL method achieved an accuracy of 96.46%, outperforming the HMM-based model, which achieved 93.14%. In the third method, a hybrid algorithm called LSTM-C-EST, which combines Fuzzy-MDL and LSTM-C, is proposed. The idea behind this method is that estimating the missing traffic speed values can improve the traffic prediction results. In this model, Fuzzy-MDL is applied as a pre-processing step to estimate the missing traffic speeds.
The estimated data are then used to predict the traffic speed over the next 5 minutes. The results of this model are compared with LSTM-C as well as with the study by Shuming Sun et al. (2019), which performed traffic estimation and traffic prediction using HMM and a single-variable LSTM. The accuracies of LSTM-C-EST, LSTM-C and LSTM are 98.05%, 96.69% and 94.90% respectively, which shows that LSTM-C-EST outperforms the other two algorithms.
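
The abstract names Fuzzy C-Means clustering as one ingredient of the Fuzzy-MDL estimator but does not give the algorithm itself. As a rough illustration only, the sketch below implements the standard textbook Fuzzy C-Means update on a one-dimensional array of speeds. It is not the thesis's Fuzzy-MDL method: the MDL pattern step is omitted, and the function name `fuzzy_c_means`, its parameters, and the sample speeds are all assumptions made for this example.

```python
import numpy as np

def fuzzy_c_means(x, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means on a 1-D array (hypothetical helper).

    Returns (centers, memberships); memberships has shape
    (n_clusters, n_samples) and each column sums to 1.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised per sample.
    u = rng.random((n_clusters, x.size))
    u /= u.sum(axis=0)
    for _ in range(n_iter):
        # Cluster centers: membership-weighted means with fuzzifier m.
        w = u ** m
        centers = (w @ x) / w.sum(axis=1)
        # Distances from every sample to every center (offset avoids 0).
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=0)
        converged = np.max(np.abs(new_u - u)) < tol
        u = new_u
        if converged:
            break
    return centers, u

# Made-up speeds (km/h) with a slow group and a fast group.
speeds = [22.0, 25.0, 24.0, 61.0, 58.0, 63.0]
centers, memberships = fuzzy_c_means(speeds, n_clusters=2)
```

Each speed receives a degree of membership in every cluster rather than a single hard label, which is the property that distinguishes fuzzy clustering from hard clustering such as k-means.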


Official URL or Download Paper: http://ethesis.upm.edu.my/id/eprint/18473

Additional Metadata

Item Type: Thesis (Doctoral)
Subject: Traffic forecasting
Subject: Machine learning
Subject: Fuzzy systems
Call Number: FSKTM 2023 5
Chairman Supervisor: Associate Professor Norwati Mustapha, PhD
Divisions: Faculty of Computer Science and Information Technology
Depositing User: Ms. Rohana Alias
Date Deposited: 09 Oct 2025 02:25
Last Modified: 09 Oct 2025 02:25
URI: http://psasir.upm.edu.my/id/eprint/119704
