UPM Institutional Repository

A modified reweighted fast consistent and high-breakdown estimator for high-dimensional datasets


Citation

A. Baba, Ishaq and Midi, Habshah and June, Leong W. and Ibragimov, Gafurjan (2024) A modified reweighted fast consistent and high-breakdown estimator for high-dimensional datasets. Decision Analytics Journal, 10. art. no. 100424. pp. 1-11. ISSN 2772-6622

Abstract

Outlier detection and classification algorithms play a critical role in statistical analysis. The reweighted fast consistent and high breakdown point (RFCH) estimator is an outlier-resistant estimator of multivariate location and dispersion. Still, some difficulties hamper the application of the RFCH in high-dimensional settings. One main difficulty is that the RFCH cannot be applied when the dimension exceeds the sample size. We propose a modified reweighted fast consistent and high breakdown point (MRFCH) estimator to make it applicable to high-dimensional settings. The basic idea of our proposed method is to modify the Mahalanobis distance so that it uses only the diagonal elements of the scatter matrix in the computation of the RFCH algorithm. The proposed method preserves the robustness properties of the RFCH estimator. As a result, we achieve a robust and efficient high-dimensional procedure for computing location and scatter matrix estimates and a powerful outlier detection method. One of the main advantages of our proposed procedure over the existing RFCH is that it can be applied to both low- and high-dimensional datasets. In real-life datasets and a simulation study, our proposed method showed promising results irrespective of sample size, dimension, amount of contamination, computational time, and distance of the contamination. Thus, the newly proposed algorithm can be applied to solve the problem of regression outliers in high-dimensional data (HDD) and serve as a better alternative to the minimum regularized covariance determinant (MRCD) estimator. © 2024 The Author(s)
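
The central idea in the abstract, a Mahalanobis-type distance that uses only the diagonal elements of the scatter matrix, can be illustrated with a short sketch. This is not the authors' MRFCH algorithm: the function names are hypothetical, and the coordinatewise median and the scaled median absolute deviation (MAD) are assumed here as simple stand-ins for the robust location and scatter estimates that the RFCH steps would supply. The sketch only shows why the diagonal form remains computable when the dimension exceeds the sample size, where the full sample covariance matrix is singular.

```python
import numpy as np


def diagonal_mahalanobis(X, location, scatter_diag):
    """Squared distances using only the diagonal of a scatter matrix.

    d_i^2 = sum_j (x_ij - location_j)^2 / scatter_diag_j
    No matrix inverse is needed, so this works even when p > n.
    """
    X = np.asarray(X, dtype=float)
    centered = X - np.asarray(location, dtype=float)
    return np.sum(centered**2 / np.asarray(scatter_diag, dtype=float), axis=1)


def robust_location_and_diag(X):
    """Assumed stand-in estimates (not the RFCH steps):
    coordinatewise median, and squared MAD scaled by 1.4826."""
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    return med, mad**2


# Simulated data with more variables than observations (p > n).
rng = np.random.default_rng(0)
n, p = 20, 100
X = rng.normal(size=(n, p))
X[:2] += 10.0  # shift the first two rows to act as outliers

# The full sample covariance has rank at most n - 1, so it cannot be inverted.
full_cov_rank = np.linalg.matrix_rank(np.cov(X, rowvar=False))

loc, diag = robust_location_and_diag(X)
d2 = diagonal_mahalanobis(X, loc, diag)

# Indices of the two observations with the largest distances.
flagged = np.argsort(d2)[-2:]
```

In this simulated setting the two shifted rows receive by far the largest distances, while a classical Mahalanobis distance could not be computed at all because the 100 x 100 covariance matrix has rank below 100.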


Additional Metadata

Item Type: Article
Divisions: Institute for Mathematical Research
DOI Number: https://doi.org/10.1016/j.dajour.2024.100424
Publisher: Elsevier
Keywords: Covariance matrix; High-dimensional data; Mahalanobis distance; Outliers detection; Reweighted fast consistent and high breakdown point
Depositing User: Ms. Azian Edawati Zakaria
Date Deposited: 28 Oct 2024 02:34
Last Modified: 28 Oct 2024 02:34
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1016/j.dajour.2024.100424
URI: http://psasir.upm.edu.my/id/eprint/112070
