UPM Institutional Repository

Improved robust principal component analysis based on minimum regularized covariance determinant for the detection of high leverage points in high dimensional data (penambahbaikan analisis komponen utama berdasarkan penentu kovarian teratur minimum bagi mengecam titik tuasan tinggi untuk data dimensi tinggi)


Citation

Midi, Habshah and Suhaiza, Jaaz and Mohd Aslam, . and Hani Syahida, . and Emi Amielda, . (2025) Improved robust principal component analysis based on minimum regularized covariance determinant for the detection of high leverage points in high dimensional data (penambahbaikan analisis komponen utama berdasarkan penentu kovarian teratur minimum bagi mengecam titik tuasan tinggi untuk data dimensi tinggi). Sains Malaysiana, 54 (8). pp. 2087-2097. ISSN 0126-6039; eISSN: 2735-0118

Abstract

This paper presents an extension of robust principal component analysis (ROBPCA), denoted IRPCA, that improves the accuracy of detecting high leverage points (HLPs) in high dimensional data (HDD). IRPCA employs principal component analysis (PCA) to reduce the dimension of the data set and then obtains robust location and scatter estimates of the PC scores based on the Minimum Regularized Covariance Determinant (MRCD). Instead of the robust score distance used in ROBPCA, the proposed IRPCA detects HLPs with the Robust Mahalanobis distance (RMD). The performance of IRPCA is compared with that of ROBPCA and the Minimum Regularized Covariance Determinant and PCA-based method (MRCD-PCA) for the identification of HLPs in HDD. The results indicate that all three methods detect HLPs very successfully, with no masking effect. Nonetheless, ROBPCA suffers from serious swamping problems when the percentage of HLPs is below 30%. The proposed IRPCA and MRCD-PCA perform similarly, with a very small swamping effect. However, the MRCD-PCA algorithm is quite cumbersome and requires a longer computational running time. The attractive feature of IRPCA is that it provides a simpler algorithm and is very fast.
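The pipeline described in the abstract (PCA for dimension reduction, a robust location and scatter fit on the PC scores, then a robust Mahalanobis distance with a cutoff) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. Several parts are assumptions that do not come from the abstract: a plain concentration-step (C-step) estimator stands in for MRCD, the number of components `k`, the subset fraction `h_frac`, the median-based rescaling of the scatter, and the chi-square cutoff (computed with the Wilson-Hilferty approximation) are all hypothetical choices, and every function name is invented for this example.

```python
import numpy as np
from statistics import NormalDist


def chi2_quantile(prob, df):
    """Wilson-Hilferty approximation to the chi-square quantile (assumed cutoff rule)."""
    z = NormalDist().inv_cdf(prob)
    return df * (1 - 2 / (9 * df) + z * np.sqrt(2 / (9 * df))) ** 3


def pca_scores(X, k):
    """Project column-centred data onto its first k principal components."""
    Xc = X - X.mean(axis=0)
    # The thin SVD also works when there are more variables than observations.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def squared_distances(Z, mu, S):
    """Squared Mahalanobis distance of each row of Z from (mu, S)."""
    diff = Z - mu
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(S), diff)


def robust_fit(Z, h_frac=0.75, n_iter=100):
    """Stand-in for MRCD (assumption): C-steps started from a median/MAD fit."""
    n, k = Z.shape
    h = int(np.ceil(h_frac * n))
    med = np.median(Z, axis=0)
    mad = np.median(np.abs(Z - med), axis=0) + 1e-12
    subset = np.argsort((((Z - med) / mad) ** 2).sum(axis=1))[:h]
    for _ in range(n_iter):
        # Refit on the current subset, then keep the h closest points.
        mu = Z[subset].mean(axis=0)
        S = np.atleast_2d(np.cov(Z[subset], rowvar=False))
        new_subset = np.argsort(squared_distances(Z, mu, S))[:h]
        if np.array_equal(np.sort(new_subset), np.sort(subset)):
            break
        subset = new_subset
    # Rescale the scatter so the median squared distance matches the chi-square median.
    S = S * np.median(squared_distances(Z, mu, S)) / chi2_quantile(0.5, k)
    return mu, S


def flag_hlps(X, k=2, prob=0.975):
    """Return (boolean HLP flags, robust Mahalanobis distances) for the rows of X."""
    Z = pca_scores(X, k)
    mu, S = robust_fit(Z)
    rmd = np.sqrt(squared_distances(Z, mu, S))
    return rmd > np.sqrt(chi2_quantile(prob, k)), rmd
```

Calling `flag_hlps(X, k=2)` on an `n x p` array returns one flag and one distance per observation. Because the robust fit is done on `k` scores rather than on all `p` variables, the scatter matrix stays invertible even when `p` exceeds `n`.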


Download File

120869.pdf - Published Version


Additional Metadata

Item Type: Article
Divisions: Faculty of Science
Institute for Mathematical Research
DOI Number: https://doi.org/10.17576/jsm-2025-5408-17
Publisher: Penerbit Universiti Kebangsaan Malaysia
Keywords: High leverage point; Minimum regularized covariance determinant; Principal component analysis; Robust Mahalanobis distance
Depositing User: MS. HADIZAH NORDIN
Date Deposited: 14 Oct 2025 04:09
Last Modified: 14 Oct 2025 04:09
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.17576/jsm-2025-5408-17
URI: http://psasir.upm.edu.my/id/eprint/120869
