UPM Institutional Repository

Robust Random Regression Imputation method for missing data in the presence of outliers


Citation

John, Ahamefule Happy (2013) Robust Random Regression Imputation method for missing data in the presence of outliers. Masters thesis, Universiti Putra Malaysia.

Abstract

The Ordinary Least Squares (OLS) estimator is the best regression estimator when all of its assumptions are met. However, missing data and outliers can distort OLS estimation and increase the variability of the parameter estimates. The main focus of this research is to develop remedial measures for missing data in regression in the presence of outliers. In regression analysis, the dependent variable (Y) is modelled as a function of the independent variable (X), so outliers and missing values can occur in both the X and Y directions. When values are missing in the Y direction, it is very common to use the OLS-based Random Regression Imputation (RRI). RRI appears to be a good method when the data contain no outliers, but it performs poorly when outliers are present, because it is an OLS-based imputation method and OLS is strongly affected by outliers. We therefore modified the OLS-based RRI by incorporating the robust MM estimate, which is less affected by outliers; the resulting method is the Robust Random Regression Imputation (RRRI). The proposed method is compared with some well-known methods of estimating missing data, and the results indicate that RRRI outperforms the existing methods in the presence of outliers. Because outliers and missing data can occur in both directions, we also considered the situation in which observations are missing in the explanatory variable X. In this setting, the Dummy Variable (DV) approach is one of the best approaches for handling the missing data. However, this approach also performs poorly in the presence of outliers. As an alternative, a Robust Inverse Regression Technique is proposed to obtain better estimates. Real data examples and Monte Carlo simulation studies show that the proposed robust methods perform better than the classical methods.
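
The idea behind imputing missing Y values from a robust fit can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a single predictor, assumes the random component is a normal draw with the fitted residual scale, and swaps in a Huber M-estimate (computed by iteratively reweighted least squares) as a simpler stand-in for the MM estimate named in the abstract. All function names (`weighted_line`, `ols_fit`, `huber_fit`, `impute_y`) and the simulated data are hypothetical.

```python
import math
import random
import statistics


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def ols_fit(x, y):
    """Classical OLS fit; returns (a, b, residual standard error)."""
    a, b = weighted_line(x, y, [1.0] * len(x))
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(ssr / (len(x) - 2))


def huber_fit(x, y, c=1.345, iters=100):
    """Huber M-estimate via IRLS (assumed stand-in for the MM estimate).

    Returns (a, b, scale), where scale is the normalised MAD of the
    residuals computed in the last reweighting step.
    """
    a, b = weighted_line(x, y, [1.0] * len(x))  # start from OLS
    scale = 1.0
    for _ in range(iters):
        res = [yi - a - b * xi for xi, yi in zip(x, y)]
        med = statistics.median(res)
        scale = statistics.median([abs(r - med) for r in res]) / 0.6745 or 1e-12
        # Huber weights: full weight for small residuals, reduced weight otherwise
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in res]
        a, b = weighted_line(x, y, w)
    return a, b, scale


def impute_y(x, y, fit, rng):
    """Replace missing y (None) by the fitted value plus a random normal residual."""
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    a, b, sigma = fit([p[0] for p in obs], [p[1] for p in obs])
    return [yi if yi is not None else a + b * xi + rng.gauss(0.0, sigma)
            for xi, yi in zip(x, y)]


# Simulated data (hypothetical): true line y = 2 + 3x, three outliers, three missing Y
rng = random.Random(0)
x = list(range(30))
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
for i in (27, 28, 29):
    y[i] += 60.0   # contaminate three responses
for i in (4, 11, 20):
    y[i] = None    # mark three responses as missing

obs_x = [xi for xi, yi in zip(x, y) if yi is not None]
obs_y = [yi for yi in y if yi is not None]
b_ols = ols_fit(obs_x, obs_y)[1]
b_rob = huber_fit(obs_x, obs_y)[1]
completed = impute_y(x, y, huber_fit, rng)
print(f"OLS slope: {b_ols:.2f}, robust slope: {b_rob:.2f}")
```

Passing `ols_fit` instead of `huber_fit` to `impute_y` gives the classical variant, so the two can be compared on the same contaminated sample. A Huber M-estimate does not have the high breakdown point of an MM estimate, so this sketch only illustrates the mechanism, not the performance reported in the thesis.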


Additional Metadata

Item Type: Thesis (Masters)
Subject: Sampling (Statistics)
Subject: Quantitative research
Subject: Regression analysis
Call Number: FS 2013 42
Chairman Supervisor: Md. Sohel Rana, PhD
Divisions: Faculty of Science
Depositing User: Haridan Mohd Jais
Date Deposited: 28 Nov 2016 08:05
Last Modified: 28 Nov 2016 08:05
URI: http://psasir.upm.edu.my/id/eprint/49818