UPM Institutional Repository

Hybrid performance measures and mixed evaluation method for data classification problems


Citation

Hossin, Mohammad (2012) Hybrid performance measures and mixed evaluation method for data classification problems. PhD thesis, Universiti Putra Malaysia.

Abstract

This study investigates two issues concerning performance measures in data classification problems. First, it examines the use of the accuracy measure as a discriminator for building an optimized Prototype Selection (PS) algorithm. Second, it evaluates current practices for evaluating and comparing two performance measures. According to the literature, using accuracy can degrade the evaluation process because it produces less distinctive and less discriminable values, and it does not perform optimally when confronted with imbalanced class problems. Nevertheless, the accuracy measure is still widely used in evaluating data classification problems. Regarding evaluation analysis, many previous studies emphasize generalization ability when evaluating and comparing performance measures. Only a few efforts have been dedicated to evaluating and comparing performance measures using different performance characteristics, and no previous study has employed a mixed evaluation method for this purpose. To tackle the first issue, this study proposes several hybrid measures that combine accuracy with the precision and recall measures. These hybrid measures are known as Optimized Accuracy with Conventional Recall-Precision (OACRP) and Optimized Accuracy with Extended Recall-Precision version 1 and version 2 (OAERP1 and OAERP2). More importantly, the OAERP1 and OAERP2 measures have been extended for evaluating multi-class problems. For the second issue, this study proposes a mixed evaluation method that evaluates the performance of two performance measures through different performance characteristics. For a systematic analysis, the mixed evaluation method is implemented in two stages.
First, the hybrid measures are compared and analyzed against the accuracy measure based on the values they produce across different classification problems with different class distributions. Second, the hybrid measures are compared and analyzed empirically against the accuracy measure and other selected performance measures based on generalization ability, using three selected PS algorithms (MCS, LVQ21 and GA) and large benchmark datasets. In the first evaluation stage, the OAERP2 measure produced better values than the accuracy, OACRP and OAERP1 measures in terms of distinctiveness, discriminability, informativeness, favoring the minority class, and degree of consistency and discriminancy. In the second evaluation stage, almost all selected algorithms optimized by the OAERP2 measure achieved better generalization ability than with their original measure and other selected performance measures. Moreover, the GA model optimized by the OAERP2 measure (GAoe2) differed statistically significantly from the other OAERP2-based models according to the win-draw-loss evaluation method and two nonparametric tests. The GAoe2 model also differed statistically significantly from nine additional PS algorithms in terms of testing error and storage requirements. Taken together, the evaluations show that the OAERP2 measure is able to choose a better solution during classification training, which leads to a better-trained PS classifier with better generalization ability. In addition, the mixed evaluation method enabled this study to evaluate and compare the studied performance measures systematically and comprehensively via different performance characteristics.
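
The abstract's point that accuracy yields less discriminable values on imbalanced data can be illustrated with a minimal Python sketch. It does not reproduce the OACRP or OAERP formulas, which are not given in this record; it only shows two candidate solutions that accuracy cannot tell apart while precision and recall can. The labels, the two prediction vectors and the helper names `confusion_counts` and `summarize` are hypothetical and chosen purely for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Compute accuracy, precision and recall for the positive (minority) class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Report 0.0 when the denominator is empty (no positive predictions/labels).
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical imbalanced data: 10 minority instances (1), 90 majority instances (0).
y_true = [1] * 10 + [0] * 90

# Candidate A ignores the minority class entirely.
pred_a = [0] * 100

# Candidate B finds half of the minority class at the cost of 5 false alarms.
pred_b = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 85

scores_a = summarize(y_true, pred_a)
scores_b = summarize(y_true, pred_b)

print("A:", scores_a)
print("B:", scores_b)
```

Both candidates reach the same accuracy, yet only candidate B recovers any minority-class instances. A search procedure guided by accuracy alone has no basis for preferring one over the other, which is the kind of gap a measure combining accuracy with precision and recall is meant to close.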


Download File

FSKTM 2012 22R.pdf (PDF, 753kB)

Additional Metadata

Item Type: Thesis (PhD)
Subject: Computer algorithms
Subject: Machine learning
Call Number: FSKTM 2012 22
Chairman Supervisor: Associate Professor Dr. Md. Nasir Sulaiman, PhD
Divisions: Faculty of Computer Science and Information Technology
Depositing User: Haridan Mohd Jais
Date Deposited: 04 Mar 2015 08:35
Last Modified: 04 Mar 2015 08:35
URI: http://psasir.upm.edu.my/id/eprint/33140
