Citation
Mohamed Yassin, Warusia
(2011)
An improved hybrid learning approach for better anomaly detection.
Masters thesis, Universiti Putra Malaysia.
Abstract
Intrusion Detection Systems (IDS) face complex requirements in preventing modern attacks from damaging computer systems. An IDS must prevent unauthorized access to files, attempts to damage the network and data, and any other serious security threat. Anomaly detection is one intrusion detection technique; it identifies activity that deviates from normal behaviour. Nonetheless, current anomaly detection techniques are unable to detect all types of attacks accurately. As a result, anomaly detection is often associated with high false alarm rates and only moderate accuracy and detection rates.
In recent years, data mining approaches to intrusion detection have been proposed and used, such as neural networks, clustering, genetic algorithms, decision trees, and support vector machines. These approaches have achieved high accuracy and good detection rates, but with moderate false alarm rates on novel attacks. A recent approach proposed by Tsai et al. (2010), called Triangle Area Based Nearest Neighbor (TANN), aims to obtain high accuracy and detection rates with low false alarm rates. Unfortunately, this approach has not shown a remarkable improvement, and some attacks and normal connections are still not detected correctly. Therefore, there is a need for an approach that can detect and identify such attacks accurately in an interconnected network.
In this thesis, an improved hybrid mining approach is proposed that combines K-Means clustering with classification techniques. K-Means clustering is an anomaly detection technique that is naturally capable of dealing with huge volumes of data in high-speed networks. It divides data into groups called clusters, such that all data in the same cluster are similar to each other. The proposed hybrid approach first clusters all data into the corresponding groups and then applies a classifier. We choose k=3 in order to cluster the data into three clusters called C1, C2 and C3. Probe, U2R and R2L attack data are grouped into C1, DoS attack data are grouped into C2, and C3 is used to separate normal data from attacks. Next, classifiers such as Naïve Bayes, OneR, and Random Forest are each applied separately to these data to assign all data to the right categories.
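The two-stage idea described above can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not say how the classifier is attached to the clusters, so this sketch assumes one Gaussian Naïve Bayes model is trained per cluster and that new records are routed to the model of their nearest centroid. The function names, the starting centroids, and the two-feature toy records are all hypothetical and are not taken from the KDD Cup '99 data.

```python
import math
from collections import defaultdict


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: dist2(point, centroids[i]))


def kmeans(points, init_centroids, iterations=20):
    """Lloyd's algorithm, started from caller-supplied centroids."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(iterations):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, members in groups.items():
            centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def fit_gaussian_nb(rows, labels):
    """Per class: prior, feature means, feature variances (floored)."""
    by_class = defaultdict(list)
    for r, y in zip(rows, labels):
        by_class[y].append(r)
    model = {}
    for y, members in by_class.items():
        cols = list(zip(*members))
        means = [sum(col) / len(col) for col in cols]
        variances = [max(sum((v - m) ** 2 for v in col) / len(col), 1e-6)
                     for col, m in zip(cols, means)]
        model[y] = (len(members) / len(rows), means, variances)
    return model


def predict_nb(model, row):
    """Class with the highest log-posterior under the Gaussian model."""
    def log_post(y):
        prior, means, variances = model[y]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances))
    return max(model, key=log_post)


def fit_hybrid(rows, labels, init_centroids):
    """Stage 1: cluster. Stage 2 (assumption): one classifier per cluster."""
    centroids = kmeans(rows, init_centroids)
    per_cluster = defaultdict(lambda: ([], []))
    for r, y in zip(rows, labels):
        c = nearest(r, centroids)
        per_cluster[c][0].append(r)
        per_cluster[c][1].append(y)
    models = {c: fit_gaussian_nb(rs, ys) for c, (rs, ys) in per_cluster.items()}
    return centroids, models


def predict_hybrid(centroids, models, row):
    """Route the record to its nearest cluster, then classify within it."""
    return predict_nb(models[nearest(row, centroids)], row)


# Hypothetical toy records (two made-up features), not KDD Cup '99 data.
rows = [(1.0, 8.5), (1.2, 8.6), (1.0, 7.4), (1.1, 7.5),
        (8.0, 8.0), (8.2, 7.9), (7.9, 8.1),
        (1.0, 1.0), (1.2, 0.9), (0.9, 1.1)]
labels = ["probe", "probe", "r2l", "r2l",
          "dos", "dos", "dos",
          "normal", "normal", "normal"]
# k=3; starting points chosen so the clusters mirror C1, C2, C3 in order.
centroids, models = fit_hybrid(rows, labels, [(1, 8), (8, 8), (1, 1)])
```

In this toy setup the first cluster mixes two minority attack types, so the per-cluster classifier does the work of telling them apart, while the other two clusters each contain a single class. Swapping in OneR or Random Forest would only change the second stage.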
An experiment is carried out to evaluate the performance of the proposed approach and the current techniques in terms of accuracy, detection rate, and false alarm rate, using the Knowledge Discovery in Databases (KDD) Cup '99 intrusion detection dataset. The data cover four main types of attack: Denial of Service (DoS), User to Root (U2R), Remote to Local (R2L), and Probe. Results show that the proposed approach performs better in terms of accuracy and detection rate, and significantly reduces the false alarm rate.
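The three evaluation measures named above can be computed from a confusion matrix. The abstract does not give their formulas, so the sketch below assumes the usual definitions: accuracy = (TP + TN) / all records, detection rate = TP / (TP + FN), and false alarm rate = FP / (FP + TN), where any non-normal label counts as an attack. The function name and the label lists are hypothetical examples, not results from the thesis.

```python
def ids_metrics(actual, predicted, normal_label="normal"):
    """Accuracy, detection rate and false alarm rate for attack-vs-normal."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        actual_attack = a != normal_label
        predicted_attack = p != normal_label
        if actual_attack and predicted_attack:
            tp += 1  # attack correctly flagged
        elif not actual_attack and not predicted_attack:
            tn += 1  # normal traffic correctly passed
        elif predicted_attack:
            fp += 1  # normal traffic flagged as attack (false alarm)
        else:
            fn += 1  # attack missed
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn) if tp + fn else 0.0,
        "false_alarm_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Hypothetical labels for eight connections, for illustration only.
actual = ["normal", "normal", "normal", "normal", "dos", "dos", "probe", "r2l"]
predicted = ["normal", "normal", "normal", "dos", "dos", "dos", "normal", "r2l"]
metrics = ids_metrics(actual, predicted)
```

Here three of four attacks are flagged, one of four normal connections raises a false alarm, and six of eight records are handled correctly overall.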