RECURSIVE ANTIHUB2 OUTLIER DETECTION IN HIGH DIMENSIONAL DATA

Authors: J. Michael Antony Sylvia & Dr. T. C. Rajakumar
Subject Area: Science and Engineering
Subject: Computer Science
Section: Research Paper

Keywords: kNN, AntiHub, AntiHub2, Recursive AntiHub2


Abstract

Unsupervised outlier detection operates on raw, unlabeled data collected from a system. Identifying anomalies without supervision is harder in high-dimensional data, so the main objective of this work is to propose an unsupervised anomaly detection method for such data. As dimensionality increases, hubs and antihubs emerge: hubs are points that occur frequently in the k-nearest-neighbor (kNN) lists of other points, and antihubs are points that occur infrequently in those lists. The AntiHub method scores a point x by Nk(x), the number of kNN lists in which x appears. AntiHub2 reformulates AntiHub to refine these outlier scores by considering the Nk scores of the neighbors of x in addition to Nk(x) itself. However, discriminating between the outlier scores produced by AntiHub2 takes a long time because it requires a large number of iterations. The Recursive AntiHub2 method is therefore introduced to reduce the computational cost of discriminating the outlier scores, using fewer iterations to detect the most prominent outliers in high-dimensional data.
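The hub and antihub notions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `k_occurrences` and `antihub2_like_scores`, the brute-force Euclidean neighbor search, and the fixed mixing weight `alpha` are all assumptions made for illustration, and the abstract does not specify how AntiHub2 selects its weighting or how Recursive AntiHub2 reduces the iteration count, so neither is reproduced here.

```python
import numpy as np


def k_occurrences(X, k):
    """Return (N_k, nn), where N_k[i] counts how many kNN lists contain
    point i and nn[i] holds the indices of the k nearest neighbors of i."""
    n = X.shape[0]
    # Brute-force pairwise squared Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    nn = np.argsort(d2, axis=1)[:, :k]
    # Count how often each index appears across all kNN lists.
    n_k = np.bincount(nn.ravel(), minlength=n)
    return n_k, nn


def antihub2_like_scores(X, k, alpha=0.5):
    """Hypothetical AntiHub2-style score: blend N_k(x) with the summed
    N_k of the neighbors of x. Lower values suggest stronger outliers.
    The fixed alpha is an illustrative assumption, not the paper's rule."""
    n_k, nn = k_occurrences(X, k)
    neighbor_sum = n_k[nn].sum(axis=1)
    return (1.0 - alpha) * n_k + alpha * neighbor_sum
```

With `alpha=0` the score reduces to the plain Nk(x) count, which corresponds to the AntiHub idea of treating rarely occurring points (antihubs) as outlier candidates; larger `alpha` gives more weight to the neighborhood.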
