Toward supervised anomaly detection
Research output: Journal contributions › Journal articles › Research › peer-review
Standard
In: Journal of Artificial Intelligence Research, Vol. 46, 20.02.2013, p. 235-262.
RIS
TY - JOUR
T1 - Toward supervised anomaly detection
AU - Görnitz, Nico
AU - Kloft, Marius
AU - Rieck, Konrad
AU - Brefeld, Ulf
PY - 2013/2/20
Y1 - 2013/2/20
N2 - Anomaly detection is commonly regarded as an unsupervised learning task because anomalies stem from adversarial or unlikely events with unknown distributions. However, the predictive performance of purely unsupervised anomaly detection often fails to match the detection rates required in many tasks, and labeled data are needed to guide model generation. Our first contribution shows that classical semi-supervised approaches, which originate from a supervised classifier, are inappropriate and hardly detect new and unknown anomalies. We argue that semi-supervised anomaly detection needs to be grounded in the unsupervised learning paradigm and devise a novel algorithm that meets this requirement. Although the optimization problem is intrinsically non-convex, we further show that it has a convex equivalent under relatively mild assumptions. Additionally, we propose an active learning strategy to automatically filter candidates for labeling. In an empirical study on network intrusion detection data, we observe that the proposed learning methodology requires much less labeled data than the state of the art while achieving higher detection accuracies.
AB - Anomaly detection is commonly regarded as an unsupervised learning task because anomalies stem from adversarial or unlikely events with unknown distributions. However, the predictive performance of purely unsupervised anomaly detection often fails to match the detection rates required in many tasks, and labeled data are needed to guide model generation. Our first contribution shows that classical semi-supervised approaches, which originate from a supervised classifier, are inappropriate and hardly detect new and unknown anomalies. We argue that semi-supervised anomaly detection needs to be grounded in the unsupervised learning paradigm and devise a novel algorithm that meets this requirement. Although the optimization problem is intrinsically non-convex, we further show that it has a convex equivalent under relatively mild assumptions. Additionally, we propose an active learning strategy to automatically filter candidates for labeling. In an empirical study on network intrusion detection data, we observe that the proposed learning methodology requires much less labeled data than the state of the art while achieving higher detection accuracies.
KW - Informatics
KW - Learning strategies
KW - Detection accuracy
KW - Empirical studies
KW - Network intrusion detection
KW - Optimization problems
KW - Predictive performance
KW - Supervised classifiers
KW - Unsupervised anomaly detection
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=84875512265&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/5182cca1-961d-3f09-a144-35dd9cc37f97/
U2 - 10.1613/jair.3623
DO - 10.1613/jair.3623
M3 - Journal articles
AN - SCOPUS:84875512265
VL - 46
SP - 235
EP - 262
JO - Journal of Artificial Intelligence Research
JF - Journal of Artificial Intelligence Research
SN - 1076-9757
ER -