Toward supervised anomaly detection

Research output: Journal contributions › Journal articles › Research › peer-review

Anomaly detection is commonly regarded as an unsupervised learning task, since anomalies stem from adversarial or unlikely events with unknown distributions. However, the predictive performance of purely unsupervised anomaly detection often fails to match the detection rates required in many tasks, and labeled data is needed to guide model generation. Our first contribution shows that classical semi-supervised approaches, which originate from a supervised classifier, are inappropriate and hardly detect new and unknown anomalies. We argue that semi-supervised anomaly detection needs to be grounded in the unsupervised learning paradigm, and we devise a novel algorithm that meets this requirement. Although the optimization problem is intrinsically non-convex, we further show that it has a convex equivalent under relatively mild assumptions. Additionally, we propose an active learning strategy to automatically filter candidates for labeling. In an empirical study on network intrusion detection data, we observe that the proposed learning methodology requires much less labeled data than the state of the art while achieving higher detection accuracies.

Original language: English
Journal: Journal of Artificial Intelligence Research
Volume: 46
Pages (from-to): 235-262
Number of pages: 28
ISSN: 1076-9757
Publication status: Published - 20.02.2013
Externally published: Yes

    Research areas

  • Informatics - learning strategies, Detection accuracy, Empirical studies, Network intrusion detection, Optimization problems, Predictive performance, Supervised classifiers, Unsupervised anomaly detection
  • Business informatics
