Active and semi-supervised data domain description

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Data domain description techniques aim at deriving concise descriptions of objects belonging to a category of interest. For instance, the support vector domain description (SVDD) learns a hypersphere enclosing the bulk of the provided unlabeled data, such that points lying outside the ball are considered anomalous. However, relevant information such as expert and background knowledge remains unused in the unsupervised setting. In this paper, we rephrase data domain description as a semi-supervised learning task; that is, we propose a semi-supervised generalization of data domain description (SSSVDD) that processes both unlabeled and labeled examples. The corresponding optimization problem is non-convex. We translate it into an unconstrained, continuous problem that can be optimized accurately by gradient-based techniques. Furthermore, we devise an effective active learning strategy to query low-confidence observations. Our empirical evaluation on network intrusion detection and object recognition tasks shows that our SSSVDDs consistently outperform baseline methods in relevant learning settings.
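As a rough illustration of the hypersphere idea in the abstract, the following is a minimal sketch of the plain, unsupervised SVDD baseline only, not the SSSVDD proposed in the paper. It assumes a linear kernel and an unconstrained hinge-style objective, R^2 + C * sum_i max(0, ||x_i - c||^2 - R^2), minimized by simple subgradient descent. The function names, the values of C and the step size, and the boundary-distance query rule are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def fit_svdd(X, C=0.1, lr=0.01, n_iter=2000):
    """Fit a ball (center c, squared radius r2) by subgradient descent.

    Objective (an assumed primal form with a linear kernel):
        r2 + C * sum_i max(0, ||x_i - c||^2 - r2)
    """
    c = X.mean(axis=0)                          # start at the data mean
    r2 = np.mean(np.sum((X - c) ** 2, axis=1))  # start at mean squared distance
    for _ in range(n_iter):
        diff = X - c
        d2 = np.sum(diff ** 2, axis=1)
        outside = d2 > r2                       # points currently outside the ball
        grad_r2 = 1.0 - C * outside.sum()
        grad_c = -2.0 * C * diff[outside].sum(axis=0)
        r2 = max(r2 - lr * grad_r2, 0.0)        # keep the squared radius non-negative
        c = c - lr * grad_c
    return c, r2


def anomaly_score(X, c, r2):
    """Positive scores lie outside the ball, negative scores inside."""
    return np.sum((X - c) ** 2, axis=1) - r2


def query_closest_to_boundary(X, c, r2, k=5):
    """Hypothetical query rule: pick the k points nearest the ball's surface."""
    return np.argsort(np.abs(anomaly_score(X, c, r2)))[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))               # synthetic "normal" data
    c, r2 = fit_svdd(X)
    probes = np.array([[8.0, 8.0], [0.0, 0.0]])
    print(anomaly_score(probes, c, r2))         # far point vs. central point
    print(query_closest_to_boundary(X, c, r2))  # candidate points to label
```

With this objective, the squared radius settles where roughly 1/C points lie outside the ball, so C trades off ball size against the number of tolerated outliers. Treating the points nearest the boundary as "low-confidence" is one plausible reading of the active learning strategy; the paper's actual query criterion may differ.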

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part I
Editors: Wray Buntine, Marko Grobelnik, Dunja Mladenic, John Shawe-Taylor
Number of pages: 16
Place of Publication: Berlin, Heidelberg
Publisher: Springer Verlag
Publication date: 01.07.2009
Pages: 407-422
ISBN (print): 978-3-642-04179-2
ISBN (electronic): 978-3-642-04180-8
DOIs
Publication status: Published - 01.07.2009
Externally published: Yes
Event: European Conference on Machine Learning and Knowledge Discovery in Databases - 2009 - Bled, Slovenia
Duration: 07.09.2009 - 11.09.2009
https://www.k4all.org/event/european-conference-on-machine-learning-and-principles-and-practice-of-knowledge-discovery-in-databases/

    Research areas

  • Informatics - Active learning, Background knowledge, Baseline methods, Continuous problems, Data domain description, Empirical evaluations, Gradient based, Learning settings, Network intrusion detection, Optimization problems, Semi-supervised learning, Support vector domain description, Unlabeled data
  • Business informatics
