Active and semi-supervised data domain description

Nico Görnitz; Marius Kloft; Ulf Brefeld

doi:10.1007/978-3-642-04180-8_44

Publication: Contributions to collected editions › Articles in conference proceedings › Research › peer-reviewed

Standard

Active and semi-supervised data domain description. / Görnitz, Nico; Kloft, Marius; Brefeld, Ulf.
Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part I. ed. / Wray Buntine; Marko Grobelnik; Dunja Mladenic; John Shawe-Taylor. Berlin, Heidelberg: Springer Verlag, 2009. pp. 407-422 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5781 LNAI, No. PART 1).

Harvard

Görnitz, N, Kloft, M & Brefeld, U 2009, Active and semi-supervised data domain description. in W Buntine, M Grobelnik, D Mladenic & J Shawe-Taylor (eds), Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part I. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 1, vol. 5781 LNAI, Springer Verlag, Berlin, Heidelberg, pp. 407-422, European Conference on Machine Learning and Knowledge Discovery in Databases - 2009, Bled, Slovenia, 07.09.09. https://doi.org/10.1007/978-3-642-04180-8_44

APA

Görnitz, N., Kloft, M., & Brefeld, U. (2009). Active and semi-supervised data domain description. In W. Buntine, M. Grobelnik, D. Mladenic, & J. Shawe-Taylor (Eds.), Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part I (pp. 407-422). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5781 LNAI, No. PART 1). Springer Verlag. https://doi.org/10.1007/978-3-642-04180-8_44

Vancouver

Görnitz N, Kloft M, Brefeld U. Active and semi-supervised data domain description. In Buntine W, Grobelnik M, Mladenic D, Shawe-Taylor J, editors, Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2009, Bled, Slovenia, September 7-11, 2009, Proceedings, Part I. Berlin, Heidelberg: Springer Verlag. 2009. p. 407-422. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 1). doi: 10.1007/978-3-642-04180-8_44

Bibtex

@inbook{ca1a8bdc8dc34c3f8a7703bc2b43d1f7,

title = "Active and semi-supervised data domain description",

abstract = "Data domain description techniques aim at deriving concise descriptions of objects belonging to a category of interest. For instance, the support vector domain description (SVDD) learns a hypersphere enclosing the bulk of provided unlabeled data such that points lying outside of the ball are considered anomalous. However, relevant information such as expert and background knowledge remains unused in the unsupervised setting. In this paper, we rephrase data domain description as a semi-supervised learning task, that is, we propose a semi-supervised generalization of data domain description (SSSVDD) to process unlabeled and labeled examples. The corresponding optimization problem is non-convex. We translate it into an unconstrained, continuous problem that can be optimized accurately by gradient-based techniques. Furthermore, we devise an effective active learning strategy to query low-confidence observations. Our empirical evaluation on network intrusion detection and object recognition tasks shows that our SSSVDDs consistently outperform baseline methods in relevant learning settings.",

keywords = "Informatics, Active Learning, Background knowledge, Baseline methods, Continuous problems, Data domain description, Empirical evaluations, Gradient based, Learning settings, Network intrusion detection, Optimization problems, Semi-supervised learning, Support vector domain description, Unlabeled data, Business informatics",

author = "Nico G{\"o}rnitz and Marius Kloft and Ulf Brefeld",

year = "2009",

month = jul,

day = "1",

doi = "10.1007/978-3-642-04180-8_44",

language = "English",

isbn = "978-3-642-04179-2",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Verlag",

number = "PART 1",

pages = "407--422",

editor = "Wray Buntine and Marko Grobelnik and Dunja Mladenic and John Shawe-Taylor",

booktitle = "Machine Learning and Knowledge Discovery in Databases",

address = "Germany",

note = "European Conference on Machine Learning and Knowledge Discovery in Databases - 2009, ECML-PKDD ; Conference date: 07-09-2009 Through 11-09-2009",

url = "https://www.k4all.org/event/european-conference-on-machine-learning-and-principles-and-practice-of-knowledge-discovery-in-databases/",

}

RIS

TY - CHAP

T1 - Active and semi-supervised data domain description

AU - Görnitz, Nico

AU - Kloft, Marius

AU - Brefeld, Ulf

PY - 2009/7/1

Y1 - 2009/7/1

N2 - Data domain description techniques aim at deriving concise descriptions of objects belonging to a category of interest. For instance, the support vector domain description (SVDD) learns a hypersphere enclosing the bulk of provided unlabeled data such that points lying outside of the ball are considered anomalous. However, relevant information such as expert and background knowledge remains unused in the unsupervised setting. In this paper, we rephrase data domain description as a semi-supervised learning task, that is, we propose a semi-supervised generalization of data domain description (SSSVDD) to process unlabeled and labeled examples. The corresponding optimization problem is non-convex. We translate it into an unconstrained, continuous problem that can be optimized accurately by gradient-based techniques. Furthermore, we devise an effective active learning strategy to query low-confidence observations. Our empirical evaluation on network intrusion detection and object recognition tasks shows that our SSSVDDs consistently outperform baseline methods in relevant learning settings.

AB - Data domain description techniques aim at deriving concise descriptions of objects belonging to a category of interest. For instance, the support vector domain description (SVDD) learns a hypersphere enclosing the bulk of provided unlabeled data such that points lying outside of the ball are considered anomalous. However, relevant information such as expert and background knowledge remains unused in the unsupervised setting. In this paper, we rephrase data domain description as a semi-supervised learning task, that is, we propose a semi-supervised generalization of data domain description (SSSVDD) to process unlabeled and labeled examples. The corresponding optimization problem is non-convex. We translate it into an unconstrained, continuous problem that can be optimized accurately by gradient-based techniques. Furthermore, we devise an effective active learning strategy to query low-confidence observations. Our empirical evaluation on network intrusion detection and object recognition tasks shows that our SSSVDDs consistently outperform baseline methods in relevant learning settings.

KW - Informatics

KW - Active Learning

KW - Background knowledge

KW - Baseline methods

KW - Continuous problems

KW - Data domain description

KW - Empirical evaluations

KW - Gradient based

KW - Learning settings

KW - Network intrusion detection

KW - Optimization problems

KW - Semi-supervised learning

KW - Support vector domain description

KW - Unlabeled data

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=70350627210&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-04180-8_44

DO - 10.1007/978-3-642-04180-8_44

M3 - Article in conference proceedings

AN - SCOPUS:70350627210

SN - 978-3-642-04179-2

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 407

EP - 422

BT - Machine Learning and Knowledge Discovery in Databases

A2 - Buntine, Wray

A2 - Grobelnik, Marko

A2 - Mladenic, Dunja

A2 - Shawe-Taylor, John

PB - Springer Verlag

CY - Berlin, Heidelberg

T2 - European Conference on Machine Learning and Knowledge Discovery in Databases - 2009

Y2 - 7 September 2009 through 11 September 2009

ER -

Further publications by these authors

Interactive sequential generative models for team sports

Fassmeyer, D., Cordes, M. & Brefeld, U., Feb 2025, in: Machine Learning. 114, 2, 15 pp., 38.

Publication: Journal contributions › Journal articles › Research › peer-reviewed

Joint Item Response Models for Manual and Automatic Scores on Open-Ended Test Items

Bengs, D., Brefeld, U., Kroehne, U. & Zehner, F., 2025, (Accepted/In press) in: Psychometrika.

Publication: Journal contributions › Journal articles › Research › peer-reviewed

Machine Learning and Data Mining for Sports Analytics: 11th International Workshop, MLSA 2024, Vilnius, Lithuania, September 9, 2024, Revised Selected Papers

Brefeld, U. (Editor), Davis, J. (Editor), Van Haaren, J. (Editor) & Zimmermann, A. (Editor), 2025, Cham: Springer Verlag. 119 pp. (Communications in Computer and Information Science; Vol. 2460)

Publication: Books and anthologies › Conference proceedings › Research

Masked autoencoder for multiagent trajectories

Rudolph, Y. & Brefeld, U., Feb 2025, in: Machine Learning. 114, 2, 18 pp., 44.

Publication: Journal contributions › Journal articles › Research › peer-reviewed

Self-improvement for Computerized Adaptive Testing

Rudolph, Y., Neubauer, K. & Brefeld, U., 2026, Machine Learning and Knowledge Discovery in Databases - Research Track: European Conference, ECML PKDD 2025, Porto, Portugal, September 15–19, 2025, Proceedings. Ribeiro, R. P., Jorge, A. M., Soares, C., Gama, J., Pfahringer, B., Japkowicz, N., Larrañaga, P. & Abreu, P. H. (eds.). Cham: Springer International Publishing, Vol. 2. pp. 70-86, 17 pp. (Lecture Notes in Computer Science; Vol. 16014 LNCS).

Publication: Contributions to collected editions › Articles in conference proceedings › Research › peer-reviewed

DOI

https://doi.org/10.1007/978-3-642-04180-8_44
Final published version