Learning from partially annotated sequences

Eraldo R. Fernandes; Ulf Brefeld

doi:10.1007/978-3-642-23780-5_36

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Abstract

We study sequential prediction models in cases where only fragments of the sequences are annotated with the ground truth. The task does not match the standard semi-supervised setting and is highly relevant in areas such as natural language processing, where completely labeled instances are expensive and require editorial data. We propose to generalize the semi-supervised setting and devise a simple transductive loss-augmented perceptron to learn from inexpensive partially annotated sequences that could, for instance, be provided by laymen, the wisdom of the crowd, or even automatically. Experiments on mono- and cross-lingual named entity recognition tasks with automatically generated partially annotated sentences from Wikipedia demonstrate the effectiveness of the proposed approach. Our results show that learning from partially labeled data is never worse than standard supervised and semi-supervised approaches trained on data with the same ratio of labeled and unlabeled tokens.
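
To make the notion of a partially annotated sequence concrete, the following is a minimal sketch of decoding under partial annotation: a Viterbi search that is forced to agree with the tokens that carry a label and is free elsewhere. It is an illustration only, not the authors' implementation; the function name `constrained_viterbi`, the score tables `emit` and `trans`, and the use of `None` for unlabeled tokens are assumptions made for this example.

```python
from typing import List, Optional, Sequence


def constrained_viterbi(
    emit: Sequence[Sequence[float]],
    trans: Sequence[Sequence[float]],
    partial: Sequence[Optional[int]],
) -> List[int]:
    """Best label path that agrees with every annotated position.

    emit[t][y]  : score of label y at position t (hypothetical scores)
    trans[a][b] : score of moving from label a to label b
    partial[t]  : observed label index, or None if the token is unlabeled
    """
    n, k = len(emit), len(emit[0])
    neg_inf = float("-inf")

    def allowed(t: int, y: int) -> bool:
        # An annotated token admits only its observed label.
        return partial[t] is None or partial[t] == y

    score = [[neg_inf] * k for _ in range(n)]
    back = [[0] * k for _ in range(n)]

    for y in range(k):
        if allowed(0, y):
            score[0][y] = emit[0][y]

    for t in range(1, n):
        for y in range(k):
            if not allowed(t, y):
                continue
            prev = max(range(k), key=lambda a: score[t - 1][a] + trans[a][y])
            score[t][y] = score[t - 1][prev] + trans[prev][y] + emit[t][y]
            back[t][y] = prev

    # Follow back-pointers from the best final label.
    y = max(range(k), key=lambda b: score[n - 1][b])
    path = [y]
    for t in range(n - 1, 0, -1):
        y = back[t][y]
        path.append(y)
    return path[::-1]


# Toy example with two labels and transitions that favor staying put.
emit = [[0.0, 0.5], [0.0, 0.0], [0.0, 0.0]]
trans = [[1.0, -1.0], [-1.0, 1.0]]
unconstrained = constrained_viterbi(emit, trans, [None, None, None])  # [1, 1, 1]
constrained = constrained_viterbi(emit, trans, [None, None, 0])       # [0, 0, 0]
```

In the toy example, a single annotated token at the end of the sequence changes the labels inferred for the two unlabeled tokens before it, which is the kind of signal a learner can draw from fragmentary annotations.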

Original language	English
Title of host publication	Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2011, Proceedings
Editors	Dimitrios Gunopulos, Thomas Hofmann, Donato Malerba, Michalis Vazirgiannis
Number of pages	16
Place of Publication	Heidelberg, Berlin
Publisher	Springer Verlag
Publication date	2011
Edition	PART 1
Pages	407-422
ISBN (print)	978-3-642-23779-9
ISBN (electronic)	978-3-642-23780-5
DOIs	https://doi.org/10.1007/978-3-642-23780-5_36
Publication status	Published - 2011
Externally published	Yes
Event	European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - ECML PKDD 2011 - Athens, Greece. Duration: 05.09.2011 → 09.09.2011. https://www.ecmlpkdd2011.org/

Research areas

Informatics - Automatically generated, Cross-lingual, Labeled data, Named entity recognition, Natural language processing, Perceptron, Semi-supervised, Sequential prediction, Hidden Markov Model, Unlabeled Data, Neural Information Processing System, Entity Recognition, Annotated Sequence
Business informatics
