Revisiting Supervised Contrastive Learning for Microblog Classification

Publication: Contributions to collected editions, Articles in conference proceedings, Research, Peer-reviewed

Authors

Microblog content (e.g., Tweets) is noisy due to its informal use of language and its lack of contextual information within each post. To tackle these challenges, state-of-the-art microblog classification models rely on pre-training language models (LMs). However, pre-training dedicated LMs is resource-intensive and not suitable for small labs. Supervised contrastive learning (SCL) has shown its effectiveness with small, available resources. In this work, we examine the effectiveness of fine-tuning transformer-based language models regularized with an SCL loss for English microblog classification. Despite its simplicity, the evaluation on two English microblog classification benchmarks (TweetEval and Tweet Topic Classification) shows an improvement over baseline models. The results show that, across all subtasks, our proposed method achieves a performance gain of up to 11.9 percentage points. All our models are open source.
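The abstract does not spell out the loss, so the following is only a minimal sketch of what "regularized with an SCL loss" could look like. It assumes the widely used supervised contrastive formulation, in which same-label examples in a batch are pulled together and all others pushed apart. The function names, the temperature, the weight `lam`, and the additive combination with cross-entropy are all illustrative assumptions, not the authors' settings.

```python
import math

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (assumed standard form)."""
    # L2-normalize so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: other examples in the batch sharing the anchor's label.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled similarity to every other example.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n) if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))
        # Average negative log-probability of the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

def combined_loss(ce_loss, scl_loss, lam=0.5):
    """Hypothetical objective: cross-entropy plus a weighted SCL term."""
    return ce_loss + lam * scl_loss
```

In a fine-tuning loop, `embeddings` would typically be the encoder's sentence representations for a batch and `ce_loss` the usual classification loss; the paper itself should be consulted for the actual weighting and temperature.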
Original language: English
Title: The 2024 Conference on Empirical Methods in Natural Language Processing : Proceedings of the Conference; November 12-16, 2024
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Number of pages: 10
Place of publication: Kerrville
Publisher: Association for Computational Linguistics
Publication date: 2024
Pages: 15644-15653
ISBN (electronic): 979-8-89176-164-3
DOIs
Publication status: Published - 2024
Event: Conference on Empirical Methods in Natural Language Processing - EMNLP 2024 - Hyatt Regency Miami Hotel, Miami, United States
Duration: 12.11.2024 to 16.11.2024
Conference number: 29
https://2024.emnlp.org/