Revisiting Supervised Contrastive Learning for Microblog Classification

Junbo Huang; Ricardo Usbeck

doi:10.18653/v1/2024.emnlp-main.876

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

Revisiting Supervised Contrastive Learning for Microblog Classification. / Huang, Junbo; Usbeck, Ricardo.
The 2024 Conference on Empirical Methods in Natural Language Processing: Proceedings of the Conference; November 12-16, 2024. ed. / Yaser Al-Onaizan; Mohit Bansal; Yun-Nung Chen. Kerrville: Association for Computational Linguistics, 2024. p. 15644-15653.

Harvard

Huang, J & Usbeck, R 2024, Revisiting Supervised Contrastive Learning for Microblog Classification. in Y Al-Onaizan, M Bansal & Y-N Chen (eds), The 2024 Conference on Empirical Methods in Natural Language Processing: Proceedings of the Conference; November 12-16, 2024. Association for Computational Linguistics, Kerrville, pp. 15644-15653, Conference on Empirical Methods in Natural Language Processing - EMNLP 2024, Miami, Florida, United States, 12.11.24. https://doi.org/10.18653/v1/2024.emnlp-main.876

APA

Huang, J., & Usbeck, R. (2024). Revisiting Supervised Contrastive Learning for Microblog Classification. In Y. Al-Onaizan, M. Bansal, & Y.-N. Chen (Eds.), The 2024 Conference on Empirical Methods in Natural Language Processing: Proceedings of the Conference; November 12-16, 2024 (pp. 15644-15653). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.emnlp-main.876

Vancouver

Huang J, Usbeck R. Revisiting Supervised Contrastive Learning for Microblog Classification. In Al-Onaizan Y, Bansal M, Chen YN, editors, The 2024 Conference on Empirical Methods in Natural Language Processing: Proceedings of the Conference; November 12-16, 2024. Kerrville: Association for Computational Linguistics. 2024. p. 15644-15653. doi: 10.18653/v1/2024.emnlp-main.876

Bibtex

@inbook{6e0b577dd9a84fdab4ee6979e6293c3b,

title = "Revisiting Supervised Contrastive Learning for Microblog Classification",

abstract = "Microblog content (e.g., Tweets) is noisy due to its informal use of language and its lack of contextual information within each post. To tackle these challenges, state-of-the-art microblog classification models rely on pre-training language models (LMs). However, pre-training dedicated LMs is resource-intensive and not suitable for small labs. Supervised contrastive learning (SCL) has shown its effectiveness with small, available resources. In this work, we examine the effectiveness of fine-tuning transformer-based language models, regularized with a SCL loss for English microblog classification. Despite its simplicity, the evaluation on two English microblog classification benchmarks (TweetEval and Tweet Topic Classification) shows an improvement over baseline models. The result shows that, across all subtasks, our proposed method has a performance gain of up to 11.9 percentage points. All our models are open source.",

keywords = "Business informatics",

author = "Junbo Huang and Ricardo Usbeck",

note = "Publisher Copyright: {\textcopyright} 2024 Association for Computational Linguistics.; Conference on Empirical Methods in Natural Language Processing - EMNLP 2024, EMNLP 2024 ; Conference date: 12-11-2024 Through 16-11-2024",

year = "2024",

doi = "10.18653/v1/2024.emnlp-main.876",

language = "English",

pages = "15644--15653",

editor = "Yaser Al-Onaizan and Mohit Bansal and Yun-Nung Chen",

booktitle = "The 2024 Conference on Empirical Methods in Natural Language Processing",

publisher = "Association for Computational Linguistics",

address = "United States",

url = "https://2024.emnlp.org/",

}

RIS

TY - CHAP

T1 - Revisiting Supervised Contrastive Learning for Microblog Classification

AU - Huang, Junbo

AU - Usbeck, Ricardo

N1 - Conference code: 29

PY - 2024

Y1 - 2024

AB - Microblog content (e.g., Tweets) is noisy due to its informal use of language and its lack of contextual information within each post. To tackle these challenges, state-of-the-art microblog classification models rely on pre-training language models (LMs). However, pre-training dedicated LMs is resource-intensive and not suitable for small labs. Supervised contrastive learning (SCL) has shown its effectiveness with small, available resources. In this work, we examine the effectiveness of fine-tuning transformer-based language models, regularized with a SCL loss for English microblog classification. Despite its simplicity, the evaluation on two English microblog classification benchmarks (TweetEval and Tweet Topic Classification) shows an improvement over baseline models. The result shows that, across all subtasks, our proposed method has a performance gain of up to 11.9 percentage points. All our models are open source.

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=85217771584&partnerID=8YFLogxK

U2 - 10.18653/v1/2024.emnlp-main.876

DO - 10.18653/v1/2024.emnlp-main.876

M3 - Article in conference proceedings

SP - 15644

EP - 15653

BT - The 2024 Conference on Empirical Methods in Natural Language Processing

A2 - Al-Onaizan, Yaser

A2 - Bansal, Mohit

A2 - Chen, Yun-Nung

PB - Association for Computational Linguistics

CY - Kerrville

T2 - Conference on Empirical Methods in Natural Language Processing - EMNLP 2024

Y2 - 12 November 2024 through 16 November 2024

ER -

Other publications by the same author(s)

ShortPathQA: A Dataset for Controllable Fusion of Large Language Models with Knowledge Graphs

Salnikov, M., Sakhovskiy, A., Nikishina, I., Usmanova, A., Kraft, A., Möller, C., Banerjee, D., Huang, J., Jiang, L., Abdullah, R., Yan, X., Tutubalina, E., Usbeck, R. & Panchenko, A., 2026, Natural Language Processing and Information Systems: 30th International Conference on Applications of Natural Language to Information Systems, NLDB 2025, Proceedings. Ichise, R. (ed.). Springer Science and Business Media Deutschland, p. 95-110 16 p. (Lecture Notes in Computer Science; vol. 15836 LNCS).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Analyzing the Influence of Knowledge Graph Information on Relation Extraction

Möller, C. & Usbeck, R., 2025, The Semantic Web: 22nd European Semantic Web Conference, ESWC 2025 Portoroz, Slovenia, June 1–5, 2025 Proceedings, Part I. Curry, E., Acosta, M., Poveda-Villalón, M., van Erp, M., Ojo, A., Hose, K., Shimizu, C. & Lisena, P. (eds.). Cham: Springer Nature Switzerland AG, Vol. 1. p. 460-480 21 p. (Lecture Notes in Computer Science ; vol. 15718).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Automating SPARQL Query Translations between DBpedia and Wikidata

Bartels, M. C., Banerjee, D. & Usbeck, R., 14.07.2025, SEMANTiCS Conference 2025.

Research output: Contributions to collected editions/works › Article in conference proceedings › Research

Bridge-Generate: Scholarly Hybrid Question Answering

Taffa, T. A. & Usbeck, R., 23.05.2025, WWW Companion 2025 - Companion Proceedings of the ACM Web Conference 2025: Companion Proceedings of the ACM Web Conference 2025, April 28-May 2, 2025 Sydney, NSW, Australia. Long, G., Blumenstein, M., Chang, Y., Lewin-Eytan, L., Huang, H. & Yom-Tov, E. (eds.). New York: Association for Computing Machinery, Inc, p. 1321-1325 5 p.

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Junior fellows and distinguished dissertation of the GI and AI for crisis

Usbeck, R., Kraft, A. & Westphal, P., 01.02.2025, In: IT - Information Technology. 67, 1, p. 1-2 2 p.

Research output: Journal contributions › Other (editorial matter etc.) › Research

DOI

https://doi.org/10.18653/v1/2024.emnlp-main.876
Final published version
