Proxy Indicators for the Quality of Open-domain Dialogues

Rostislav Nedelchev; Jens Lehmann; Ricardo Usbeck

doi:10.18653/v1/2021.emnlp-main.618

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

Proxy Indicators for the Quality of Open-domain Dialogues. / Nedelchev, Rostislav; Lehmann, Jens; Usbeck, Ricardo.
EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings. ed. / Marie-Francine Moens; Xuanjing Huang; Lucia Specia; Scott Wen-tau Yih. Association for Computational Linguistics (ACL), 2021. p. 7834-7855 (EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings).

Harvard

Nedelchev, R, Lehmann, J & Usbeck, R 2021, Proxy Indicators for the Quality of Open-domain Dialogues. in M-F Moens, X Huang, L Specia & S Wen-tau Yih (eds), EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings. EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings, Association for Computational Linguistics (ACL), pp. 7834-7855, 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Punta Cana, Dominican Republic, 07.11.21. https://doi.org/10.18653/v1/2021.emnlp-main.618

APA

Nedelchev, R., Lehmann, J., & Usbeck, R. (2021). Proxy Indicators for the Quality of Open-domain Dialogues. In M.-F. Moens, X. Huang, L. Specia, & S. Wen-tau Yih (Eds.), EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 7834-7855). (EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.emnlp-main.618

Vancouver

Nedelchev R, Lehmann J, Usbeck R. Proxy Indicators for the Quality of Open-domain Dialogues. In Moens MF, Huang X, Specia L, Wen-tau Yih S, editors, EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings. Association for Computational Linguistics (ACL). 2021. p. 7834-7855. (EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings). doi: 10.18653/v1/2021.emnlp-main.618

Bibtex

@inbook{2acc49ff1cb64885973f6c6f152d46bb,

title = "Proxy Indicators for the Quality of Open-domain Dialogues",

abstract = "The automatic evaluation of open-domain dialogues remains a largely unsolved challenge. Thus, despite the abundance of work done in the field, human judges have to evaluate dialogues' quality. As a consequence, performing such evaluations at scale is usually expensive. This work investigates using a deep-learning model trained on the General Language Understanding Evaluation (GLUE) benchmark to serve as a quality indication of open-domain dialogues. The aim is to use the various GLUE tasks as different perspectives on judging the quality of conversation, thus reducing the need for additional training data or responses that serve as quality references. Due to this nature, the method can infer various quality metrics and derive a component-based overall score. We achieve statistically significant correlation coefficients of up to 0.7.",

keywords = "Informatics, Business informatics",

author = "Rostislav Nedelchev and Jens Lehmann and Ricardo Usbeck",

note = "Publisher Copyright: {\textcopyright} 2021 Association for Computational Linguistics; 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021 ; Conference date: 07-11-2021 Through 11-11-2021",

year = "2021",

month = jan,

day = "1",

doi = "10.18653/v1/2021.emnlp-main.618",

language = "English",

series = "EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings",

publisher = "Association for Computational Linguistics (ACL)",

pages = "7834--7855",

editor = "Marie-Francine Moens and Xuanjing Huang and Lucia Specia and {Wen-tau Yih}, Scott",

booktitle = "EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings",

address = "United States",

url = "https://2021.emnlp.org",

}

RIS

TY - CHAP

T1 - Proxy Indicators for the Quality of Open-domain Dialogues

AU - Nedelchev, Rostislav

AU - Lehmann, Jens

AU - Usbeck, Ricardo

PY - 2021/1/1

Y1 - 2021/1/1

N2 - The automatic evaluation of open-domain dialogues remains a largely unsolved challenge. Thus, despite the abundance of work done in the field, human judges have to evaluate dialogues' quality. As a consequence, performing such evaluations at scale is usually expensive. This work investigates using a deep-learning model trained on the General Language Understanding Evaluation (GLUE) benchmark to serve as a quality indication of open-domain dialogues. The aim is to use the various GLUE tasks as different perspectives on judging the quality of conversation, thus reducing the need for additional training data or responses that serve as quality references. Due to this nature, the method can infer various quality metrics and derive a component-based overall score. We achieve statistically significant correlation coefficients of up to 0.7.

AB - The automatic evaluation of open-domain dialogues remains a largely unsolved challenge. Thus, despite the abundance of work done in the field, human judges have to evaluate dialogues' quality. As a consequence, performing such evaluations at scale is usually expensive. This work investigates using a deep-learning model trained on the General Language Understanding Evaluation (GLUE) benchmark to serve as a quality indication of open-domain dialogues. The aim is to use the various GLUE tasks as different perspectives on judging the quality of conversation, thus reducing the need for additional training data or responses that serve as quality references. Due to this nature, the method can infer various quality metrics and derive a component-based overall score. We achieve statistically significant correlation coefficients of up to 0.7.

KW - Informatics

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=85127432288&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/87ca9f87-497d-31ee-9b3d-c0876c35cb07/

U2 - 10.18653/v1/2021.emnlp-main.618

DO - 10.18653/v1/2021.emnlp-main.618

M3 - Article in conference proceedings

AN - SCOPUS:85127432288

T3 - EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings

SP - 7834

EP - 7855

BT - EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings

A2 - Moens, Marie-Francine

A2 - Huang, Xuanjing

A2 - Specia, Lucia

A2 - Wen-tau Yih, Scott

PB - Association for Computational Linguistics (ACL)

T2 - 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021

Y2 - 7 November 2021 through 11 November 2021

ER -

Other publications by the same author(s)

ShortPathQA: A Dataset for Controllable Fusion of Large Language Models with Knowledge Graphs

Salnikov, M., Sakhovskiy, A., Nikishina, I., Usmanova, A., Kraft, A., Möller, C., Banerjee, D., Huang, J., Jiang, L., Abdullah, R., Yan, X., Tutubalina, E., Usbeck, R. & Panchenko, A., 2026, Natural Language Processing and Information Systems: 30th International Conference on Applications of Natural Language to Information Systems, NLDB 2025, Proceedings. Ichise, R. (ed.). Springer Science and Business Media Deutschland, p. 95-110 16 p. (Lecture Notes in Computer Science; vol. 15836 LNCS).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Analyzing the Influence of Knowledge Graph Information on Relation Extraction

Möller, C. & Usbeck, R., 2025, The Semantic Web: 22nd European Semantic Web Conference, ESWC 2025 Portoroz, Slovenia, June 1–5, 2025 Proceedings, Part I. Curry, E., Acosta, M., Poveda-Villalón, M., van Erp, M., Ojo, A., Hose, K., Shimizu, C. & Lisena, P. (eds.). Cham: Springer Nature Switzerland AG, Vol. 1. p. 460-480 21 p. (Lecture Notes in Computer Science ; vol. 15718).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

ASK-DBLP: Answering Questions over DBLP

Taffa, T., Neises, P., Ollinger, S., Westphal, P., Ackermann, M. R., Banerjee, D. & Usbeck, R., 02.11.2025, ISWC-C 2025, Industry, Doctoral Consortium, Posters and Demos at ISWC 2025: Joint Proceedings of Industry, Doctoral Consortium, Posters and Demos of the 24th International Semantic Web Conference (ISWC-C 2025), ISWC 2025 Companion Volume. Celino, I., Hassanzadeh, O., Bernstein, A., Noy, N., Cheng, G., Wang, S., Ferrada, S., Soulard, T., Kozaki, K., Takeda, H. & Gentile, A. L. (eds.). Aachen: Sun Site Central Europe (RWTH Aachen University), p. 435-440 6 p. D13. (CEUR Workshop Proceedings; vol. 4085).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Automating SPARQL Query Translations between DBpedia and Wikidata

Bartels, M. C., Banerjee, D. & Usbeck, R., 14.07.2025, Linking Meaning: Semantic Technologies Shaping the Future of AI: Cover 74617 Proceedings of the 21st International Conference on Semantic Systems, 3-5 September 2025, Vienna, Austria. Spahiu, B., Vahdati, S., Salatino, A., Pellegrini, T. & Havur, G. (eds.). IOS Press BV, p. 176-193 18 p. (Studies on the Semantic Web; vol. 62).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research

Best Practices in AI and Data Science Models Evaluation

Banerjee, D., Taffa, T. A. & Usbeck, R., 2025, INFORMATIK 2025 : The Wide Open - Offenheit von Source bis Science, 16.-19.September 2025 Potsdam. Lucke, U., Stieglitz, S., Uebernickel, F., Lamprecht, A.-L. & Klein, M. (eds.). Bonn: Gesellschaft für Informatik, Bonn, p. 1211-1219 9 p. (Lecture Notes in Informatics; vol. P366).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

DOI

https://doi.org/10.18653/v1/2021.emnlp-main.618
Final published version
