Language Model Transformers as Evaluators for Open-domain Dialogues

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Standard

Language Model Transformers as Evaluators for Open-domain Dialogues. / Nedelchev, Rostislav; Lehmann, Jens; Usbeck, Ricardo.
COLING 2020 - 28th International Conference on Computational Linguistics: Proceedings of the Conference. ed. / Donia Scott; Nuria Bel; Chengqing Zong. Association for Computational Linguistics (ACL), 2020. p. 6797-6808 (COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference).

Harvard

Nedelchev, R, Lehmann, J & Usbeck, R 2020, Language Model Transformers as Evaluators for Open-domain Dialogues. in D Scott, N Bel & C Zong (eds), COLING 2020 - 28th International Conference on Computational Linguistics: Proceedings of the Conference. COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference, Association for Computational Linguistics (ACL), pp. 6797-6808, 28th International Conference on Computational Linguistics, COLING 2020, Virtual, Online, Spain, 08.12.20. https://doi.org/10.18653/v1/2020.coling-main.599

APA

Nedelchev, R., Lehmann, J., & Usbeck, R. (2020). Language Model Transformers as Evaluators for Open-domain Dialogues. In D. Scott, N. Bel, & C. Zong (Eds.), COLING 2020 - 28th International Conference on Computational Linguistics: Proceedings of the Conference (pp. 6797-6808). (COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.599

Vancouver

Nedelchev R, Lehmann J, Usbeck R. Language Model Transformers as Evaluators for Open-domain Dialogues. In Scott D, Bel N, Zong C, editors, COLING 2020 - 28th International Conference on Computational Linguistics: Proceedings of the Conference. Association for Computational Linguistics (ACL). 2020. p. 6797-6808. (COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference). doi: 10.18653/v1/2020.coling-main.599

Bibtex

@inbook{060baa868fe74263b7f5495df8027644,
title = "Language Model Transformers as Evaluators for Open-domain Dialogues",
abstract = "Computer-based systems for communication with humans are a cornerstone of AI research since the 1950s. So far, the most effective way to assess the quality of the dialogues produced by these systems is to use resource-intensive manual labor instead of automated means. In this work, we investigate whether language models (LM) based on transformer neural networks can indicate the quality of a conversation. In a general sense, language models are methods that learn to predict one or more words based on an already given context. Due to their unsupervised nature, they are candidates for efficient, automatic indication of dialogue quality. We demonstrate that human evaluators have a positive correlation between the output of the language models and scores. We also provide some insights into their behavior and inner-working in a conversational context.",
keywords = "Informatics, Business informatics",
author = "Rostislav Nedelchev and Jens Lehmann and Ricardo Usbeck",
note = "We acknowledge the support of the EU projects Cleopatra (GA 812997) and TAILOR (GA 952215), the Federal Ministry for Economic Affairs and Energy (BMWi) project SPEAKER (FKZ 01MK20011A), the German Federal Ministry of Education and Research (BMBF) projects and excellence clusters ML2R (FKZ 01 15 18038 A/B/C), MLwin (01S18050 D/F), ScaDS.AI (01/S18026A) as well as the Fraunhofer Zukunftsstiftung project JOSEPH. Publisher Copyright: {\textcopyright} 2020 COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference. All rights reserved.; 28th International Conference on Computational Linguistics, COLING 2020 ; Conference date: 08-12-2020 Through 13-12-2020",
year = "2020",
month = jan,
day = "1",
doi = "10.18653/v1/2020.coling-main.599",
language = "English",
series = "COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference",
publisher = "Association for Computational Linguistics (ACL)",
pages = "6797--6808",
editor = "Donia Scott and Nuria Bel and Chengqing Zong",
booktitle = "COLING 2020 - 28th International Conference on Computational Linguistics",
address = "United States",
url = "https://coling2020.org, https://coling2020.org/COLING2020programme.pdf",

}

RIS

TY - CHAP

T1 - Language Model Transformers as Evaluators for Open-domain Dialogues

AU - Nedelchev, Rostislav

AU - Lehmann, Jens

AU - Usbeck, Ricardo

N1 - We acknowledge the support of the EU projects Cleopatra (GA 812997) and TAILOR (GA 952215), the Federal Ministry for Economic Affairs and Energy (BMWi) project SPEAKER (FKZ 01MK20011A), the German Federal Ministry of Education and Research (BMBF) projects and excellence clusters ML2R (FKZ 01 15 18038 A/B/C), MLwin (01S18050 D/F), ScaDS.AI (01/S18026A) as well as the Fraunhofer Zukunftsstiftung project JOSEPH. Publisher Copyright: © 2020 COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference. All rights reserved.

PY - 2020/1/1

Y1 - 2020/1/1

N2 - Computer-based systems for communication with humans are a cornerstone of AI research since the 1950s. So far, the most effective way to assess the quality of the dialogues produced by these systems is to use resource-intensive manual labor instead of automated means. In this work, we investigate whether language models (LM) based on transformer neural networks can indicate the quality of a conversation. In a general sense, language models are methods that learn to predict one or more words based on an already given context. Due to their unsupervised nature, they are candidates for efficient, automatic indication of dialogue quality. We demonstrate that human evaluators have a positive correlation between the output of the language models and scores. We also provide some insights into their behavior and inner-working in a conversational context.

AB - Computer-based systems for communication with humans are a cornerstone of AI research since the 1950s. So far, the most effective way to assess the quality of the dialogues produced by these systems is to use resource-intensive manual labor instead of automated means. In this work, we investigate whether language models (LM) based on transformer neural networks can indicate the quality of a conversation. In a general sense, language models are methods that learn to predict one or more words based on an already given context. Due to their unsupervised nature, they are candidates for efficient, automatic indication of dialogue quality. We demonstrate that human evaluators have a positive correlation between the output of the language models and scores. We also provide some insights into their behavior and inner-working in a conversational context.

KW - Informatics

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=85108285068&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/0f9694bb-370d-3c37-bb25-8347d9aac64a/

U2 - 10.18653/v1/2020.coling-main.599

DO - 10.18653/v1/2020.coling-main.599

M3 - Article in conference proceedings

AN - SCOPUS:85108285068

T3 - COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference

SP - 6797

EP - 6808

BT - COLING 2020 - 28th International Conference on Computational Linguistics

A2 - Scott, Donia

A2 - Bel, Nuria

A2 - Zong, Chengqing

PB - Association for Computational Linguistics (ACL)

T2 - 28th International Conference on Computational Linguistics, COLING 2020

Y2 - 8 December 2020 through 13 December 2020

ER -
