Proxy Indicators for the Quality of Open-domain Dialogues

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

The automatic evaluation of open-domain dialogues remains a largely unsolved challenge. Despite the abundance of work in the field, human judges still have to evaluate dialogue quality, so performing such evaluations at scale is usually expensive. This work investigates using a deep-learning model trained on the General Language Understanding Evaluation (GLUE) benchmark as a quality indicator for open-domain dialogues. The aim is to use the various GLUE tasks as different perspectives for judging the quality of a conversation, thus reducing the need for additional training data or for responses that serve as quality references. Because of this design, the method can infer various quality metrics and derive a component-based overall score. We achieve statistically significant correlation coefficients of up to 0.7.

Original language: English
Title of host publication: EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Number of pages: 22
Publisher: Association for Computational Linguistics (ACL)
Publication date: 01.01.2021
Pages: 7834-7855
ISBN (electronic): 9781955917094
DOIs
Publication status: Published - 01.01.2021
Externally published: Yes
Event: 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021 - ONLINE, Punta Cana, Dominican Republic
Duration: 07.11.2021 - 11.11.2021
https://2021.emnlp.org

Bibliographical note

Publisher Copyright:
© 2021 Association for Computational Linguistics
