Proxy Indicators for the Quality of Open-domain Dialogues

Publication: Contribution to collected works, Article in conference proceedings, Research, peer-reviewed

The automatic evaluation of open-domain dialogues remains a largely unsolved challenge. Thus, despite the abundance of work done in the field, human judges still have to evaluate the quality of dialogues. As a consequence, performing such evaluations at scale is usually expensive. This work investigates using a deep-learning model trained on the General Language Understanding Evaluation (GLUE) benchmark as a quality indicator for open-domain dialogues. The aim is to use the various GLUE tasks as different perspectives for judging the quality of a conversation, thus reducing the need for additional training data or for responses that serve as quality references. Because of this design, the method can infer various quality metrics and derive a component-based overall score. We achieve statistically significant correlation coefficients of up to 0.7.

Original language: English
Title: EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Number of pages: 22
Publisher: Association for Computational Linguistics (ACL)
Publication date: 01.01.2021
Pages: 7834-7855
ISBN (electronic): 9781955917094
Publication status: Published - 01.01.2021
Externally published: Yes
Event: 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021 - ONLINE, Punta Cana, Dominican Republic
Duration: 07.11.2021 to 11.11.2021
https://2021.emnlp.org

Bibliographical note

Publisher Copyright:
© 2021 Association for Computational Linguistics
