RoMe: A Robust Metric for Evaluating Natural Language Generation

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

  • Md Rashad Al Hasan Rony
  • Liubov Kovriguina
  • Debanjan Chaudhuri
  • Ricardo Usbeck
  • Jens Lehmann

Evaluating Natural Language Generation (NLG) systems is a challenging task. Firstly, the metric should ensure that the generated hypothesis reflects the reference's semantics. Secondly, it should consider the grammatical quality of the generated sentence. Thirdly, it should be robust enough to handle various surface forms of the generated sentence. Thus, an effective evaluation metric has to be multifaceted. In this paper, we propose an automatic evaluation metric incorporating several core aspects of natural language understanding (language competence, syntactic and semantic variation). Our proposed metric, RoMe, is trained on language features such as semantic similarity combined with tree edit distance and grammatical acceptability, using a self-supervised neural network to assess the overall quality of the generated sentence. Moreover, we perform an extensive robustness analysis of the state-of-the-art methods and RoMe. Empirical results suggest that RoMe correlates more strongly with human judgment than state-of-the-art metrics in evaluating system-generated sentences across several NLG tasks.
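As a rough illustration only, the idea of folding several sentence-level features into one quality score might be sketched as follows. Everything in this sketch is an assumption and none of it is taken from the paper: word overlap stands in for semantic similarity, a token-level edit distance stands in for tree edit distance, the grammatical acceptability value is simply passed in as a number, and hand-picked weights (hypothetical) stand in for the trained neural network.

```python
import math


def jaccard_similarity(hyp: str, ref: str) -> float:
    """Word-overlap stand-in for a semantic similarity feature (assumption)."""
    h, r = set(hyp.lower().split()), set(ref.lower().split())
    if not h and not r:
        return 1.0
    return len(h & r) / len(h | r)


def token_edit_distance(hyp: str, ref: str) -> int:
    """Levenshtein distance over tokens, a stand-in for tree edit distance."""
    a, b = hyp.lower().split(), ref.lower().split()
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def combine_features(similarity: float, edit_distance: float,
                     acceptability: float,
                     weights=(2.0, -0.5, 1.0), bias=-1.0) -> float:
    """Logistic combination with hypothetical, hand-picked weights.

    A trained model would learn this mapping; the numbers here are
    placeholders chosen only so the example runs.
    """
    z = (bias + weights[0] * similarity
         + weights[1] * edit_distance
         + weights[2] * acceptability)
    return 1.0 / (1.0 + math.exp(-z))


reference = "the cat sat on the mat"
hypothesis = "the cat sat on a mat"
score = combine_features(
    jaccard_similarity(hypothesis, reference),
    token_edit_distance(hypothesis, reference),
    acceptability=0.9,  # assumed output of some grammaticality classifier
)
print(round(score, 3))
```

Under these stand-ins, a hypothesis that matches the reference more closely and is judged more acceptable receives a higher score. The actual features and model used by RoMe are described in the paper itself.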

Original language: English
Title of host publication: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Number of pages: 13
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2022
Pages: 5645-5657
ISBN (electronic): 9781955917216
Publication status: Published - 2022
Externally published: Yes
Event: 60th Annual Meeting of the Association for Computational Linguistics - ACL 2022 - Convention Centre Dublin & Online, Dublin, Ireland
Duration: 22.05.2022 to 27.05.2022
Conference number: 60
https://www.2022.aclweb.org/

Bibliographical note

Publisher Copyright:
© 2022 Association for Computational Linguistics.
