Ablation Study of a Multimodal GAT Network on Perfect Synthetic and Real-world Data to Investigate the Influence of Language Models in Invoice Recognition

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Document analysis and invoice recognition have been significantly advanced in recent years by grid-based, graph-based and transformer architectures. However, it is not only the model architecture that influences an approach’s results, but also the quality of training and test data. In this paper, we perform an ablation study on an existing state-of-the-art pre-trained multimodal GAT network. Therein we investigate two kinds of modifications to understand the sensitivity of the results by (1) exchanging the language module and (2) applying both the original and modified network on a perfect synthetic and an imperfect real-world dataset. The results of the study show the importance of language modules for semantic embeddings in multimodal invoice recognition and illustrate the impact of data annotation quality. We further contribute an adapted GAT model for German invoices.

Original language: English
Title of host publication: Document Analysis and Recognition – ICDAR 2024 Workshops: Athens, Greece, August 30–31, 2024, Proceedings, Part II
Editors: Harold Mouchère, Anna Zhu
Number of pages: 14
Volume: 2
Place of Publication: Cham
Publisher: Springer Nature
Publication date: 11.09.2024
Pages: 199-212
ISBN (print): 978-3-031-70641-7
ISBN (electronic): 978-3-031-70642-4
DOIs
Publication status: Published - 11.09.2024
Event: International Workshops co-located with the 18th International Conference on Document Analysis and Recognition - ICDAR 2024 - Athens, Greece
Duration: 30.08.2024 to 31.08.2024
https://icdar2024.net/

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

    Research areas

  • GAT, GraphDoc, Inv3D, Invoice recognition, Synthetic data
  • Informatics
