Ablation Study of a Multimodal GAT Network on Perfect Synthetic and Real-world Data to Investigate the Influence of Language Models in Invoice Recognition

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Document analysis and invoice recognition have been significantly advanced in recent years by grid-based, graph-based and transformer architectures. However, it is not only the model architecture that influences an approach’s results, but also the quality of training and test data. In this paper, we perform an ablation study on an existing state-of-the-art pre-trained multimodal GAT network. Therein we investigate two kinds of modifications to understand the sensitivity of the results by (1) exchanging the language module and (2) applying both the original and modified network on a perfect synthetic and an imperfect real-world dataset. The results of the study show the importance of language modules for semantic embeddings in multimodal invoice recognition and illustrate the impact of data annotation quality. We further contribute an adapted GAT model for German invoices.

Original language: English
Title of host publication: Document Analysis and Recognition – ICDAR 2024 Workshops : Athens, Greece, August 30–31, 2024 Proceedings, Part II
Editors: Harold Mouchère, Anna Zhu
Number of pages: 14
Volume: 2
Place of Publication: Cham
Publisher: Springer Nature
Publication date: 11.09.2024
Pages: 199-212
ISBN (print): 978-3-031-70641-7
ISBN (electronic): 978-3-031-70642-4
Publication status: Published - 11.09.2024
Event: International Workshops co-located with the 18th International Conference on Document Analysis and Recognition - ICDAR 2024 - Athens, Greece
Duration: 30.08.2024 – 31.08.2024
https://icdar2024.net/

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Research areas

  • GAT, GraphDoc, Inv3D, Invoice recognition, Synthetic data
  • Informatics
