Information Extraction from Invoices: A Graph Neural Network Approach for Datasets with High Layout Variety

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Authors

Extracting information from invoices is a highly structured, recurrent task in auditing. Automating this task would yield efficiency improvements while simultaneously improving audit quality. The challenge for this endeavor is to account for the text layout on invoices and the high variety of layouts across different issuers. Recent research has proposed graphs to structurally represent the layout on invoices and to apply graph convolutional networks to extract the information pieces of interest. However, the effectiveness of graph-based approaches has so far been shown only on datasets with a low variety of invoice layouts. In this paper, we introduce a graph-based approach to information extraction from invoices and apply it to a dataset of invoices from multiple vendors. We show that our proposed model extracts the specified key items from a highly diverse set of invoices with a macro F1 score of 0.8753.

Original language: English
Title of host publication: Innovation Through Information Systems - Volume II: A Collection of Latest Research on Technology Issues
Editors: Frederik Ahlemann, Reinhard Schütte, Stefan Stieglitz
Number of pages: 16
Place of Publication: Cham
Publisher: Springer Science and Business Media Deutschland
Publication date: 2021
Pages: 5-20
ISBN (print): 978-3-030-86796-6
ISBN (electronic): 978-3-030-86797-3
Publication status: Published - 2021

Bibliographical note

Publisher Copyright:
© 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
