Extraction of information from invoices - challenges in the extraction pipeline

Research output: Contributions to collected editions/works, Article in conference proceedings, Research, peer-review

Authors

Invoice data are key inputs to business processes. To use the data and create business value, the information must be captured in a digital, structured form. Digital tools and AI/ML are the state of the art for extracting information from invoices. However, existing approaches are trained on specific languages and layouts and, while focusing on performance on individual metrics, neglect to demonstrate the full pipeline from raw data to processable information. In this paper, we investigate the types of information found on invoices and address the challenges in the extraction pipeline. We contribute a morphological framework for the problematization and design of a pipeline as part of a design science study.

Original language: English
Title of host publication: INFORMATIK 2023 : Designing Futures: Zukünfte gestalten, 26. – 29. September 2023, Berlin
Editors: Maike Klein, Daniel Krupka, Cornelia Winter, Volker Wohlgemuth
Number of pages: 16
Place of Publication: Bonn
Publisher: Gesellschaft für Informatik e.V.
Publication date: 2023
Pages: 1777-1792
ISBN (electronic): 978-3-88579-731-9
DOIs
Publication status: Published - 2023
Event: 53. Annual Meeting of the German Informatics Society (GI) - INFORMATICS 2023: Designing Futures - Zukünfte Gestalten - Online & HTW Berlin, Berlin, Germany
Duration: 26.09.2023 - 29.09.2023
Conference number: 53
https://informatik2023.gi.de/

Bibliographical note

Publisher Copyright:
© 2023 Gesellschaft für Informatik (GI). All rights reserved.
