Extraction of information from invoices - challenges in the extraction pipeline

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review


Invoice data are key information for business processes. To use the data and create business value, the information must be captured in a digital, structured form. Leveraging digital tools and AI/ML is the state of the art in extracting information from invoices. However, existing approaches are trained on specific languages and layouts, and while they focus on performance on individual metrics, they neglect to demonstrate the pipeline from raw data to processable information. In this paper, we investigate the types of information found on invoices and address the challenges in the extraction pipeline. We contribute a morphological framework for the problematization and design of such a pipeline as part of a design science study.

Original language: English
Title of host publication: INFORMATIK 2023 : Designing Futures: Zukünfte gestalten, 26. – 29. September 2023, Berlin
Editors: Maike Klein, Daniel Krupka, Cornelia Winter, Volker Wohlgemuth
Number of pages: 16
Place of Publication: Bonn
Publisher: Gesellschaft für Informatik e.V.
Publication date: 2023
ISBN (Electronic): 978-3-88579-731-9
Publication status: Published - 2023
Event: 53. Annual Meeting of the German Informatics Society (GI) - INFORMATICS 2023: Designing Futures - Zukünfte Gestalten - Online & HTW Berlin, Berlin, Germany
Duration: 26.09.2023 – 29.09.2023
Conference number: 53

Bibliographical note

Publisher Copyright:
© 2023 Gesellschaft für Informatik (GI). All rights reserved.