Extraction of information from invoices - challenges in the extraction pipeline
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Standard
INFORMATIK 2023: Designing Futures: Zukünfte gestalten, 26. – 29. September 2023, Berlin. ed. / Maike Klein; Daniel Krupka; Cornelia Winter; Volker Wohlgemuth. Bonn: Gesellschaft für Informatik e.V., 2023. p. 1777-1792 (GI-Edition: Lecture Notes in Informatics (LNI), Proceedings; Vol. P-337).
RIS
TY - CHAP
T1 - Extraction of information from invoices - challenges in the extraction pipeline
AU - Thiée, Lukas Walter
AU - Krieger, Felix
AU - Funk, Burkhardt
N1 - Conference code: 53
PY - 2023
Y1 - 2023
N2 - Data from invoices are key information for business processes. In order to use the data and create business value, the information must be captured in a digital and structured form. Leveraging digital tools and AI/ML is state-of-the-art in the extraction of information from invoices. However, the existing approaches are trained on specific languages and layouts, and while focusing on the performance of individual metrics, they neglect the demonstration of the pipeline from raw data to processable information. In this paper, we investigate the types of information on invoices and address the challenges in the extraction pipeline. We contribute by providing a morphological framework for the problematization and design of a pipeline as part of a design science study.
AB - Data from invoices are key information for business processes. In order to use the data and create business value, the information must be captured in a digital and structured form. Leveraging digital tools and AI/ML is state-of-the-art in the extraction of information from invoices. However, the existing approaches are trained on specific languages and layouts, and while focusing on the performance of individual metrics, they neglect the demonstration of the pipeline from raw data to processable information. In this paper, we investigate the types of information on invoices and address the challenges in the extraction pipeline. We contribute by providing a morphological framework for the problematization and design of a pipeline as part of a design science study.
KW - Data pipeline
KW - Information extraction
KW - Invoice recognition
KW - Informatics
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=85181143950&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/aeef32be-854c-3f08-9b99-86cf2576e3ed/
U2 - 10.18420/inf2023_180
DO - 10.18420/inf2023_180
M3 - Article in conference proceedings
AN - SCOPUS:85181143950
T3 - GI-Edition: Lecture Notes in Informatics (LNI), Proceedings
SP - 1777
EP - 1792
BT - INFORMATIK 2023
A2 - Klein, Maike
A2 - Krupka, Daniel
A2 - Winter, Cornelia
A2 - Wohlgemuth, Volker
PB - Gesellschaft für Informatik e.V.
CY - Bonn
T2 - 53rd Annual Meeting of the German Informatics Society (GI) - INFORMATICS 2023
Y2 - 26 September 2023 through 29 September 2023
ER -