Information Extraction from Invoices: A Graph Neural Network Approach for Datasets with High Layout Variety

Felix Krieger; Paul Drews; Burkhardt Funk; Till Wobbe

doi:10.1007/978-3-030-86797-3_1

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

Felix Krieger
Paul Drews
Burkhardt Funk
Till Wobbe

Professorship for Information Systems, in particular Digital Transformation and Information Management
Professorship for Information Systems, in particular Data Science

Extracting information from invoices is a highly structured, recurrent task in auditing. Automating this task would yield efficiency improvements, while simultaneously improving audit quality. The challenge for this endeavor is to account for the text layout on invoices and the high variety of layouts across different issuers. Recent research has proposed graphs to structurally represent the layout on invoices and to apply graph convolutional networks to extract the information pieces of interest. However, the effectiveness of graph-based approaches has so far been shown only on datasets with a low variety of invoice layouts. In this paper, we introduce a graph-based approach to information extraction from invoices and apply it to a dataset of invoices from multiple vendors. We show that our proposed model extracts the specified key items from a highly diverse set of invoices with a macro F₁ score of 0.8753.
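For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro F₁ score is commonly computed: the F₁ score is calculated separately for each class and the per-class scores are averaged without weighting by class frequency. The function name `macro_f1` and the label names in the example (`invoice_no`, `date`, `total`, `other`) are hypothetical illustrations; they are not taken from the paper, and the abstract does not state how the authors implemented their evaluation.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred.

    Illustrative sketch only; not the evaluation code used in the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)

    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels for six text segments on an invoice.
y_true = ["invoice_no", "date", "total", "other", "other", "date"]
y_pred = ["invoice_no", "date", "other", "other", "total", "date"]
print(macro_f1(y_true, y_pred))
```

Because every class contributes equally to the average, a rarely occurring key item that is extracted poorly lowers the macro score as much as a frequent one would, which is why the metric is often preferred when classes are imbalanced.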

Original language	English
Title of host publication	Innovation Through Information Systems - Volume II : A Collection of Latest Research on Technology Issues
Editors	Frederik Ahlemann, Reinhard Schütte, Stefan Stieglitz
Number of pages	16
Place of Publication	Cham
Publisher	Springer Science and Business Media Deutschland
Publication date	2021
Pages	5-20
ISBN (print)	978-3-030-86796-6
ISBN (electronic)	978-3-030-86797-3
DOIs	https://doi.org/10.1007/978-3-030-86797-3_1
Publication status	Published - 2021

Bibliographical note

Publisher Copyright:
© 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Research areas

Audit digitization, Graph attention networks, Graph-based machine learning, Unstructured data
Informatics
Business informatics

Other publications by the same author(s)

AI-Enhanced Literature Reviews: Connecting Emerging Phenomena and Bodies of Knowledge

Naqvi, S. A. A., Zimmer, M. P., Kauschinger, M., Drews, P. & Basole, R. C., 2026, Proceedings of the 59th Hawaii International Conference on System Sciences: Hyatt Regency Maui, January 6-9, 2026. Bui, T. X. (ed.). Honolulu: University of Hawaii at Manoa, p. 7233-7242 10 p. (Proceedings of the ... Annual Hawaii International Conference on System Sciences; vol. 59).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

IT Outsourcing Relationships in the Context of Digital Transformation: From Transactional Relationships to Ecosystem Partnerships

Gratzke, L., Zimmer, M. P. & Drews, P., 2026, In: Information Systems Management. 43, 1, p. 65-84 20 p.

Research output: Journal contributions › Journal articles › Research › peer-review

Mapping the Skills and Roles of Experimentation in Software Organizations: Evidence from 1,800 Job Postings

Stotz, N., Anderson, K. & Drews, P., 2026, (Accepted/In press) Proceedings of CHASE 2026. 11 p.

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Aligning Experimentation with Product Operations: A Taxonomy for Structuring Experimentation Teams

Stotz, N., Labay, B., Vermeer, L. & Drews, P., 2026, Software Engineering and Advanced Applications: 51st Euromicro Conference, SEAA 2025 Salerno, Italy, September 10–12, 2025; Proceedings, Part III. Taibi, D. & Smite, D. (eds.). Cham: Springer Nature Switzerland AG, Vol. 3. p. 23-38 16 p. (Lecture Notes in Computer Science; vol. 16083).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Capitalizing on natural language processing (NLP) to automate the evaluation of coach implementation fidelity in guided digital cognitive-behavioral therapy (GdCBT)

Zainal, N. H., Eckhardt, R., Rackoff, G. N., Fitzsimmons-Craft, E. E., Rojas-Ashe, E., Barr Taylor, C., Funk, B., Eisenberg, D., Wilfley, D. E. & Newman, M. G., 02.04.2025, In: Psychological Medicine. 55, e106.

Research output: Journal contributions › Journal articles › Research › peer-review
