Automated Invoice Processing: Machine Learning-Based Information Extraction for Long Tail Suppliers

Publication: Contributions to journals › Journal articles › Research › Peer-reviewed

Authors

Automating the processing of incoming invoices promises vast efficiency improvements in accounting. Until fully electronic invoice exchange formats have been universally adopted, machine learning can help bridge the adoption gaps in electronic invoicing by extracting structured information from unstructured invoice formats. Machine learning is especially helpful for processing invoices from suppliers who send invoices only infrequently, as the models are able to capture the semantic and visual cues of invoices and generalize them to previously unknown invoice layouts. Since the population of invoices in many companies is skewed toward a few frequent suppliers and their layouts, this research examines the effects of training data taken from such populations on the predictive quality of different machine-learning approaches for the extraction of information from invoices. Comparing the different approaches, we find that they are affected to varying degrees by skewed layout populations: the accuracy gap between in-sample and out-of-sample layouts is much larger in the Chargrid and random forest models than in the LayoutLM transformer model, which also exhibits the best overall predictive quality. To arrive at this finding, we designed and implemented a research pipeline that pays special attention to the distribution of layouts in the splitting of data and the evaluation of the models.
Original language: English
Article number: 200285
Journal: Intelligent Systems with Applications
Volume: 20
Number of pages: 14
ISSN: 2667-3053
DOIs
Publication status: Published - 1 November 2023

Bibliographical note

Publisher Copyright:
© 2023 The Authors
