Automated Invoice Processing: Machine Learning-Based Information Extraction for Long Tail Suppliers

Publication: Contributions to journals › Journal articles › Research › peer-reviewed

Authors

Automating the processing of incoming invoices promises vast efficiency improvements in accounting. Until fully electronic invoice exchange formats have been universally adopted, machine learning can help bridge the adoption gaps in electronic invoicing by extracting structured information from unstructured invoice formats. Machine learning especially helps with processing invoices from suppliers who send invoices only infrequently, as the models can capture the semantic and visual cues of invoices and generalize them to previously unseen invoice layouts. Since the population of invoices in many companies is skewed toward a few frequent suppliers and their layouts, this research examines how training data drawn from such populations affects the predictive quality of different machine-learning approaches for extracting information from invoices. Comparing the approaches, we find that they are affected to varying degrees by skewed layout populations: the accuracy gap between in-sample and out-of-sample layouts is much larger in the Chargrid and random forest models than in the LayoutLM transformer model, which also exhibits the best overall predictive quality. To arrive at this finding, we designed and implemented a research pipeline that pays special attention to the distribution of layouts when splitting the data and evaluating the models.
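The idea of a layout-aware split and an in-sample versus out-of-sample accuracy gap can be illustrated with a minimal sketch. This is not the pipeline from the article; the function names, the split fractions, and the use of plain exact-match accuracy are all assumptions made for illustration. Layouts are held out as whole groups, so the out-of-sample test set contains only layouts that never appear in training.

```python
import random
from collections import defaultdict


def layout_aware_split(layout_ids, holdout_layout_frac=0.3,
                       in_sample_test_frac=0.2, seed=0):
    """Hypothetical helper: split invoice indices by layout.

    Returns (train_idx, test_in_idx, test_out_idx), where test_out_idx
    holds only invoices whose layout never occurs in train_idx.
    """
    rng = random.Random(seed)

    # Group invoice indices by their layout identifier.
    by_layout = defaultdict(list)
    for idx, layout in enumerate(layout_ids):
        by_layout[layout].append(idx)

    # Choose whole layouts to hold out (assumed fraction, at least one).
    layouts = sorted(by_layout)
    rng.shuffle(layouts)
    n_holdout = max(1, round(len(layouts) * holdout_layout_frac))
    holdout = set(layouts[:n_holdout])

    train_idx, test_in_idx, test_out_idx = [], [], []
    for layout in layouts:
        members = list(by_layout[layout])
        if layout in holdout:
            # Unseen layout: every invoice goes to the out-of-sample test set.
            test_out_idx.extend(members)
            continue
        # Seen layout: keep a share of its invoices for in-sample testing.
        rng.shuffle(members)
        n_test = int(len(members) * in_sample_test_frac)
        test_in_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return train_idx, test_in_idx, test_out_idx


def accuracy(y_true, y_pred, indices):
    """Share of exactly matching predictions over the given indices."""
    if not indices:
        return float("nan")
    return sum(y_true[i] == y_pred[i] for i in indices) / len(indices)


def accuracy_gap(y_true, y_pred, test_in_idx, test_out_idx):
    """In-sample accuracy minus out-of-sample accuracy."""
    return (accuracy(y_true, y_pred, test_in_idx)
            - accuracy(y_true, y_pred, test_out_idx))
```

Under these assumptions, a larger value from `accuracy_gap` indicates a model that depends more heavily on having seen a layout during training, which is the kind of comparison the abstract describes across model families.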
Original language: English
Article number: 200285
Journal: Intelligent Systems with Applications
Volume: 20
Number of pages: 14
ISSN: 2667-3053
DOIs
Publication status: Published - 01.11.2023

Bibliographical note

Publisher Copyright:
© 2023 The Authors
