28 apr

Document Information Extraction – Workflow



Document Information Extraction (DIE; German: Dokumenten-Informationsextraktion) refers to the workflow for automatically extracting structured data from unstructured documents, primarily incoming invoices, delivery notes, and contracts. The goal is for a PDF or image file in SAP Business One not only to be archived but also to populate the fields of an incoming invoice directly: supplier, document number, document date, tax amount, net amount, line items, and order reference.
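The target fields listed above can be pictured as a small data structure. The following is a minimal sketch in Python, not an SAP Business One schema: the class names `ExtractedInvoice` and `InvoiceLine`, the attribute names, and the two helper methods are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class InvoiceLine:
    """One line item; hypothetical structure, not an SAP B1 object."""
    description: str
    quantity: Decimal
    unit_price: Decimal

    @property
    def net(self) -> Decimal:
        return self.quantity * self.unit_price


@dataclass
class ExtractedInvoice:
    """Header fields and line items an extraction run may or may not fill."""
    supplier: Optional[str] = None
    document_number: Optional[str] = None
    document_date: Optional[date] = None
    net_amount: Optional[Decimal] = None
    tax_amount: Optional[Decimal] = None
    order_reference: Optional[str] = None
    lines: list[InvoiceLine] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Names of header fields the extraction left empty."""
        header = ["supplier", "document_number", "document_date",
                  "net_amount", "tax_amount"]
        return [name for name in header if getattr(self, name) is None]

    def lines_match_net(self) -> bool:
        """True if the line totals add up to the extracted net amount."""
        if self.net_amount is None or not self.lines:
            return False
        total = sum((line.net for line in self.lines), Decimal("0"))
        return total == self.net_amount


# Example: an extraction that found some, but not all, header fields.
invoice = ExtractedInvoice(
    supplier="Example Supplier GmbH",
    document_number="RE-1001",
    net_amount=Decimal("200.00"),
    lines=[InvoiceLine("Widget", Decimal("4"), Decimal("50.00"))],
)
```

Keeping every header field optional reflects that an extraction may come back incomplete; a check such as `missing_fields()` can then decide whether a document is ready for booking or needs manual completion.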

Context

A typical workflow runs in three steps. (1) Recording: Supporting documents arrive via scan, email (using Outlook plugins), SFTP download, or upload by the clerk. (2) Extraction: An OCR service reads the text, and a layout model or LLM pipeline semantically extracts the mandatory fields; for ZUGFeRD/Factur-X invoices and XRechnung documents, the XML payload (embedded in the PDF in the case of ZUGFeRD/Factur-X) is used directly, eliminating the need for OCR. (3) Booking: The extracted data is mapped to SAP B1 business partners, orders, and tax codes; if the order and the invoice do not match, a review workflow is initiated.

From a product perspective, several options are available for SAP B1: SAP Document Information Extraction as a BTP service, SAP Document and Reporting Compliance for e-invoicing, third-party products such as CKS.DIGITAL 4.0 with integrated OCR recognition (which extracts keywords and assigns documents via reference fields), and AI-based products such as the B1-Helpster with its FIBU-Helper component, which provides account assignment suggestions based on the extracted fields.
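The booking step's comparison between order and invoice can be sketched as follows. This is an illustrative Python example under stated assumptions: `OrderSummary`, `InvoiceSummary`, `find_discrepancies`, `route`, and the one-cent tolerance are hypothetical and do not correspond to any SAP B1 or vendor API.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class OrderSummary:
    """Hypothetical stand-in for the purchase order found in the ERP."""
    supplier_code: str
    net_amount: Decimal


@dataclass
class InvoiceSummary:
    """Hypothetical stand-in for the data extracted from the invoice."""
    supplier_code: str
    net_amount: Decimal


def find_discrepancies(order: OrderSummary,
                       invoice: InvoiceSummary,
                       tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Return human-readable mismatches between order and invoice."""
    issues = []
    if order.supplier_code != invoice.supplier_code:
        issues.append("supplier mismatch")
    if abs(order.net_amount - invoice.net_amount) > tolerance:
        issues.append("net amount mismatch")
    return issues


def route(order: OrderSummary, invoice: InvoiceSummary) -> str:
    """Send the document to review if anything differs, else book it."""
    return "review" if find_discrepancies(order, invoice) else "book"


order = OrderSummary("V1000", Decimal("200.00"))
matching = InvoiceSummary("V1000", Decimal("200.00"))
deviating = InvoiceSummary("V1000", Decimal("215.50"))
```

Here `route(order, matching)` yields `"book"` and `route(order, deviating)` yields `"review"`. A real implementation would likely compare line items, quantities, and tax codes as well; the tolerance is a design choice that determines how many documents end up in manual review.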

Distinction from related concepts

Document Information Extraction is more than classic OCR: it delivers structured fields, not just raw text. It is also not identical to e-invoice reception: ZUGFeRD and XRechnung documents are processed directly from the XML payload, without extraction from an image. Compared to a pure document archive (CKS.DMS, d.velop), DIE focuses on the path from document to booking; archiving itself is a separate, supplementary step. Workflow quality depends heavily on data models, supplier variance, and approval processes. A 95% automation level is realistically achievable, but never a given.
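The distinction between e-invoice reception and image-based extraction amounts to a routing decision at intake. The sketch below uses only the Python standard library and rests on several assumptions: the function names are hypothetical, the list of attachment names is expected to come from a separate PDF library (not shown), and the root element names and attachment file names should be checked against the ZUGFeRD/Factur-X and XRechnung specifications before use.

```python
import xml.etree.ElementTree as ET

# Assumed root elements of structured invoices (UBL and UN/CEFACT CII syntax).
INVOICE_ROOTS = {"Invoice", "CrossIndustryInvoice"}

# Assumed names of the XML attachment inside a hybrid PDF, lower-cased.
EMBEDDED_NAMES = {"factur-x.xml", "zugferd-invoice.xml"}


def is_invoice_xml(content: bytes) -> bool:
    """True if the content parses as XML with a known invoice root element."""
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        return False
    # Strip the "{namespace}" prefix that ElementTree puts before the tag name.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name in INVOICE_ROOTS


def choose_path(content: bytes, attachment_names=()) -> str:
    """Return "xml" when a structured payload is available, else "ocr"."""
    if is_invoice_xml(content):
        return "xml"
    if any(name.lower() in EMBEDDED_NAMES for name in attachment_names):
        return "xml"
    return "ocr"


ubl = b'<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"/>'
print(choose_path(ubl))                          # structured XML file
print(choose_path(b"%PDF-1.7", ["factur-x.xml"]))  # hybrid PDF with attachment
print(choose_path(b"%PDF-1.7"))                  # plain scan or PDF
```

Only documents that fall through to the `"ocr"` branch need the OCR and layout-model steps; the others can be mapped from the XML fields directly.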

