How a Leading Energy Provider Structured 10,000+ Contracts for Microsoft Dynamics—Without OCR Templates or Manual Work

A German utilities giant needed to populate Microsoft Dynamics with structured data from thousands of unstructured contracts, ranging from scans to digital documents. Previous OCR and AI tools failed. Talonic delivered a schema-driven, AI-validated solution—built for scale, auditability, and CRM integration.

150+ field schema designed for commercial, legal, and regulatory data

Jointly built with the client’s contract team. Structured everything from SLAs to extended rights.

Extracted from PDFs, poor scans, annexes, and decades of formats

No templates. AI understands layout, meaning, and cross-document relationships.

Validated with confidence scoring and GUI approval

Multi-shot runs score every field. Reviewers approve batches in a color-coded GUI and check only the low-confidence values.


Why This Problem Was Considered "Impossible"

Earlier tools could not handle the client's large, varied contract library. Facing tens of thousands of legacy documents, the client was stuck between costly manual extraction and abandoning digitization altogether.

Document quality and layout

Contracts arrived in every possible file format
Poor quality: scans of scans, broken text, handwritten sections
OCR tools could not deliver reliable results

Phrasing inconsistency

The same legal and commercial terms were phrased differently from contract to contract
OCR and rule-based NLP failed to identify clauses reliably
OCR, RPA, and even large-language-model-based chatbots could not consistently extract or validate data fields

Manual effort is unscalable

Processing just the active contracts (~5,000) would have taken over 10,000 hours of expert review, or more than two hours per contract on average
That manual workload would have made full digitization financially unviable

Strategic Pressure Was Mounting


This wasn’t just an operational pain point—it was a strategic blocker.

The company’s digitalization roadmap depended on full CRM visibility.
Regulatory scrutiny demanded traceability of pricing terms, renewal periods, and compliance clauses.
Fragmented contract data meant revenue leakage, missed obligations, and internal inefficiencies across procurement, legal, and commercial teams.

The goal was clear: put every relevant contract field—past and future—into Microsoft Dynamics. But nothing in the market could do it.

How We Solved It — A Schema-Driven, End-to-End AI Workflow

We didn’t start with templates, keywords, or generic AI—we started with the company’s data model. If the CRM needs structured, validated fields, the AI must follow that schema. So we defined what to extract, why it mattered, and how to validate it across wildly different contract formats.

Schema Design With the Client

In collaboration with the Head of Contract Management and their team, we created a 150+ field data schema—covering durations, pricing, SLAs, obligations, rights, contract status, and more. Each field had a clear name, definition, and example value, aligned to Dynamics CRM.
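A schema entry of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the schema used in the project: the field names, the `crm_attribute` column names, and the `to_crm_record` helper are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable


@dataclass(frozen=True)
class FieldSpec:
    """One schema entry: what to extract and where it lands in the CRM."""
    name: str                    # schema field name
    definition: str              # plain-language meaning agreed with the contract team
    example: Any                 # example value shown to reviewers
    crm_attribute: str           # hypothetical Dynamics column name
    parse: Callable[[str], Any]  # converts raw extracted text to a typed value


# Three illustrative entries; the real schema had 150+ fields.
SCHEMA = [
    FieldSpec("contract_start", "Date on which the contract takes effect",
              date(2019, 1, 1), "new_contractstart", date.fromisoformat),
    FieldSpec("notice_period_months", "Months of notice required to terminate",
              3, "new_noticeperiodmonths", int),
    FieldSpec("auto_renewal", "Whether the contract renews automatically",
              True, "new_autorenewal",
              lambda s: s.strip().lower() in {"yes", "true"}),
]


def to_crm_record(raw: dict[str, str]) -> dict[str, Any]:
    """Parse raw extracted strings into typed values keyed by CRM attribute."""
    record = {}
    for spec in SCHEMA:
        if spec.name in raw:
            record[spec.crm_attribute] = spec.parse(raw[spec.name])
    return record
```

Keeping the definition and example next to the target column means the same entry can drive extraction, review, and the CRM import.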


Context-Aware AI Structuring

Our AI pipeline began with layout-preserving OCR to clean up poor scans. Our AI Structuring Engine then processed each document against the schema, extracting values based on meaning rather than position or formatting. It could interpret clause logic, infer contract status, and consolidate data spread across clauses (e.g., obligations).


Validation & Human Approval

We ran multi-shot validation (2- and 3-shot runs) to assign a confidence level to each field. Only low-confidence fields were flagged. Reviewers used a GUI with color-coded indicators (green to red) to approve batches. The result was targeted validation rather than manual rework.
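One simple way to turn repeated runs into a per-field confidence is to measure how often the runs agree. The sketch below assumes that approach. The case study does not say how confidence was actually computed, and the function names and color thresholds are illustrative only.

```python
from collections import Counter


def score_field(values: list[str]) -> tuple[str, float]:
    """Return the most common value across runs and its share of the runs."""
    counts = Counter(v.strip() for v in values)
    value, hits = counts.most_common(1)[0]
    return value, hits / len(values)


def color_for(confidence: float) -> str:
    """Map a confidence score to a review color (thresholds are assumptions)."""
    if confidence >= 1.0:
        return "green"   # all runs agree
    if confidence > 0.5:
        return "yellow"  # a majority agrees
    return "red"         # no majority: needs a human decision


def review_queue(runs: list[dict[str, str]]) -> dict[str, tuple[str, float, str]]:
    """Score every field seen in any run.

    A field missing from a run is treated as an empty value, so it
    counts as disagreement with runs that did extract something.
    """
    fields = sorted({name for run in runs for name in run})
    result = {}
    for name in fields:
        values = [run.get(name, "") for run in runs]
        value, confidence = score_field(values)
        result[name] = (value, confidence, color_for(confidence))
    return result
```

With two runs a field is either green (both agree) or red. A third run adds the middle band, where two of three runs agree and a reviewer only needs to confirm the majority value.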

