Inside DIGI: How Our Robot Reads Documents with Superhuman Precision

DIGI is CostPocket’s document digitisation system that extracts key data from invoices, receipts, and waybills. It digitises over 700,000 documents a month across 77 countries and 52 languages with expert-level precision. Let’s take a look under the hood.
What Can DIGI Handle?
DIGI recognises a huge variety of accounting fields, from dates and totals to fuel rows and product codes:
- Issue date, due date, document number
- Subtotal, total VAT, grand total
- Bank accounts, reference numbers, card digits
- Item lines: product description, quantity, unit, VAT, net/gross prices
- Document type & direction, order numbers, discounts, rounding
- Even fuel rows from gas station receipts
How Does It Work?
DIGI combines multiple intelligent systems:
- Our own algorithms (developed over 10+ years)
- OCR and text recognition engines
- AI and machine learning layers
- Country- and language-specific logic
- Registry data for company and invoice validation
Smart Enough to Stay Silent
If our robot isn’t sure, it doesn’t guess. Better blank than bogus. Numerical data like totals, VAT, and item lines go through multi-step validation where each result depends on the others.
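To make the "better blank than bogus" idea concrete, here is a minimal Python sketch of cross-checking numeric fields against each other. It is an illustration only, not DIGI’s actual implementation: the function name `cross_validate`, the field names, and the 0.01 rounding tolerance are all assumptions made for this example.

```python
from decimal import Decimal
from typing import Optional

# Assumed rounding tolerance; real documents may need per-currency rules.
TOLERANCE = Decimal("0.01")


def cross_validate(
    subtotal: Optional[Decimal],
    vat: Optional[Decimal],
    total: Optional[Decimal],
    line_nets: list[Decimal],
) -> dict:
    """Return only the values that agree with each other; blank out the rest."""
    result = {"subtotal": None, "vat": None, "total": None, "line_nets": None}

    # Step 1: the three totals must all be present and add up.
    if subtotal is None or vat is None or total is None:
        return result
    if abs(subtotal + vat - total) > TOLERANCE:
        return result  # inconsistent totals: stay silent rather than guess
    result.update(subtotal=subtotal, vat=vat, total=total)

    # Step 2: item lines are kept only if they sum to the confirmed subtotal.
    if line_nets and abs(sum(line_nets, Decimal("0")) - subtotal) <= TOLERANCE:
        result["line_nets"] = list(line_nets)
    return result
```

In this sketch the item lines depend on the subtotal, and the subtotal depends on the VAT and grand total, so a single misread digit leaves the affected fields empty instead of producing a plausible but wrong number.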
Data Validation
Each field is validated with AI, ML, and official sources. Specific logic applies depending on the document type and region. See full rules here.
JSON Format & Precision
The output format is strict and versioned, so your systems stay compatible. See our precision report or sample JSON.
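On the consuming side, a versioned format lets an integration refuse payloads it was not built for. The Python sketch below shows one way that might look. The `version` key, the `SUPPORTED_MAJOR` constant, and the function name `load_digi_result` are hypothetical and chosen for this example; check the sample JSON for the real field names.

```python
import json

# Hypothetical: the major version this integration was written against.
SUPPORTED_MAJOR = 1


def load_digi_result(raw: str) -> dict:
    """Parse a result payload and reject unsupported format versions."""
    doc = json.loads(raw)

    # "version" is an assumed key name, not confirmed by the DIGI docs.
    version = doc.get("version")
    if not isinstance(version, str):
        raise ValueError("payload has no version string")

    major = int(version.split(".")[0])
    if major != SUPPORTED_MAJOR:
        raise ValueError(f"unsupported format version: {version}")
    return doc
```

Failing loudly on an unknown major version is a deliberate choice here: it surfaces a format change at the integration boundary instead of letting misread fields flow into the accounting system.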
Conclusion
DIGI is not just another OCR engine. It’s a multi-lingual, document-slicing, number-validating genius that never tires and never guesses.
Visit digi.costpocket.com to try DIGI or contact us at [email protected].