Document Recognition
(OCR, KIE, e-Forms)
Straight facts
This page explains how to build document recognition that holds up in production: OCR plus Key Information Extraction (KIE), tables, barcodes/QR codes, and signatures/stamps, with results that are measurable, secure, and audit-ready.
📄 What it is
🔍 OCR
Turn images/PDFs into text
🎯 KIE (Key Information Extraction)
Find fields (invoice no., dates, totals)
📊 Layout & tables
Detect forms, line items, grids
✅ Validation
Rules/dictionaries/check digits/digital signatures
🔗 Events/APIs
Deliver structured JSON to ERP/CRM/DB
🎯 Typical use cases
📋 Invoices/Receipts/e-Tax
Header + line items
🆔 ID/Passport/Driver's license
Face/MRZ/AAMVA PDF417/QR
🏦 Bank slips & EMVCo QR
Decode → verify with bank API → reconcile
🚛 Logistics docs
B/L, packing list, container/chassis forms
👥 HR/Compliance
Forms, certs, NDAs
🔧 Maintenance/QA checklists
Handwriting, stamps
📷 Capture that actually works
🖨️ Scanners
- • 300–400 dpi, de-skew, duplex
- • PDF/A when needed
📱 Mobiles
- • Flat, no glare/shadow
- • Fill the frame; keep keystone (tilt) under about 10–15°
- • Use auto-dewarp
📸 Cameras
- • Shutter speed of 1/125–1/250 s for hand-held shots
- • Even light; polarizer for glossy docs
📁 Files
- • Prefer PDF with embedded text
- • Keep originals for audit
Performance
- Accuracy: 90–98%
- Multi-language support
- Fast processing
- Supports a wide range of file formats
🔄 Pipeline (practical)
🔧 Processing Pipeline
1-4. Early Processing
- • Classify doc type (invoice/ID/slip/etc.)
- • Detect layout: blocks, tables, key zones
- • OCR: Thai/Latin, numeric, dates
- • KIE: regex + ML (layout-aware)
5-7. Post Processing
- • Validation: math, formats, signatures
- • Normalization: currency, dates, address
- • Events & storage: JSON + evidence
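The normalization step above can be sketched as follows. This is a minimal illustration, not a full normalizer: the function names, the accepted `DD/MM/YYYY` layout, and the "year above 2400 means Buddhist Era" heuristic are assumptions made for the example.

```python
import re
from datetime import date
from decimal import Decimal

BE_OFFSET = 543  # Buddhist Era year = Gregorian year + 543


def normalize_date(text: str) -> str:
    """Convert 'DD/MM/YYYY' (B.E. or C.E.) to ISO 8601 'YYYY-MM-DD'."""
    m = re.fullmatch(r"\s*(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})\s*", text)
    if not m:
        raise ValueError(f"unrecognized date: {text!r}")
    day, month, year = (int(g) for g in m.groups())
    # Assumption: any year above 2400 is a Buddhist Era year.
    if year > 2400:
        year -= BE_OFFSET
    # date() raises ValueError for impossible dates such as 31/02.
    return date(year, month, day).isoformat()


def normalize_amount(text: str) -> Decimal:
    """Drop currency marks and thousands separators; keep exact decimals."""
    cleaned = re.sub(r"[^\d.\-]", "", text)
    return Decimal(cleaned)


print(normalize_date("25/08/2568"))        # 2025-08-25
print(normalize_amount("THB 125,000.00"))  # 125000.00
```

Amounts are held as `Decimal` rather than `float` so that later math checks compare exact values.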
📈 Metrics that matter
📊 Core Metrics
- • OCR: CER/WER (character/word error rate)
- • Fields: Precision/Recall/F1 per field
- • Exact-Match rate for key fields
- • Tables: cell/line-item extraction accuracy
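A rough sketch of how the core metrics above are usually computed: CER and WER as edit distance divided by reference length, and per-field precision/recall/F1 under exact match. The function names and the convention of one dict of field values per document are assumptions for illustration.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: character edits / reference length."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


def wer(ref: str, hyp: str) -> float:
    """Word error rate over whitespace-separated tokens."""
    ref_words, hyp_words = ref.split(), hyp.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)


def field_prf(gold: list[dict], pred: list[dict], field: str):
    """Exact-match precision, recall, and F1 for one field across documents."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g_val, p_val = g.get(field), p.get(field)
        if p_val is not None and p_val == g_val:
            tp += 1
        else:
            if p_val is not None:
                fp += 1  # wrong or spurious prediction
            if g_val is not None:
                fn += 1  # labelled value missed or misread
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Thai text is not space-delimited, so whitespace-based WER is only meaningful there after a word segmentation step. CER has no such dependency.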
⚡ Performance & Business
- • Latency & throughput: end-to-end per page
- • Pages/hour processing capacity
- • Business KPIs: reconciliation time saved
- • Exceptions per 1k docs
🛡️ Anti-fraud & authenticity
📱 QR/Barcode Verification
- • Decode EMVCo/2D → verify with issuer/bank API
- • Compare amount/date/reference
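Before calling any issuer or bank API, the decoded QR string can be checked locally. The sketch below assumes an ASCII EMVCo-style payload: each field is a 2-digit tag, a 2-digit length, then the value, and the last field is tag `63` holding a CRC-16/CCITT-FALSE checksum as 4 hex digits. The helper names are illustrative; nested templates are left as raw strings, and the API call itself is out of scope.

```python
def crc16_ccitt_false(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def parse_tlv(payload: str) -> dict[str, str]:
    """Split a payload into {tag: value} using tag(2) + length(2) + value."""
    fields = {}
    pos = 0
    while pos < len(payload):
        tag = payload[pos:pos + 2]
        length = int(payload[pos + 2:pos + 4])
        value = payload[pos + 4:pos + 4 + length]
        if len(value) != length:
            raise ValueError(f"truncated value for tag {tag}")
        fields[tag] = value
        pos += 4 + length
    return fields


def crc_is_valid(payload: str) -> bool:
    """The CRC covers everything up to and including the '6304' prefix."""
    if len(payload) < 8 or payload[-8:-4] != "6304":
        return False
    expected = crc16_ccitt_false(payload[:-4].encode("utf-8"))
    return payload[-4:].upper() == f"{expected:04X}"


# Build a tiny test payload: format indicator, currency 764 (THB),
# amount 100.00, country TH, then the CRC field.
body = "000201" + "5303764" + "5406100.00" + "5802TH" + "6304"
payload = body + f"{crc16_ccitt_false(body.encode('utf-8')):04X}"

print(crc_is_valid(payload))     # True
print(parse_tlv(payload)["54"])  # 100.00
```

A valid CRC only shows the string was not corrupted or casually edited. Anyone can recompute it, so the amount, date, and reference still need to be confirmed with the issuer.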
📄 PDF signatures
- • Validate X.509 chain
- • Hash verification and revocation check
👁️ Visual tamper cues
- • Font inconsistencies
- • Copy-move seams, low-quality reprints
🔍 Cross-checks
- • Totals vs line items, VAT math
- • Supplier IDs, allow/deny lists
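The cross-checks above are mostly arithmetic. The sketch below is one possible shape under stated assumptions: the 7% VAT rate is taken from the sample invoice on this page, the tolerance of 0.01 is arbitrary, and the ID check uses the mod-11 weighted checksum commonly described for Thai 13-digit identification numbers. Confirm that this checksum applies to the IDs you process before relying on it.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
VAT_RATE = Decimal("0.07")  # assumption: rate implied by the sample invoice


def check_invoice_math(subtotal: Decimal, vat: Decimal, total: Decimal,
                       rows: list[dict], tol: Decimal = CENT) -> list[str]:
    """Return a list of problems; an empty list means the numbers agree."""
    problems = []
    rows_sum = sum((Decimal(r["qty"]) * Decimal(r["price"]) for r in rows),
                   Decimal("0"))
    if abs(rows_sum - subtotal) > tol:
        problems.append(f"line items sum to {rows_sum}, subtotal is {subtotal}")
    expected_vat = (subtotal * VAT_RATE).quantize(CENT, rounding=ROUND_HALF_UP)
    if abs(expected_vat - vat) > tol:
        problems.append(f"expected VAT {expected_vat}, document says {vat}")
    if abs(subtotal + vat - total) > tol:
        problems.append(f"subtotal + VAT is {subtotal + vat}, total is {total}")
    return problems


def thai_id_checksum_ok(tax_id: str) -> bool:
    """Mod-11 check: weights 13..2 on the first 12 digits (assumed scheme)."""
    if len(tax_id) != 13 or not (tax_id.isascii() and tax_id.isdigit()):
        return False
    weighted = sum(int(d) * w for d, w in zip(tax_id[:12], range(13, 1, -1)))
    return (11 - weighted % 11) % 10 == int(tax_id[12])


rows = [{"qty": "100", "price": "1250.00"}]
print(check_invoice_math(Decimal("125000.00"), Decimal("8750.00"),
                         Decimal("133750.00"), rows))  # []
print(thai_id_checksum_ok("0105551234567"))            # True
```

Returning a list of problems rather than a single boolean makes it easier to surface the specific mismatch in an exception queue.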
🔒 Privacy & compliance (PDPA)
🔐 Minimize & mask
- • Store only needed fields
- • Keep hashes instead of raw values where possible
- • Redact PII in exports
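One way to apply the minimize-and-mask points above, using only the Python standard library. The function names, the choice to keep the last four characters visible, and the use of HMAC-SHA256 as the hash are assumptions for this sketch. Key storage and rotation are not covered here.

```python
import hashlib
import hmac


def mask_id(value: str, visible: int = 4, mask_char: str = "X") -> str:
    """Hide all but the last `visible` characters of an identifier."""
    if len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def keyed_hash(value: str, key: bytes) -> str:
    """HMAC-SHA256 digest: supports equality matching without the raw value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def redact_for_export(record: dict, pii_fields: set[str]) -> dict:
    """Return a copy of the record with the named PII fields masked."""
    return {k: (mask_id(str(v)) if k in pii_fields else v)
            for k, v in record.items()}


record = {"supplier_tax_id": "0105551234567", "total": "133750.00"}
print(redact_for_export(record, {"supplier_tax_id"}))
# {'supplier_tax_id': 'XXXXXXXXX4567', 'total': '133750.00'}
```

A keyed hash is used instead of a bare SHA-256 because short numeric IDs are easy to brute-force from an unkeyed digest.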
📅 Retention
- • Raw images: 30–90 days
- • Structured data: as required by law
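The raw-image retention window above can be enforced with a periodic purge job. A minimal sketch of the selection logic follows, assuming each image record carries a hypothetical `id` and a timezone-aware `captured_at`. The 90-day default is the upper end of the range above, not a legal recommendation.

```python
from datetime import datetime, timedelta, timezone

RAW_IMAGE_RETENTION = timedelta(days=90)  # upper end of the 30–90 day range


def raw_image_expired(captured_at: datetime, now: datetime,
                      retention: timedelta = RAW_IMAGE_RETENTION) -> bool:
    """True once the image has been held for the full retention period."""
    return now - captured_at >= retention


def select_for_purge(images: list[dict], now: datetime) -> list[str]:
    """Return the ids of raw images that are due for deletion."""
    return [img["id"] for img in images
            if raw_image_expired(img["captured_at"], now)]


now = datetime(2025, 12, 1, tzinfo=timezone.utc)
images = [
    {"id": "a", "captured_at": datetime(2025, 8, 25, tzinfo=timezone.utc)},
    {"id": "b", "captured_at": datetime(2025, 11, 20, tzinfo=timezone.utc)},
]
print(select_for_purge(images, now))  # ['a']
```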
🛡️ Security
- • Encryption in transit/at rest
- • RBAC/MFA; full access logs
- • DPIA before rollout
🚀 Deployment patterns
🏭 Edge/on-prem
- • IDs, bank slips, compliance-sensitive docs
- • Low latency, offline-capable
🖥️ Server/cluster
- • Heavy volume, multi-site
- • HA, encryption, audit
🔄 Hybrid
- • Edge pre-process
- • Central validation
- • Cloud reporting
📋 API (example)
📄 Document Processing Result
Example JSON response from processing a Thai tax invoice with line items, QR code verification, and field validation.
📄 JSON Response Structure
{
  "doc_type": "invoice",
  "confidence": 0.97,
  "fields": {
    "invoice_no": {
      "value": "INV-2025-0173",
      "conf": 0.98,
      "bbox": [412, 96, 220, 28]
    },
    "date_iso": {"value": "2025-08-25", "conf": 0.95},
    "supplier_tax_id": {
      "value": "0105551234567",
      "conf": 0.94,
      "validated": true
    },
    "subtotal": {
      "value": 125000.00,
      "currency": "THB",
      "conf": 0.99
    },
    "vat_amount": {
      "value": 8750.00,
      "conf": 0.99,
      "checked_math": true
    },
    "total": {"value": 133750.00, "conf": 0.99}
  },
  "tables": [{
    "name": "line_items",
    "rows": [{
      "desc": "Bearing 6204",
      "qty": 100,
      "uom": "pcs",
      "price": 1250.00,
      "amount": 125000.00
    }]
  }],
  "barcodes": [{
    "type": "QR",
    "data": "...",
    "verified": true
  }],
  "evidence": {
    "page": 1,
    "crops": {
      "invoice_no": "...",
      "total": "..."
    }
  }
}
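A consumer of this response would typically route uncertain fields to a human reviewer instead of posting them straight to the ERP. The sketch below shows one way to do that with a trimmed copy of the response above. The 0.96 threshold and the function name are hypothetical, and the cut-off should be tuned per field on real data.

```python
import json

REVIEW_THRESHOLD = 0.96  # hypothetical cut-off, not a recommended value


def fields_needing_review(result: dict,
                          threshold: float = REVIEW_THRESHOLD) -> list[str]:
    """List fields with low confidence or an explicitly failed check."""
    flagged = []
    for name, field in result.get("fields", {}).items():
        low_conf = field.get("conf", 0.0) < threshold
        failed_check = (field.get("validated") is False
                        or field.get("checked_math") is False)
        if low_conf or failed_check:
            flagged.append(name)
    return flagged


# Trimmed copy of the example response (fields only).
sample = json.loads("""
{
  "doc_type": "invoice",
  "fields": {
    "invoice_no": {"value": "INV-2025-0173", "conf": 0.98},
    "date_iso": {"value": "2025-08-25", "conf": 0.95},
    "supplier_tax_id": {"value": "0105551234567", "conf": 0.94, "validated": true},
    "vat_amount": {"value": 8750.00, "conf": 0.99, "checked_math": true},
    "total": {"value": 133750.00, "conf": 0.99}
  }
}
""")

print(fields_needing_review(sample))  # ['date_iso', 'supplier_tax_id']
```

Monetary values arrive as JSON numbers here. Passing `parse_float=decimal.Decimal` to `json.loads` keeps them exact if they feed into math checks.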
🚨 Red flags
❌ Marketing Red Flags
- • "100% accuracy OCR"
- • Speed quoted as model-only FPS instead of end-to-end latency
- • No field/table metrics
⚠️ Technical Red Flags
- • Stores full images indefinitely
- • No PDPA controls/logs
- • No validation (math/QR/signature)
- • No business rules
🔗 GaugeSnap integration
🎯 Comprehensive Document Processing
Edge-first OCR/KIE (Thai + English), table extraction, and stamp/signature detection, combined with industrial sensor data for complete audit trails.
🏦 Financial Verification
- • Bank-slip verification: decode EMVCo QR
- • Verify via bank API (OAuth2)
- • Reconcile amount/date/reference; fraud detection
🏭 Industrial Documents
- • Work orders, QA forms, container/yard papers
- • Pair with ANPR/Container ID events
- • Complete chain of custody tracking
📊 APIs & Dashboards
- • REST/MQTT integration
- • Field-level F1, table accuracy, latency metrics
- • PDPA-ready retention & audit logs
🚀 How to start (low-risk)
1. Pick one doc type
Define the field schema (names, regex, units)
2. Provide samples
100–300 samples (scans + mobile; good/bad cases)
3. Get baseline report
CER/WER, field F1, table accuracy, and latency, followed by a pilot with validation rules and PDPA-ready storage
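The field schema from step 1 (names, regex, units) can start as something this small. The `FieldSpec` class, the `INVOICE_SCHEMA` list, and the patterns are illustrative assumptions modelled on the sample invoice, not a fixed format.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    pattern: str                 # regex the normalized value must fully match
    unit: Optional[str] = None   # e.g. a currency code
    required: bool = True


# Hypothetical schema for one document type.
INVOICE_SCHEMA = [
    FieldSpec("invoice_no", r"INV-\d{4}-\d{4}"),
    FieldSpec("date_iso", r"\d{4}-\d{2}-\d{2}"),
    FieldSpec("supplier_tax_id", r"\d{13}"),
    FieldSpec("total", r"\d+\.\d{2}", unit="THB"),
]


def validate(record: dict[str, str], schema: list[FieldSpec]) -> dict[str, str]:
    """Return {field name: problem}; an empty dict means the record conforms."""
    errors = {}
    for spec in schema:
        value = record.get(spec.name)
        if value is None:
            if spec.required:
                errors[spec.name] = "missing"
        elif not re.fullmatch(spec.pattern, value):
            errors[spec.name] = f"does not match {spec.pattern}"
    return errors


record = {
    "invoice_no": "INV-2025-0173",
    "date_iso": "2025-08-25",
    "supplier_tax_id": "0105551234567",
    "total": "133750.00",
}
print(validate(record, INVOICE_SCHEMA))  # {}
```

Writing the schema down first also fixes what the baseline report measures: each `FieldSpec` is one row in the per-field F1 table.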