Document Recognition
(OCR, KIE, e-Forms)

Straight facts

This page explains how to build document recognition that actually works: OCR plus Key Information Extraction (KIE), tables, barcodes/QR, and signatures/stamps, built to be measurable, secure, and audit-ready.

📄 What it is

🔍 OCR

Turn images/PDFs into text

🎯 KIE (Key Information Extraction)

Find fields (invoice no., dates, totals)

📊 Layout & tables

Detect forms, line items, grids

✅ Validation

Rules/dictionaries/check digits/digital signatures

🔗 Events/APIs

Deliver structured JSON to ERP/CRM/DB

🎯 Typical use cases

📋 Invoices/Receipts/e-Tax

Header + line items

🆔 ID/Passport/Driver's license

Face/MRZ/AAMVA PDF417/QR

🏦 Bank slips & EMVCo QR

Decode → verify with bank API → reconcile

🚛 Logistics docs

B/L, packing list, container/chassis forms

👥 HR/Compliance

Forms, certs, NDAs

🔧 Maintenance/QA checklists

Handwriting, stamps
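For the ID/passport use case above, the MRZ carries its own check digits, so OCR misreads can often be caught without any external lookup. The sketch below follows the ICAO 9303 weighting scheme (weights 7, 3, 1 and modulo 10). The function names are illustrative and not part of any product API.

```python
def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value (ICAO 9303)."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum with repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


def mrz_field_ok(field: str, check_char: str) -> bool:
    """True when the printed check digit matches the computed one."""
    return "0" <= check_char <= "9" and mrz_check_digit(field) == int(check_char)


# Document number from the ICAO 9303 specimen MRZ.
print(mrz_check_digit("L898902C3"))  # 6
```

A failed check digit is a reason to route the document to review or re-capture, not proof of fraud: a single OCR confusion (0/O, 1/I) produces the same symptom.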

📷 Capture that actually works

🖨️ Scanners

  • 300–400 dpi, de-skew, duplex
  • PDF/A when needed

📱 Mobiles

  • Flat, no glare/shadow
  • Fill the frame, keystone ≤ 10–15°
  • Use auto-dewarp

📸 Cameras

  • 1/125–1/250 s for hand-held
  • Even light; polarizer for glossy docs

📁 Files

  • Prefer PDF with embedded text
  • Keep originals for audit

Performance

  • Accuracy: 90–98%
  • Multi-language support
  • Fast processing
  • Supports a wide range of file formats

🔄 Pipeline (practical)

🔧 Processing Pipeline

classify doc type → detect layout → OCR → KIE → validation → normalization → events & storage

1-4. Early Processing

  • Classify doc type (invoice/ID/slip/etc.)
  • Detect layout: blocks, tables, key zones
  • OCR: Thai/Latin, numeric, dates
  • KIE: regex + ML (layout-aware)

5-7. Post Processing

  • Validation: math, formats, signatures
  • Normalization: currency, dates, address
  • Events & storage: JSON + evidence
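As an illustration of the normalization step, here is a minimal sketch that converts printed dates and amounts into canonical forms. It assumes day/month/year order and treats any year above 2400 as Buddhist Era (BE = CE + 543). That threshold is a heuristic chosen for this example, and the function names are hypothetical.

```python
import re
from datetime import date
from decimal import Decimal

BE_OFFSET = 543  # Buddhist Era year = Gregorian year + 543


def normalize_date(text: str) -> str:
    """Convert 'DD/MM/YYYY' (BE or CE year) to ISO 8601 'YYYY-MM-DD'."""
    m = re.fullmatch(r"\s*(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})\s*", text)
    if not m:
        raise ValueError(f"unrecognized date: {text!r}")
    day, month, year = (int(g) for g in m.groups())
    # Assumption: years above 2400 are Buddhist Era, everything else is CE.
    if year > 2400:
        year -= BE_OFFSET
    # date() raises ValueError for impossible dates such as 31/02.
    return date(year, month, day).isoformat()


def normalize_amount(text: str) -> Decimal:
    """Strip thousands separators and the baht sign, keep exact decimals."""
    cleaned = text.replace(",", "").replace("฿", "").strip()
    return Decimal(cleaned)


print(normalize_date("25/08/2568"))    # 2025-08-25
print(normalize_amount("125,000.00"))  # 125000.00
```

Using `Decimal` rather than `float` keeps amounts exact, which matters once totals are compared against line items during validation.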

📈 Metrics that matter

📊 Core Metrics

  • OCR: CER/WER (character/word error rate)
  • Fields: Precision/Recall/F1 per field
  • Exact-Match rate for key fields
  • Tables: cell/line-item extraction accuracy
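CER and WER are both edit distance divided by reference length, computed over characters and words respectively. The sketch below uses plain Levenshtein distance. It is a minimal reference implementation for small evaluation sets, not a tuned one.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate against a non-empty reference string."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(ref: str, hyp: str) -> float:
    """Word error rate using whitespace tokenization."""
    ref_words = ref.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


# One substituted character (0 read as O) in a 13-character invoice number.
print(round(cer("INV-2025-0173", "INV-2O25-0173"), 4))  # 0.0769
```

Whitespace tokenization does not suit Thai, which is written without spaces between words. For Thai text, CER is usually the more meaningful number unless a word segmenter is applied to both reference and hypothesis first.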

⚡ Performance & Business

  • Latency & throughput: end-to-end per page
  • Pages/hour processing capacity
  • Business KPIs: reconciliation time saved
  • Exceptions per 1k docs

🛡️ Anti-fraud & authenticity

📱 QR/Barcode Verification

  • Decode EMVCo/2D → verify with issuer/bank API
  • Compare amount/date/reference
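Before calling any bank API, the decoded QR string can be checked locally. EMVCo merchant-presented QR payloads are a flat sequence of tag-length-value fields ending in a CRC field (tag 63), computed with CRC-16/CCITT-FALSE over everything up to and including "6304". The sketch below parses the top level only and verifies the CRC. Nested templates and bank-specific slip formats are out of scope, and the function names are illustrative.

```python
def crc16_ccitt_false(data: bytes) -> int:
    """CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def parse_tlv(payload: str) -> dict[str, str]:
    """Split a payload into {tag: value}; tag and length are 2 digits each."""
    fields: dict[str, str] = {}
    pos = 0
    while pos < len(payload):
        tag = payload[pos:pos + 2]
        length = int(payload[pos + 2:pos + 4])
        value = payload[pos + 4:pos + 4 + length]
        if len(value) != length:
            raise ValueError(f"truncated value for tag {tag}")
        fields[tag] = value
        pos += 4 + length
    return fields


def crc_ok(payload: str) -> bool:
    """Check the trailing CRC field: '63', '04', then 4 hex digits."""
    if len(payload) < 8 or payload[-8:-4] != "6304":
        return False
    expected = payload[-4:].upper()
    actual = f"{crc16_ccitt_false(payload[:-4].encode('utf-8')):04X}"
    return expected == actual
```

A valid CRC only shows the string was read intact. Anyone can generate a well-formed payload, so the issuer/bank API check and the amount/date/reference comparison remain the actual authenticity test.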

📄 PDF signatures

  • Validate X.509 chain
  • Hash verification and revocation check

👁️ Visual tamper cues

  • Font inconsistencies
  • Copy-move seams, low-quality reprints

🔍 Cross-checks

  • Totals vs line items, VAT math
  • Supplier IDs, allow/deny lists
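The arithmetic cross-checks are cheap to automate. The sketch below assumes a 7% VAT rate and a rounding tolerance of 0.01. Both are parameters chosen for this example and should follow the tax rules that apply to the documents being processed.

```python
from decimal import Decimal, ROUND_HALF_UP

VAT_RATE = Decimal("0.07")   # assumption for this example
TOLERANCE = Decimal("0.01")  # allowed rounding difference


def cross_check(subtotal: Decimal, vat: Decimal, total: Decimal,
                line_amounts: list[Decimal],
                vat_rate: Decimal = VAT_RATE) -> list[str]:
    """Return a list of human-readable issues; empty means all checks pass."""
    issues = []
    if abs(sum(line_amounts, Decimal("0")) - subtotal) > TOLERANCE:
        issues.append("line items do not sum to subtotal")
    expected_vat = (subtotal * vat_rate).quantize(Decimal("0.01"), ROUND_HALF_UP)
    if abs(expected_vat - vat) > TOLERANCE:
        issues.append("VAT does not match subtotal x rate")
    if abs(subtotal + vat - total) > TOLERANCE:
        issues.append("total does not equal subtotal + VAT")
    return issues


print(cross_check(Decimal("125000.00"), Decimal("8750.00"),
                  Decimal("133750.00"), [Decimal("125000.00")]))  # []
```

Returning a list of issues rather than a single boolean makes it easier to show reviewers exactly which rule failed and to count exception types per 1k documents.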

🔒 Privacy & compliance (PDPA)

⚠️ Sensitive Data: IDs, faces, and account numbers are sensitive personal data and require special handling.

🔐 Minimize & mask

  • Store only needed fields
  • Keep hashes when possible
  • Redact PII in exports
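A minimal sketch of the masking and hashing ideas above. It uses a keyed hash (HMAC-SHA256) rather than a bare hash, because ID and account numbers have few enough possible values that an unkeyed hash can be reversed by enumeration. Key management is assumed to exist elsewhere, and the function names are illustrative.

```python
import hashlib
import hmac


def mask_value(value: str, visible: int = 4, mask_char: str = "X") -> str:
    """Keep only the last `visible` characters; mask everything else."""
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def keyed_hash(value: str, key: bytes) -> str:
    """Stable pseudonym for matching records without storing the raw value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


print(mask_value("1234567890"))  # XXXXXX7890
```

The masked form is for display and exports. The keyed hash is for joins and duplicate detection. Neither should be treated as a substitute for access control on the original images.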

📅 Retention

  • Raw images: 30–90 days
  • Structured data: as required by law

🛡️ Security

  • Encryption in transit/at rest
  • RBAC/MFA; full access logs
  • DPIA before rollout

🚀 Deployment patterns

🏭 Edge/on-prem

  • IDs, bank slips, compliance-sensitive docs
  • Low latency, offline-capable

🖥️ Server/cluster

  • Heavy volume, multi-site
  • HA, encryption, audit

🔄 Hybrid

  • Edge pre-process
  • Central validation
  • Cloud reporting

📋 API (example)

📄 Document Processing Result

Example JSON response from processing a Thai tax invoice with line items, QR code verification, and field validation.

📄 JSON Response Structure

{
  "doc_type": "invoice",
  "confidence": 0.97,
  "fields": {
    "invoice_no": {
      "value": "INV-2025-0173",
      "conf": 0.98,
      "bbox": [412, 96, 220, 28]
    },
    "date_iso": {"value": "2025-08-25", "conf": 0.95},
    "supplier_tax_id": {
      "value": "0105551234567",
      "conf": 0.94,
      "validated": true
    },
    "subtotal": {
      "value": 125000.00,
      "currency": "THB",
      "conf": 0.99
    },
    "vat_amount": {
      "value": 8750.00,
      "conf": 0.99,
      "checked_math": true
    },
    "total": {"value": 133750.00, "conf": 0.99}
  },
  "tables": [{
    "name": "line_items",
    "rows": [{
      "desc": "Bearing 6204",
      "qty": 100,
      "uom": "pcs",
      "price": 1250.00,
      "amount": 125000.00
    }]
  }],
  "barcodes": [{
    "type": "QR",
    "data": "...",
    "verified": true
  }],
  "evidence": {
    "page": 1,
    "crops": {
      "invoice_no": "...",
      "total": "..."
    }
  }
}
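On the consuming side, per-field confidence can drive human review. The sketch below reads a trimmed copy of the response above and lists fields below a cut-off. The 0.96 threshold and the function name are hypothetical; a real threshold should come from measured field-level precision on your own documents.

```python
import json

REVIEW_THRESHOLD = 0.96  # hypothetical cut-off for this example


def fields_needing_review(result: dict, threshold: float = REVIEW_THRESHOLD) -> list[str]:
    """Names of fields whose confidence is below the threshold, sorted."""
    fields = result.get("fields", {})
    return sorted(name for name, field in fields.items()
                  if field.get("conf", 0.0) < threshold)


# Trimmed copy of the example response (values unchanged).
sample = json.loads("""
{
  "doc_type": "invoice",
  "fields": {
    "invoice_no": {"value": "INV-2025-0173", "conf": 0.98},
    "date_iso": {"value": "2025-08-25", "conf": 0.95},
    "supplier_tax_id": {"value": "0105551234567", "conf": 0.94},
    "subtotal": {"value": 125000.00, "conf": 0.99},
    "vat_amount": {"value": 8750.00, "conf": 0.99},
    "total": {"value": 133750.00, "conf": 0.99}
  }
}
""")

print(fields_needing_review(sample))  # ['date_iso', 'supplier_tax_id']
```

Fields with no `conf` key are treated as confidence 0.0 and always flagged, which is the safer default for an audit-oriented pipeline.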

🚨 Red flags

❌ Marketing Red Flags

  • "100% accuracy OCR"
  • Model-only FPS claims
  • No field/table metrics

⚠️ Technical Red Flags

  • Stores full images indefinitely
  • No PDPA controls/logs
  • No validation (math/QR/signature)
  • No business rules

🔗 GaugeSnap integration

🎯 Comprehensive Document Processing

Edge-first OCR/KIE (Thai+English), table extraction, stamp/signature detection combined with industrial sensor data for complete audit trails.

🏦 Financial Verification

  • Bank-slip verification: decode EMVCo QR
  • Verify via bank API (OAuth2)
  • Reconcile amount/date/reference; fraud detection

🏭 Industrial Documents

  • Work orders, QA forms, container/yard papers
  • Pair with ANPR/Container ID events
  • Complete chain of custody tracking

📊 APIs & Dashboards

  • REST/MQTT integration
  • Field-level F1, table accuracy, latency metrics
  • PDPA-ready retention & audit logs

🚀 How to start (low-risk)

1. Pick one doc type

Define the field schema (names, regex, units)

2. Provide samples

100–300 samples (scans + mobile; good/bad cases)

3. Get baseline report

CER/WER, field F1, table accuracy, and latency, followed by a pilot with validation rules and PDPA-ready storage
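The field-level part of a baseline report can be computed from extracted values and labeled ground truth. The sketch below uses exact-match scoring and counts a wrong value as both a false positive and a false negative. That is one common convention, chosen here as an assumption, and the function name is illustrative.

```python
from collections import defaultdict


def field_metrics(predictions: list[dict], ground_truth: list[dict]) -> dict:
    """Per-field precision, recall, and F1 using exact string match."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground truth must align one-to-one")
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for pred, truth in zip(predictions, ground_truth):
        for name in set(pred) | set(truth):
            p, t = pred.get(name), truth.get(name)
            c = counts[name]
            if p is not None and t is not None and p == t:
                c["tp"] += 1
            else:
                if p is not None:
                    c["fp"] += 1  # wrong or spurious extraction
                if t is not None:
                    c["fn"] += 1  # missed or wrong extraction
    report = {}
    for name, c in counts.items():
        extracted = c["tp"] + c["fp"]
        expected = c["tp"] + c["fn"]
        precision = c["tp"] / extracted if extracted else 0.0
        recall = c["tp"] / expected if expected else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[name] = {"precision": precision, "recall": recall, "f1": f1}
    return report
```

Values should be normalized (dates, amounts, whitespace) before comparison, otherwise formatting differences are counted as extraction errors.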

💡 Principle: Prove with field-level metrics and reconciled totals—before you scale.