YOUR DOCUMENTS. YOUR RULES.
Open-source, local-first, AI-powered document intelligence. Extract, organize, and archive invoices, receipts, and contracts — 100% on your machine.
View on GitHub →

# Clone & install
$ git clone https://github.com/astonysh/DocuClaw.git
$ cd DocuClaw && pip install -e .

# Process a document
$ docuclaw process \
    --entity-id "org_mycompany_01" \
    --country DE \
    --input ./scans/invoice.png
All data stays on YOUR machine. Zero cloud dependency. Zero telemetry. Your privacy is non-negotiable.
Manage personal docs, company invoices, and team files — all in one install. Separate or combine as you wish.
Country-specific parsers snap in like LEGO bricks. Germany, US, China — extend DocuClaw for any locale.
Every document becomes a searchable .md file with structured YAML frontmatter. Human-readable, version-controllable.
Multimodal LLM extracts structured data from scans, photos, and emails. Works with Ollama, OpenAI, or any model.
Designed with GoBD (Germany), GDPR, and audit-trail principles baked in. Enterprise-grade from day one.
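Because each archived document is a plain .md file with frontmatter, the archive can be searched with a few lines of standard-library Python. The sketch below is illustrative only: `read_frontmatter` and `find_documents` are hypothetical helpers, not part of DocuClaw, and the reader handles only flat `key: value` lines rather than full YAML.

```python
from pathlib import Path
from typing import Dict, Iterator


def read_frontmatter(text: str) -> Dict[str, str]:
    """Return the flat key/value pairs between the leading '---' fences.

    Assumption: values are simple scalars on one line; nested YAML is not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep:
            # Drop surrounding whitespace and optional double quotes.
            fields[key.strip()] = value.strip().strip('"')
    return fields


def find_documents(root: Path, **criteria: str) -> Iterator[Path]:
    """Yield every .md file under root whose frontmatter matches all criteria."""
    for path in sorted(root.rglob("*.md")):
        fields = read_frontmatter(path.read_text(encoding="utf-8"))
        if all(fields.get(key) == value for key, value in criteria.items()):
            yield path


if __name__ == "__main__":
    # Hypothetical archive directory; adjust to wherever the .md files live.
    for match in find_documents(Path("./archive"), country="DE", status="pending"):
        print(match)
```

For anything beyond flat fields, a real YAML parser would be the safer choice.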
┌─────────────────────────────────────────────┐
│                  CLI / API                  │
├─────────────────────────────────────────────┤
│                 Core Engine                 │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐  │
│  │  Schema  │  │ Storage  │  │ Registry  │  │
│  │(Pydantic)│  │  Layer   │  │ (Plugin)  │  │
│  └──────────┘  └──────────┘  └───────────┘  │
├─────────────────────────────────────────────┤
│               Parser Plugins                │
│  ┌────────┐  ┌────────┐  ┌──────────────┐   │
│  │ DE 🇩🇪  │  │ US 🇺🇸  │  │  Custom ...  │   │
│  │Invoice │  │Invoice │  │ Your Parser  │   │
│  └────────┘  └────────┘  └──────────────┘   │
├─────────────────────────────────────────────┤
│           Input Adapters (Future)           │
│ 📷 Scanner │ 📧 Email │ 🔗 Webhook │ 🔌 API │
└─────────────────────────────────────────────┘
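To make the Registry (Plugin) box concrete, here is a minimal sketch of how country-specific parsers could be registered and looked up by country code. It is not DocuClaw's actual plugin API: `register_parser`, `get_parser`, the `Parser` signature, and the placeholder parser bodies are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical parser signature: raw document text in, extracted fields out.
Parser = Callable[[str], dict]

# Maps an upper-case country code such as "DE" to its parser.
_PARSERS: Dict[str, Parser] = {}


def register_parser(country: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser under a country code."""
    def decorator(func: Parser) -> Parser:
        _PARSERS[country.upper()] = func
        return func
    return decorator


def get_parser(country: str) -> Parser:
    """Look up the parser for a country code (case-insensitive)."""
    try:
        return _PARSERS[country.upper()]
    except KeyError:
        raise LookupError(f"no parser registered for country {country!r}") from None


@register_parser("DE")
def parse_de_invoice(text: str) -> dict:
    # Placeholder logic; a real parser would extract invoice fields here.
    return {"country": "DE", "document_type": "b2b_invoice", "raw_length": len(text)}


@register_parser("US")
def parse_us_invoice(text: str) -> dict:
    # Placeholder logic, same shape as the DE parser.
    return {"country": "US", "document_type": "b2b_invoice", "raw_length": len(text)}


if __name__ == "__main__":
    parser = get_parser("de")
    print(parser("Rechnung Nr. 42"))
```

With this shape, supporting a new locale means adding one decorated function; the core engine only ever calls `get_parser` with the value passed to `--country`.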
Every document, whether a €10K enterprise invoice or a personal electricity bill, is normalized into a universal Markdown schema with structured YAML frontmatter.
---
id: doc_20260215_a1b2c3d4
entity_id: "org_acme_01"
entity_type: "company"
source_type: physical_mail
country: DE
document_type: b2b_invoice
date_received: "2026-02-15"
sender_name: "AWS EMEA SARL"
amount_total: 125.50
currency: EUR
status: pending
tags: [IT_Infrastructure, Q1_Expense]
---
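The frontmatter above maps naturally onto a typed record. The sketch below models it with a standard-library dataclass and renders it back to the same text. The architecture names Pydantic for the schema layer; a dataclass is swapped in here only to keep the example dependency-free. `DocumentRecord` and `to_frontmatter` are hypothetical names, and the quoting rule is a simplification that does not escape special characters.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DocumentRecord:
    """Field names mirror the example frontmatter; the types are assumptions."""
    id: str
    entity_id: str
    entity_type: str
    source_type: str
    country: str
    document_type: str
    date_received: str
    sender_name: str
    amount_total: float
    currency: str
    status: str = "pending"
    tags: List[str] = field(default_factory=list)


# Fields that the example frontmatter writes with double quotes.
_QUOTED = {"entity_id", "entity_type", "date_received", "sender_name"}


def to_frontmatter(record: DocumentRecord) -> str:
    """Render the record as a '---' delimited frontmatter block."""
    lines = ["---"]
    for key, value in asdict(record).items():  # preserves field order
        if isinstance(value, list):
            rendered = "[" + ", ".join(value) + "]"
        elif isinstance(value, float):
            rendered = f"{value:.2f}"  # two decimals for monetary amounts
        elif key in _QUOTED:
            rendered = f'"{value}"'  # naive quoting, no escaping
        else:
            rendered = str(value)
        lines.append(f"{key}: {rendered}")
    lines.append("---")
    return "\n".join(lines)


if __name__ == "__main__":
    record = DocumentRecord(
        id="doc_20260215_a1b2c3d4", entity_id="org_acme_01", entity_type="company",
        source_type="physical_mail", country="DE", document_type="b2b_invoice",
        date_received="2026-02-15", sender_name="AWS EMEA SARL", amount_total=125.50,
        currency="EUR", tags=["IT_Infrastructure", "Q1_Expense"],
    )
    print(to_frontmatter(record))
```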