Processing Pipeline

Follow a document's path from browser validation through parsing and structured extraction to reviewable accounting facts.

The processing pipeline keeps uploads responsive while preventing uncontrolled fan-out of parser and AI work.

Pipeline

  1. Browser validates file type and size.
  2. Browser hashes files with bounded concurrency (see the hashing sketch after this list).
  3. Convex preflight detects duplicate content and same-name conflicts.
  4. Convex creates an upload session and issues a signed storage upload URL (steps 3–4 are sketched after this list).
  5. Browser uploads bytes and finalizes the upload.
  6. Convex stores accepted file metadata and creates a pending parse job.
  7. The dispatcher claims due work under organization-level concurrency limits.
  8. The parser service extracts text and page metadata.
  9. Convex creates a structured extraction job.
  10. AI extraction produces structured accounting facts.
  11. The document becomes reviewable in the Data Room.
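
Step 2 is straightforward to sketch with the Web Crypto API. This is a minimal illustration; the pool size and function name are illustrative, not part of the app's code:

```ts
// Hash files with SHA-256, keeping at most `limit` files in flight at once
// so large batches don't read every file into memory simultaneously.
async function hashFiles(files: File[], limit = 4): Promise<Map<File, string>> {
  const results = new Map<File, string>();
  const queue = [...files];
  const worker = async () => {
    for (let file = queue.shift(); file; file = queue.shift()) {
      const digest = await crypto.subtle.digest("SHA-256", await file.arrayBuffer());
      const hex = Array.from(new Uint8Array(digest), (b) =>
        b.toString(16).padStart(2, "0"),
      ).join("");
      results.set(file, hex);
    }
  };
  await Promise.all(Array.from({ length: Math.min(limit, files.length) }, worker));
  return results;
}
```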

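Steps 3 and 4 combine naturally into one Convex mutation. In this sketch, the `files` and `uploadSessions` tables, their fields, and the `by_org_hash` index are assumptions about the schema; `ctx.storage.generateUploadUrl()` is Convex's built-in way to mint a signed upload URL:

```ts
import { mutation } from "./_generated/server";
import { v } from "convex/values";

export const preflightAndStartUpload = mutation({
  args: { orgId: v.id("organizations"), name: v.string(), sha256: v.string() },
  handler: async (ctx, { orgId, name, sha256 }) => {
    // Preflight: flag content the organization already stores.
    const duplicate = await ctx.db
      .query("files")
      .withIndex("by_org_hash", (q) => q.eq("orgId", orgId).eq("sha256", sha256))
      .first();
    if (duplicate) return { status: "duplicate" as const, fileId: duplicate._id };
    // A similar indexed lookup on `name` would detect same-name conflicts.

    // Record the session, then hand the browser a signed storage upload URL.
    const sessionId = await ctx.db.insert("uploadSessions", {
      orgId,
      name,
      sha256,
      state: "pending",
    });
    const uploadUrl = await ctx.storage.generateUploadUrl();
    return { status: "ok" as const, sessionId, uploadUrl };
  },
});
```
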
Queue policy

| Job | Org concurrency | Notes |
| --- | --- | --- |
| Parse | 2 | Transient failures retry with backoff. |
| Structured extraction | 1 | AI work is bounded per organization. |
| LLM chat jobs | 2 | Interactive work has separate capacity. |
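
The table maps naturally onto a limit lookup the dispatcher consults before claiming work. A sketch under assumed names: the `jobs` table, its indexes, and `internal.workers.run` are hypothetical stand-ins, while `internalMutation` and `ctx.scheduler.runAfter` are standard Convex:

```ts
import { internalMutation } from "./_generated/server";
import { internal } from "./_generated/api";

// Per-organization caps from the queue policy table above.
const ORG_CONCURRENCY: Record<string, number> = {
  parse: 2,
  structuredExtraction: 1,
  llmChat: 2,
};

export const claimDueJobs = internalMutation({
  args: {},
  handler: async (ctx) => {
    const pending = await ctx.db
      .query("jobs")
      .withIndex("by_state", (q) => q.eq("state", "pending"))
      .take(50);
    for (const job of pending) {
      // Count this organization's running jobs of the same type.
      const running = await ctx.db
        .query("jobs")
        .withIndex("by_org_type_state", (q) =>
          q.eq("orgId", job.orgId).eq("type", job.type).eq("state", "running"),
        )
        .collect();
      if (running.length >= (ORG_CONCURRENCY[job.type] ?? 1)) continue;
      await ctx.db.patch(job._id, { state: "running", claimedAt: Date.now() });
      await ctx.scheduler.runAfter(0, internal.workers.run, { jobId: job._id });
    }
  },
});
```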

Manual retries and manual extraction requests should enqueue processing jobs instead of scheduling workers directly.
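
In that spirit, a manual retry only flips the job back to pending and lets the dispatcher reclaim it under the same per-organization limits. A sketch with the same assumed `jobs` schema:

```ts
import { mutation } from "./_generated/server";
import { v } from "convex/values";

export const retryJob = mutation({
  args: { jobId: v.id("jobs") },
  handler: async (ctx, { jobId }) => {
    const job = await ctx.db.get(jobId);
    if (!job || job.state !== "failed") return;
    // No direct ctx.scheduler call to a worker here: re-enqueueing keeps
    // the dispatcher's concurrency limits in force.
    await ctx.db.patch(jobId, { state: "pending", attempts: (job.attempts ?? 0) + 1 });
  },
});
```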