# Processing Pipeline
Follow a document's path from browser-side validation through parsing and structured extraction to reviewable accounting facts. The pipeline keeps uploads responsive while preventing uncontrolled parser and AI fan-out.
## Pipeline
1. Browser validates file type and size.
2. Browser hashes files with bounded concurrency.
3. Convex preflight detects duplicate content and same-name conflicts.
4. Convex creates an upload session and issues a signed storage upload URL.
5. Browser uploads bytes and finalizes the upload.
6. Convex stores accepted file metadata and creates a pending parse job.
7. The dispatcher claims due work under organization-level concurrency limits.
8. Parser service extracts text and page metadata.
9. Convex creates a structured extraction job.
10. AI extraction produces structured accounting facts.
11. The document becomes reviewable in the Data Room.
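The browser-side hashing step above relies on a bounded-concurrency runner so that many selected files do not all hash at once. A minimal sketch of such a runner, assuming a hypothetical `mapWithConcurrency` helper (not part of any real client library):

```typescript
// Run async tasks over a list with at most `limit` in flight at a time,
// preserving result order. Illustrative only; the real upload code may differ.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly takes the next unclaimed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  );
  return results;
}
```

The same pattern applies whether `fn` computes a SHA-256 digest or performs the upload itself; only the task body changes.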
## Queue policy
| Job | Per-org concurrency | Notes |
|---|---|---|
| Parse | 2 | Transient failures retry with exponential backoff. |
| Structured extraction | 1 | AI extraction is serialized per organization. |
| LLM chat jobs | 2 | Interactive chat has its own capacity, separate from document jobs. |
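The policy above can be sketched as a claim check plus a backoff schedule. The `ORG_LIMITS` table, `canClaim`, and `retryDelayMs` names below are illustrative, not the real dispatcher API:

```typescript
// Per-organization concurrency limits from the queue policy table.
type JobKind = "parse" | "structuredExtraction" | "llmChat";

const ORG_LIMITS: Record<JobKind, number> = {
  parse: 2,
  structuredExtraction: 1,
  llmChat: 2,
};

// The dispatcher claims a due job only if the organization is under its
// limit for that job kind.
function canClaim(
  kind: JobKind,
  runningForOrg: Record<JobKind, number>,
): boolean {
  return (runningForOrg[kind] ?? 0) < ORG_LIMITS[kind];
}

// Exponential backoff with a cap, for transient parse failures.
function retryDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Keeping chat capacity separate from document capacity means a burst of parse work cannot starve interactive requests, and vice versa.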
Manual retries and manual extraction requests should enqueue processing jobs rather than scheduling workers directly, so they pass through the same per-organization concurrency limits as automatic work.
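In code, "enqueue instead of schedule" just means inserting a pending job row that the dispatcher will claim later. A minimal in-memory sketch, where the `Job` shape and `enqueue` helper are illustrative rather than the real schema:

```typescript
// A pending job record; the dispatcher picks these up when capacity allows.
interface Job {
  orgId: string;
  kind: string;
  documentId: string;
  status: "pending";
}

const queue: Job[] = [];

// Manual retries call this instead of invoking a parser or AI worker
// directly, so per-org limits still apply.
function enqueue(orgId: string, kind: string, documentId: string): Job {
  const job: Job = { orgId, kind, documentId, status: "pending" };
  queue.push(job);
  return job;
}
```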