Parser Service
Understand the isolated parser service that extracts text and page metadata from source files.
The parser service is a separate, authenticated service that handles work that cannot run directly inside Convex: document parsing and provider-specific LLM proxy paths.
It is stateless and protected by a shared secret.
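As an illustration of the shared-secret protection, the following is a minimal sketch of how a request could be checked. The header name `x-parser-secret` and the function `isAuthorized` are assumptions made for this example, not the service's actual names. The sketch hashes both values before comparing so that the comparison runs in constant time regardless of input length.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical header name; the real service may use a different one.
export const SECRET_HEADER = "x-parser-secret";

// Returns true only when the presented secret matches the expected one.
// Both values are hashed first so the buffers always have equal length,
// which timingSafeEqual requires.
export function isAuthorized(
  presented: string | undefined,
  expected: string,
): boolean {
  if (!presented || !expected) return false;
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}
```

Because the service is stateless, a check like this would run on every request rather than relying on a session.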
Responsibilities
| Responsibility | Detail |
|---|---|
| Download source files | Only from configured allowed hosts. |
| Enforce bounds | Max file bytes, max pages, OCR settings, and parser concurrency. |
| Extract text | Parse PDF and supported document formats into text and page metadata. |
| OCR when enabled | Use selective OCR for scanned pages when configured. |
| Return structured errors | Preserve enough failure detail for retry and user messaging. |
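The first two responsibilities in the table can be sketched as simple guard functions that return a structured result. Everything named here (`ParserLimits`, `checkSourceUrl`, `checkBounds`, and the error codes) is hypothetical and only illustrates the shape such checks might take. The host check assumes exact hostname matching against lowercase entries.

```typescript
// Hypothetical configuration shape; field names are illustrative only.
interface ParserLimits {
  allowedHosts: string[]; // lowercase hostnames, matched exactly
  maxFileBytes: number;
  maxPages: number;
}

// A structured result, so callers can surface a code and a message.
type Check = { ok: true } | { ok: false; code: string; message: string };

// Reject any source URL whose host is not explicitly allowed.
export function checkSourceUrl(rawUrl: string, limits: ParserLimits): Check {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return { ok: false, code: "invalid_url", message: "Source URL could not be parsed." };
  }
  // URL.hostname is already lowercased by the URL parser.
  if (!limits.allowedHosts.includes(url.hostname)) {
    return { ok: false, code: "host_not_allowed", message: `Host ${url.hostname} is not allowed.` };
  }
  return { ok: true };
}

// Enforce size and page bounds before doing any expensive parsing.
export function checkBounds(fileBytes: number, pageCount: number, limits: ParserLimits): Check {
  if (fileBytes > limits.maxFileBytes) {
    return { ok: false, code: "file_too_large", message: "File exceeds the byte limit." };
  }
  if (pageCount > limits.maxPages) {
    return { ok: false, code: "too_many_pages", message: "Document exceeds the page limit." };
  }
  return { ok: true };
}
```

Returning a code alongside a message keeps enough detail for both retry decisions and user-facing text.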
Failure handling
Transient infrastructure failures can be retried through the processing queue. Invalid files, unsupported types, missing documents, and access errors should fail fast, since a retry would hit the same error.
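The split between retryable and fail-fast failures could be expressed as a small classification step. This is a sketch under assumed names: the error codes, `ParserError`, `toParserError`, and `nextAction` are all hypothetical and do not reflect the service's actual error vocabulary.

```typescript
// Hypothetical error codes covering the categories described above.
type ParserErrorCode =
  | "upstream_timeout"
  | "service_unavailable"
  | "invalid_file"
  | "unsupported_type"
  | "document_not_found"
  | "access_denied";

interface ParserError {
  code: ParserErrorCode;
  message: string;
  retryable: boolean;
}

// Only transient infrastructure failures are worth retrying.
const TRANSIENT = new Set<ParserErrorCode>([
  "upstream_timeout",
  "service_unavailable",
]);

// Build a structured error with the retry decision attached.
export function toParserError(code: ParserErrorCode, message: string): ParserError {
  return { code, message, retryable: TRANSIENT.has(code) };
}

// Decide what the processing queue should do with a failed job.
export function nextAction(err: ParserError): "retry" | "fail" {
  return err.retryable ? "retry" : "fail";
}
```

Keeping the retry decision on the error itself means the queue does not need to know the full list of codes.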