Extract Document
Extract structured data from a document using a configured extractor.
Upload a document (PDF, image, or other supported format) along with an extractor ID,
and receive the extracted data matching your extractor's schema.
**File Input Options:**
- **Binary upload**: Directly upload the file (max 4.5MB due to Vercel limits)
- **URL**: Pass a publicly accessible URL to the document
- **Base64**: Pass base64-encoded file content
- **Data URL**: Pass a data URL (e.g., `data:application/pdf;base64,...`)
For files larger than 4.5MB, use a URL or base64 input.
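As an illustration of the input options above, the following sketch checks whether a file fits the binary upload limit and, if not, encodes it as a data URL. The helper names are hypothetical, and the byte value assumes that "4.5MB" means 4.5 × 1024 × 1024 bytes, which the documentation does not specify.

```python
import base64

# Assumption: "4.5MB" is interpreted as 4.5 * 1024 * 1024 bytes.
MAX_BINARY_UPLOAD_BYTES = int(4.5 * 1024 * 1024)


def fits_binary_upload(content: bytes) -> bool:
    """Return True if the content is within the binary upload limit."""
    return len(content) <= MAX_BINARY_UPLOAD_BYTES


def to_data_url(content: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data URL, e.g. data:application/pdf;base64,..."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

A caller might send the raw bytes when `fits_binary_upload` returns True and fall back to `to_data_url` (or a hosted URL) otherwise.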
Authentication
Authorization: Bearer
Bearer authentication of the form `Bearer <token>`, where `<token>` is your auth token.
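A minimal sketch of building the header described above. The function name is hypothetical; only the `Authorization: Bearer <token>` form comes from this page.

```python
def auth_headers(token: str) -> dict:
    """Build the Authorization header for a bearer token."""
    return {"Authorization": f"Bearer {token}"}
```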
Request
This endpoint expects a multipart form with the fields below. The `file` field can be a binary upload or a string.
extractorId
The unique identifier of the extractor to use for extraction.
You can find this in the Automat dashboard or when creating an extractor via the API.
file
The document to extract data from. Can be:
- A binary file upload
- A URL string (http:// or https://)
- A base64-encoded string
- A data URL string
mimeType
The MIME type of the file. Required when file is a string (URL, base64, or data URL).
Examples: application/pdf, image/png, image/jpeg
filename
Optional filename for the document. If not provided, it is inferred from
the File object name or the URL path.
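The request fields above could be assembled as in the sketch below, which also enforces the rule that `mimeType` is required when `file` is a string. The function name is hypothetical, and the endpoint URL is not shown on this page, so the sketch stops short of sending the request.

```python
from typing import Optional, Union


def build_form_fields(
    extractor_id: str,
    file: Union[bytes, str],
    mime_type: Optional[str] = None,
    filename: Optional[str] = None,
) -> dict:
    """Collect the multipart form fields for an extraction request.

    `file` is either raw bytes (binary upload) or a string
    (URL, base64 content, or data URL).
    """
    # mimeType is required whenever the file is passed as a string.
    if isinstance(file, str) and not mime_type:
        raise ValueError("mimeType is required when file is a string")

    fields = {"extractorId": extractor_id, "file": file}
    if mime_type:
        fields["mimeType"] = mime_type
    if filename:
        fields["filename"] = filename
    return fields
```

The resulting dictionary can then be handed to the multipart support of whichever HTTP client you use, together with the bearer token header.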
Response
This endpoint returns an object.
success
Whether the extraction was successful.
data
The extracted data matching your extractor's schema.
The structure depends on how you configured your extractor.
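A minimal sketch of reading the response object described above. It assumes the body is JSON with the `success` and `data` fields; the shape of a failed response is not documented here, so the sketch simply raises when `success` is not true. The function name is hypothetical.

```python
import json


def parse_extraction_response(body: str) -> dict:
    """Return the extracted data from a JSON response body.

    Assumption: a falsy or missing `success` field means the
    extraction failed; the error format is not documented here.
    """
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError("extraction was not successful")
    return payload["data"]
```

The keys inside the returned dictionary depend entirely on your extractor's schema.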