Extract Document

Extract structured data from a document using a configured extractor. Upload a document (PDF, image, or other supported format) along with an extractor ID, and receive the extracted data matching your extractor's schema.

**File Input Options:**

- **Binary upload**: Directly upload the file (max 4.5MB due to Vercel limits)
- **URL**: Pass a publicly accessible URL to the document
- **Base64**: Pass base64-encoded file content
- **Data URL**: Pass a data URL (e.g., `data:application/pdf;base64,...`)

For files larger than 4.5MB, use a URL or base64 input.
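The size check and the data URL format described above can be sketched in a few lines of Python. This is a minimal sketch, not part of the API: the helper names are illustrative, and it assumes "4.5MB" means 4.5 × 1024 × 1024 bytes, which the documentation does not specify.

```python
import base64

# Assumption: "4.5MB" is read as 4.5 * 1024 * 1024 bytes; the docs do not
# say whether the limit uses decimal or binary megabytes.
MAX_BINARY_UPLOAD_BYTES = int(4.5 * 1024 * 1024)


def fits_binary_upload(content: bytes) -> bool:
    """Return True if the content is within the assumed binary upload limit."""
    return len(content) <= MAX_BINARY_UPLOAD_BYTES


def to_data_url(content: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data URL, e.g. data:application/pdf;base64,..."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

A caller might use `fits_binary_upload` to decide whether to send the file directly or fall back to one of the string inputs.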

Authentication

Authorization Bearer

Bearer authentication of the form Bearer <token>, where token is your auth token.

Request

This endpoint expects a multipart form containing a file.
extractorId string Required
The unique identifier of the extractor to use for extraction. You can find this in the Automat dashboard or when creating an extractor via the API.
file file Required
The document to extract data from. Can be:

- A binary file upload
- A URL string (http:// or https://)
- A base64-encoded string
- A data URL string
mimeType string Optional

The MIME type of the file. Required when file is a string (URL, base64, or data URL). Examples: application/pdf, image/png, image/jpeg

filename string Optional
Filename for the document. If not provided, it is inferred from the File object name or URL path.
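As an illustration of how the fields above fit together, the following Python sketch builds (without sending) a multipart request that passes `file` as a URL string. Several things here are assumptions and not taken from this page: the endpoint URL is a hypothetical placeholder, the POST method is assumed, and `build_extract_request` is an illustrative helper name. Only the standard library is used.

```python
import mimetypes
import urllib.request
import uuid

# Hypothetical placeholder: the real endpoint URL is not given in this section.
EXTRACT_URL = "https://api.example.com/extract"


def build_extract_request(token: str, extractor_id: str, file_url: str) -> urllib.request.Request:
    """Build (but do not send) a multipart request that passes the file as a URL string."""
    # mimeType is required when `file` is a string, so guess it from the URL path.
    mime_type, _ = mimetypes.guess_type(file_url)
    if mime_type is None:
        raise ValueError(f"cannot infer MIME type for {file_url!r}")

    fields = {"extractorId": extractor_id, "file": file_url, "mimeType": mime_type}

    # Encode the text fields as multipart/form-data by hand.
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")

    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        method="POST",  # Assumption: the HTTP method is not stated in this section.
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` once the placeholder URL is replaced with the real endpoint. A binary upload would additionally need a file part with its own `filename` and `Content-Type`, which this sketch leaves out.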

Response

This endpoint returns an object.
success boolean
Whether the extraction was successful
data map from strings to any, or null
The extracted data matching your extractor's schema. The structure depends on how you configured your extractor.
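A response with this shape could be handled along the following lines. This is a sketch under assumptions: the response body is assumed to be JSON, `parse_extraction_response` is an illustrative name, and treating a null `data` on a successful response as an error is a design choice made here, not something the API prescribes.

```python
import json
from typing import Any, Dict


def parse_extraction_response(body: str) -> Dict[str, Any]:
    """Return the extracted data, or raise if the extraction did not succeed."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError("extraction was not successful")
    data = payload.get("data")
    if data is None:
        # Design choice for this sketch: a null `data` is treated as an error.
        raise RuntimeError("extraction succeeded but returned no data")
    return data
```

The keys inside `data` depend entirely on the extractor's schema, so callers should validate them against their own configuration.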

Errors