Extract Document | Automat

Extract structured data from a document using a configured extractor.

Upload a document (PDF, image, or other supported format) along with an extractor ID, and receive the extracted data matching your extractor’s schema.

File Input Options:

Binary upload: Directly upload the file (max 4.5MB due to Vercel limits)
URL: Pass a publicly accessible URL to the document
Base64: Pass base64-encoded file content
Data URL: Pass a data URL (e.g., data:application/pdf;base64,...)

For files larger than 4.5MB, use a URL or base64 input.
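As a rough sketch of the options above, the helpers below pick an input mode from the file size and build the base64 or data URL string. The function names are illustrative and not part of the Automat API. The byte threshold assumes "4.5MB" means 4.5 × 1024 × 1024 bytes, since the page does not say whether the unit is decimal or binary.

```python
import base64

# Assumption: "4.5MB" is read as 4.5 * 1024 * 1024 bytes; the docs do not specify.
MAX_BINARY_UPLOAD_BYTES = int(4.5 * 1024 * 1024)


def choose_input_mode(size_in_bytes: int) -> str:
    """Return "binary" for files within the upload limit, else "base64"."""
    return "binary" if size_in_bytes <= MAX_BINARY_UPLOAD_BYTES else "base64"


def to_base64(raw: bytes) -> str:
    """Encode raw file bytes as a plain base64 string."""
    return base64.b64encode(raw).decode("ascii")


def to_data_url(raw: bytes, mime_type: str) -> str:
    """Wrap raw file bytes in a data URL such as data:application/pdf;base64,..."""
    return f"data:{mime_type};base64,{to_base64(raw)}"
```

A publicly accessible URL is the other route for large files; it avoids encoding the document on the client at all.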


Authentication

Authorization: Bearer

Bearer authentication of the form Bearer <token>, where token is your auth token.
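A minimal sketch of building that header in Python follows. The helper name and the AUTOMAT_API_KEY environment variable are hypothetical choices for this example, not names defined by Automat.

```python
import os


def build_auth_headers(token: str) -> dict:
    """Return the Authorization header for a bearer token."""
    if not token or not token.strip():
        raise ValueError("an auth token is required")
    return {"Authorization": f"Bearer {token.strip()}"}


# AUTOMAT_API_KEY is a hypothetical variable name chosen for this example;
# the "<token>" placeholder is used when it is not set.
token = os.environ.get("AUTOMAT_API_KEY") or "<token>"
headers = build_auth_headers(token)
```

Reading the token from the environment keeps it out of source control.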

Request

This endpoint expects a multipart form containing a file.

extractorId (string, required)

The unique identifier of the extractor to use for extraction. You can find this in the Automat dashboard or when creating an extractor via the API.

file (file, required)

The document to extract data from. Can be:

A binary file upload
A URL string (http:// or https://)
A base64-encoded string
A data URL string


mimeType (string, optional)

The MIME type of the file. Required when file is a string (URL, base64, or data URL). Examples: application/pdf, image/png, image/jpeg
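Because mimeType becomes mandatory for string inputs, a client may want to fill it in before sending. The sketch below guesses it from a filename hint using Python's standard mimetypes module and assembles the form fields. The helper names and the name_hint parameter are illustrative assumptions, not part of the Automat API.

```python
import mimetypes
from typing import Dict, Optional


def resolve_mime_type(name_hint: str, mime_type: Optional[str] = None) -> str:
    """Return the explicit MIME type, or guess one from a filename hint."""
    if mime_type:
        return mime_type
    guessed, _encoding = mimetypes.guess_type(name_hint)
    if guessed is None:
        raise ValueError(
            f"cannot infer a MIME type from {name_hint!r}; pass mimeType explicitly"
        )
    return guessed


def build_string_input_fields(
    extractor_id: str,
    file_value: str,
    name_hint: str,
    mime_type: Optional[str] = None,
) -> Dict[str, str]:
    """Build form fields for a URL, base64, or data URL input."""
    return {
        "extractorId": extractor_id,
        "file": file_value,
        "mimeType": resolve_mime_type(name_hint, mime_type),
    }
```

With the requests library, passing each value as a `(None, value)` tuple in `files=` sends it as a plain multipart field. Whether the endpoint expects string inputs in exactly that shape is not stated on this page, so treat it as an assumption to verify.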

filename (string, optional)

Optional filename for the document. If not provided, it is inferred from the File object name or the URL path.
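To preview what a URL path would yield as a filename, a client-side approximation might look like the sketch below. The server's actual inference rules are not documented here, so this mirrors the idea and not the implementation. The function name is hypothetical.

```python
import posixpath
from typing import Optional
from urllib.parse import unquote, urlparse


def infer_filename_from_url(url: str) -> Optional[str]:
    """Return the last path segment of a URL, decoded, or None if there is none."""
    path = urlparse(url).path
    # Take the last segment first, then decode percent-escapes such as %20.
    name = unquote(posixpath.basename(path))
    return name or None
```

If the result is None or not meaningful (for example, a URL that ends in an opaque ID), passing filename explicitly is the safer choice.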

Response

This endpoint returns an object.

success (boolean)

Whether the extraction was successful

data (map from strings to any, or null)

The extracted data matching your extractor's schema. The structure depends on how you configured your extractor.
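Since data can be null and success can be false, callers usually check both before reading fields. The sketch below shows one way to do that. The exception class and function name are illustrative, and mapping a null data value to an empty dictionary is a design choice of this example, not documented API behavior.

```python
from typing import Any, Dict


class ExtractionFailed(Exception):
    """Raised when the response body reports success = false."""


def unwrap_extraction(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return the extracted data from a parsed response body."""
    if not body.get("success"):
        raise ExtractionFailed("the response reported success = false")
    data = body.get("data")
    # data may be null; this sketch normalizes it to an empty dict.
    return data if data is not None else {}
```

The keys inside the returned dictionary depend entirely on the extractor's schema, so field access such as `result.get("invoice_number")` is safer than indexing.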

import requests

url = "https://inference.runautomat.com/api/extract"

# Form fields sent alongside the file. mimeType and filename are optional
# for binary uploads, so they are omitted here.
payload = {"extractorId": "ext_abc123"}
headers = {"Authorization": "Bearer <apiKey>"}

# Open the document in binary mode so requests sends it as a multipart file part.
with open("<file1>", "rb") as document:
    files = {"file": document}
    response = requests.post(url, data=payload, files=files, headers=headers)

print(response.json())

{
  "success": true,
  "data": {
    "invoice_number": "INV-2024-001",
    "date": "2024-01-15",
    "total_amount": 1250,
    "vendor_name": "Acme Corp",
    "line_items": [
      {
        "description": "Widget A",
        "quantity": 10,
        "unit_price": 100
      },
      {
        "description": "Widget B",
        "quantity": 5,
        "unit_price": 50
      }
    ]
  }
}
