Quickstart

Extract data from your first document in under 5 minutes

Prerequisites

Before you begin, you’ll need:

Step 1: Create an Extractor

1. Go to Extractors

Navigate to your project in the Automat Dashboard and click Create Extractor.

2. Define Your Schema

Use the visual editor to define what fields you want to extract. For example, for an invoice:

{
  "invoice_number": "string",
  "date": "string",
  "total_amount": "number",
  "vendor_name": "string",
  "line_items": [{
    "description": "string",
    "quantity": "number",
    "unit_price": "number"
  }]
}

3. Test in Playground

Upload a sample document to test your extractor before using the API.

4. Copy Extractor ID

Once you are satisfied with the results, copy the extractor ID (e.g., ext_abc123) for use with the API.
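The schema from step 2 can also serve as a lightweight check on whatever the extractor returns. The sketch below is illustrative only: `INVOICE_SCHEMA` and `find_schema_mismatches` are hypothetical names, not part of the Automat SDK, and it assumes the schema uses only the "string" and "number" leaf types plus nested objects and single-element lists, as in the invoice example.

```python
from typing import Any, List

# Hypothetical local copy of the invoice schema from the visual editor example.
INVOICE_SCHEMA = {
    "invoice_number": "string",
    "date": "string",
    "total_amount": "number",
    "vendor_name": "string",
    "line_items": [{
        "description": "string",
        "quantity": "number",
        "unit_price": "number",
    }],
}


def find_schema_mismatches(schema: Any, value: Any, path: str = "$") -> List[str]:
    """Return human-readable mismatches between `value` and `schema`."""
    if schema == "string":
        return [] if isinstance(value, str) else [f"{path}: expected string"]
    if schema == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        return [] if ok else [f"{path}: expected number"]
    if isinstance(schema, list):
        if not isinstance(value, list):
            return [f"{path}: expected list"]
        problems: List[str] = []
        for i, item in enumerate(value):
            # The single schema element describes every item in the list.
            problems += find_schema_mismatches(schema[0], item, f"{path}[{i}]")
        return problems
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        problems = []
        for key, sub_schema in schema.items():
            if key not in value:
                problems.append(f"{path}.{key}: missing")
            else:
                problems += find_schema_mismatches(sub_schema, value[key], f"{path}.{key}")
        return problems
    return [f"{path}: unsupported schema entry {schema!r}"]
```

An empty list means the data has the shape the schema describes. Anything else points at the field to inspect in the Playground.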

Step 2: Make Your First API Request

Using cURL

curl -X POST https://inference.runautomat.com/api/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "extractorId=YOUR_EXTRACTOR_ID" \
  -F "file=@path/to/your/document.pdf"
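For readers who want to see what the cURL call puts on the wire, here is a rough standard-library sketch that builds an equivalent multipart/form-data request without sending it. The endpoint, header, and field names come from the cURL example. The helper name `build_extract_request` and the default `application/pdf` part type are assumptions for illustration. In practice the SDKs below are the simpler route.

```python
import urllib.request
import uuid

EXTRACT_URL = "https://inference.runautomat.com/api/extract"


def build_extract_request(api_key, extractor_id, filename, file_bytes,
                          content_type="application/pdf"):
    """Build (but do not send) a multipart POST mirroring the cURL example."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        # Plain form field: extractorId
        delimiter,
        b'Content-Disposition: form-data; name="extractorId"',
        b"",
        extractor_id.encode("utf-8"),
        # File field: the document bytes
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode("utf-8"),
        f"Content-Type: {content_type}".encode("ascii"),
        b"",
        file_bytes,
        # Closing delimiter
        delimiter + b"--",
        b"",
    ]
    body = b"\r\n".join(lines)
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. The sketch stops before that step so the request can be inspected first.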

Using the TypeScript SDK

npm install @automat/sdk

import { AutomatClient } from '@automat/sdk';
import fs from 'fs';

const client = new AutomatClient({
  apiKey: process.env.AUTOMAT_API_KEY,
});

async function extractDocument() {
  const result = await client.extract({
    extractorId: 'ext_abc123',
    file: fs.createReadStream('invoice.pdf'),
  });

  if (result.success) {
    console.log('Extracted data:', result.data);
  } else {
    console.error('Extraction failed:', result.error);
  }
}

extractDocument();

Using the Python SDK

pip install automat-sdk

from automat import AutomatClient
import os

client = AutomatClient(api_key=os.environ["AUTOMAT_API_KEY"])

with open("invoice.pdf", "rb") as f:
    result = client.extract(
        extractor_id="ext_abc123",
        file=f,
    )

if result.success:
    print("Extracted data:", result.data)
else:
    print("Extraction failed:", result.error)

Step 3: Handle the Response

A successful response looks like:

{
  "success": true,
  "data": {
    "invoice_number": "INV-2024-001",
    "date": "2024-01-15",
    "total_amount": 1250.00,
    "vendor_name": "Acme Corp",
    "line_items": [
      {
        "description": "Widget A",
        "quantity": 10,
        "unit_price": 100.00
      },
      {
        "description": "Widget B",
        "quantity": 5,
        "unit_price": 50.00
      }
    ]
  }
}

The data field contains the extracted information matching your extractor’s schema.
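As one example of consuming the data field, the sketch below parses the sample response and cross-checks the line items against total_amount. The helper names are hypothetical, and the check assumes the total is simply the sum of quantity times unit_price with no tax, discount, or shipping, which holds for the sample above but not for every real invoice.

```python
import json
import math

# The sample response from this section, as it would arrive over HTTP.
SAMPLE_RESPONSE = """
{
  "success": true,
  "data": {
    "invoice_number": "INV-2024-001",
    "date": "2024-01-15",
    "total_amount": 1250.00,
    "vendor_name": "Acme Corp",
    "line_items": [
      {"description": "Widget A", "quantity": 10, "unit_price": 100.00},
      {"description": "Widget B", "quantity": 5, "unit_price": 50.00}
    ]
  }
}
"""


def line_items_total(data: dict) -> float:
    """Sum quantity * unit_price over all extracted line items."""
    return sum(item["quantity"] * item["unit_price"] for item in data["line_items"])


def totals_match(data: dict, tolerance: float = 0.01) -> bool:
    """True when the line items add up to total_amount within a small tolerance."""
    return math.isclose(line_items_total(data), data["total_amount"], abs_tol=tolerance)


response = json.loads(SAMPLE_RESPONSE)
if response["success"]:
    invoice = response["data"]
    consistent = totals_match(invoice)
```

A mismatch does not necessarily mean the extraction is wrong, but it is a cheap signal that a document deserves a second look.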

Alternative Input Methods

Extract from URL

If your document is publicly accessible, you can pass a URL instead of uploading:

curl -X POST https://inference.runautomat.com/api/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "extractorId=ext_abc123" \
  -F "file=https://example.com/invoice.pdf" \
  -F "mimeType=application/pdf"

When using a URL, you must also provide the mimeType field.
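Because mimeType is required for URL input, it can help to derive it from the URL instead of hard-coding it. The following is a minimal sketch using Python's standard mimetypes module. `guess_mime_type` and `build_url_fields` are hypothetical helpers, and the field names simply mirror the cURL example. The guess relies on the file extension only, so URLs without one must be handled by the caller.

```python
import mimetypes
from urllib.parse import urlparse


def guess_mime_type(url: str) -> str:
    """Guess a MIME type from the URL path's extension, ignoring any query string."""
    path = urlparse(url).path
    guessed, _encoding = mimetypes.guess_type(path)
    if guessed is None:
        # The API requires mimeType for URL input, so fail loudly rather than guess.
        raise ValueError(f"Cannot infer a MIME type from {url!r}; pass it explicitly")
    return guessed


def build_url_fields(extractor_id: str, url: str) -> dict:
    """Assemble the form fields used in the URL-based cURL example."""
    return {
        "extractorId": extractor_id,
        "file": url,
        "mimeType": guess_mime_type(url),
    }
```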

Extract from Base64

For base64-encoded documents:

const result = await client.extract({
  extractorId: 'ext_abc123',
  file: base64EncodedDocument,
  mimeType: 'application/pdf',
});
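The base64 string itself can be produced with the standard library. This sketch shows the encoding step in Python. `encode_document` is a hypothetical helper, and it assumes the API expects plain base64 text without a data: URI prefix, which the source does not state either way.

```python
import base64


def encode_document(raw: bytes) -> str:
    """Return the document bytes as an ASCII base64 string."""
    return base64.b64encode(raw).decode("ascii")


# Typical use (not executed here): read the file in binary mode, then encode.
#
#   with open("invoice.pdf", "rb") as f:
#       base64_document = encode_document(f.read())
#
# The resulting string takes the place of base64EncodedDocument in the
# TypeScript call above.
```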

Error Handling

Always handle potential errors in your integration:

// Assumes `client` and `fs` are set up as in the TypeScript SDK example;
// `processData` stands in for your own handler.
async function extractWithErrorHandling() {
  try {
    const result = await client.extract({
      extractorId: 'ext_abc123',
      file: fs.createReadStream('document.pdf'),
    });

    if (!result.success) {
      // Handle extraction error (e.g., invalid document)
      console.error('Extraction error:', result.error);
      return;
    }

    // Process successful extraction
    processData(result.data);

  } catch (error: any) {
    // Handle network/authentication errors
    if (error.statusCode === 401) {
      console.error('Invalid API key');
    } else if (error.statusCode === 400) {
      console.error('Bad request:', error.message);
    } else {
      console.error('Unexpected error:', error);
    }
  }
}
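The same branching can be sketched in Python. The source does not show which exception type the Python SDK raises, so this example assumes a hypothetical `status_code` attribute on the exception and reads it defensively. `describe_failure` is an illustrative helper, not an SDK function.

```python
def describe_failure(exc: Exception) -> str:
    """Map an exception to the same messages as the TypeScript example."""
    # Assumption: HTTP errors carry a `status_code` attribute; anything
    # without one falls through to the generic branch.
    status = getattr(exc, "status_code", None)
    if status == 401:
        return "Invalid API key"
    if status == 400:
        return f"Bad request: {exc}"
    return f"Unexpected error: {exc!r}"


# Typical use (not executed here), with `client` from the Python SDK example:
#
#   try:
#       with open("document.pdf", "rb") as f:
#           result = client.extract(extractor_id="ext_abc123", file=f)
#   except Exception as exc:
#       print(describe_failure(exc))
#   else:
#       if not result.success:
#           print("Extraction error:", result.error)
```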

Next Steps