
Quickstart

This API allows you to submit a file (or file URL) along with a JSON schema that describes the structure of the data you want to extract. Once submitted, the request is queued for processing, and you can later poll for the result.

How It Works:

1. Requirements and Optional Inputs:

  • A file in a supported media type, see below.
  • A Talonic API key. Contact Talonic for details.
  • Optional: A valid JSON schema. See json-schema.org for instructions.
  • Optional: A description of the data contained in the file. Providing one increases accuracy.

2. Submit a Request:

  • Use the /process endpoint to submit a full job (extract + optional recommend + convert + optional validate). You can either upload a file or provide a URL to one, along with the JSON schema describing the expected results.
  • Alternatively, use /extract to only extract markdown from the source without conversion, or /recommend to only generate a recommended JSON schema for the source.

Sample cURL to submit a file directly:

curl -X PUT "https://api.talonic.ai/data-extractor/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/your/file.pdf" \
  -F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
  -F "description=Optional description of the file"

Sample cURL to submit a file URL:

curl -X PUT "https://api.talonic.ai/data-extractor/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file_url=https://example.com/path/to/file.pdf" \
  -F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
  -F "description=Optional description of the file"
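
The two cURL samples above can also be sketched in Python. This is a minimal, standard-library-only illustration, not an official client: the helper name build_process_request is hypothetical, the form field names (file, file_url, json_schema, description) are taken from the cURL samples, and the per-file Content-Type guess is an assumption about what the server accepts. The function only builds the request; nothing is sent until you pass it to urllib.request.urlopen.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

BASE_URL = "https://api.talonic.ai/data-extractor"  # base URL used in the samples


def build_process_request(api_key, schema, file_url=None, file_path=None,
                          description=None):
    """Build (but do not send) a multipart PUT request for /process.

    Hypothetical helper: exactly one of file_url or file_path must be given.
    """
    if (file_url is None) == (file_path is None):
        raise ValueError("provide exactly one of file_url or file_path")

    boundary = uuid.uuid4().hex
    parts = []

    def add_field(name, value):
        # A plain text form field, like curl's -F "name=value".
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )

    if file_path is not None:
        # A file upload, like curl's -F "file=@/path/to/your/file.pdf".
        filename = os.path.basename(file_path)
        # Assumption: a guessed media type is acceptable to the server.
        media_type = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
        with open(file_path, "rb") as fh:
            content = fh.read()
        header = (f"--{boundary}\r\n"
                  f'Content-Disposition: form-data; name="file"; '
                  f'filename="{filename}"\r\n'
                  f"Content-Type: {media_type}\r\n\r\n").encode("utf-8")
        parts.append(header + content + b"\r\n")
    else:
        add_field("file_url", file_url)

    # json.dumps avoids the shell-quoting pitfalls of inline JSON.
    add_field("json_schema", json.dumps(schema))
    if description is not None:
        add_field("description", description)
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        f"{BASE_URL}/process",
        data=b"".join(parts),
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To submit, pass the returned object to urllib.request.urlopen and read the JSON body of the response. Serializing the schema with json.dumps keeps it as a Python dict until the last moment, so no manual escaping is needed.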

3. Poll for Status:

  • To check the status and retrieve the result of your processing job, call the /process/{job_id} endpoint with the job_id returned when you submitted the request.

Sample cURL to poll job status:

curl -X GET "https://api.talonic.ai/data-extractor/process/YOUR_JOB_ID" \
  -H "Authorization: Bearer YOUR_API_KEY"

  • The response (ProcessStatusResponse) shows the current status of the processing job.
  • If the status is "successful", the response also includes the data extracted according to your JSON schema.
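
The polling step can be sketched as a small loop in Python. This is an illustration under stated assumptions, not an official client: the function names fetch_status and wait_for_job, the 5-second interval, and the attempt limit are all hypothetical. The only status values taken from this page are "queued", "successful", and "failed"; any other value is treated as "still in progress".

```python
import json
import time
import urllib.request

BASE_URL = "https://api.talonic.ai/data-extractor"  # base URL used in the samples


def fetch_status(job_id, api_key):
    """GET /process/{job_id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/process/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_job(job_id, api_key, fetch=fetch_status, interval=5.0,
                 max_attempts=60, sleep=time.sleep):
    """Poll until the job is "successful" or "failed".

    interval and max_attempts are illustrative defaults, not documented limits.
    fetch and sleep are injectable so the loop can be exercised offline.
    """
    for attempt in range(max_attempts):
        body = fetch(job_id, api_key)
        status = body.get("status")
        if status == "successful":
            return body  # full response; the result layout is not assumed here
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed: {body}")
        # Any other status (for example "queued") means: keep waiting.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

The function returns the whole response body rather than a specific field, because the name of the field holding the extracted data is not given in this section.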

Notes:

  • Replace YOUR_API_KEY with your actual API key.
  • Replace placeholders like /path/to/your/file.pdf and YOUR_JOB_ID with your actual file path and job identifier.
  • Use the json_schema field to clearly define what data you expect to be extracted from the file.
  • Use the description field to give the system additional context about the file that may be needed for proper extraction and/or mapping.
  • If a file_url is submitted, ensure that it is publicly accessible. Any file validation error results in a "failed" processing status.

As the API is currently in testing, all endpoints and schemas are subject to change.

Servers
Computed URL: https://api.talonic.ai/data-extractor

Processing

PUT /process
Submit a Processing Request

Submit a file or a file URL along with a JSON schema for processing.

No parameters

Request body


{
  "file": "",
  "json_schema": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Invoice",
    "description": "ACME Invoice",
    "type": "object",
    "properties": {
      "invoiceId": {
        "type": "string",
        "description": "A unique identifier for the invoice.",
        "pattern": "^[A-Z]{2,3}-\\d{6}$",
        "examples": [ "INV-000001", "AB-123456" ]
      },
      "date": {
        "type": "string",
        "description": "The date when the invoice was issued, in YYYY-MM-DD format.",
        "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
        "examples": [ "2025-01-01", "2024-12-31" ]
      },
      "dueDate": {
        "type": "string",
        "description": "The payment due date for the invoice, in YYYY-MM-DD format.",
        "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
        "examples": [ "2025-01-15", "2024-12-31" ]
      },
      "billTo": {
        "type": "object",
        "description": "Details of the entity being billed.",
        "properties": {
          "name": {
            "type": "string",
            "description": "Name of the customer or client.",
            "examples": [ "Acme Corporation", "John Doe" ]
          },
          "address": {
            "type": "string",
            "description": "Billing address of the customer or client.",
            "examples": [ "123 Main St, Anytown, USA", "456 Elm St, Othertown, USA" ]
          },
          "email": {
            "type": "string",
            "description": "Email address of the customer or client.",
            "format": "email",
            "examples": [ "contact@acme.com", "johndoe@example.com" ]
          }
        },
        "required": [ "name", "address", "email" ]
      },
      "items": {
        "type": "array",
        "description": "List of items or services included in the invoice.",
        "items": {
          "type": "object",
          "properties": {
            "description": {
              "type": "string",
              "description": "Description of the item or service.",
              "examples": [ "Web design services", "Consulting hours" ]
            },
            "quantity": {
              "type": "integer",
              "description": "Quantity of the item or hours of service.",
              "minimum": 1,
              "examples": [ 10, 5 ]
            },
            "unitPrice": {
              "type": "number",
              "description": "Price per single unit or hour, in the specified currency.",
              "minimum": 0,
              "pattern": "^[0-9]+(\\.[0-9]{2})$",
              "examples": [ 150, 75.5 ]
            },
            "total": {
              "type": "number",
              "description": "Total price for the item (quantity * unitPrice).",
              "minimum": 0,
              "pattern": "^[0-9]+(\\.[0-9]{2})$",
              "examples": [ 1500, 377.5 ]
            }
          },
          "required": [ "description", "quantity", "unitPrice", "total" ]
        },
        "minItems": 1
      },
      "subtotal": {
        "type": "number",
        "description": "Sum of all item totals before taxes and discounts.",
        "minimum": 0,
        "pattern": "^[0-9]+(\\.[0-9]{2})$",
        "examples": [ 1877.5 ]
      },
      "tax": {
        "type": "number",
        "description": "Tax amount applied to the subtotal.",
        "minimum": 0,
        "pattern": "^[0-9]+(\\.[0-9]{2})$",
        "examples": [ 150 ]
      },
      "total": {
        "type": "number",
        "description": "Total amount due, including taxes and any additional charges.",
        "minimum": 0,
        "pattern": "^[0-9]+(\\.[0-9]{2})$",
        "examples": [ 2027.5 ]
      },
      "currency": {
        "type": "string",
        "description": "ISO 4217 currency code.",
        "pattern": "^[A-Z]{3}$",
        "examples": [ "USD", "EUR" ]
      },
      "terms": {
        "type": "string",
        "description": "Payment terms and conditions.",
        "examples": [ "Payment is due within 15 days.", "Net 30 days." ]
      }
    },
    "required": [ "invoiceId", "date", "dueDate", "billTo", "items", "subtotal", "tax", "total", "currency" ]
  },
  "fast_extraction": false,
  "description": "Generic invoice document for shop orders, all values are in USD if not otherwise stated."
}
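
To see what the string patterns in the example schema accept, they can be tried locally before submitting. The sketch below copies three patterns from the schema above and checks them with Python's re module. The helper name matches is hypothetical, and Python's regex dialect is only assumed to agree with the JSON Schema dialect for these simple patterns.

```python
import re

# Patterns copied from the example schema above (JSON escaping removed).
INVOICE_ID_PATTERN = r"^[A-Z]{2,3}-\d{6}$"
DATE_PATTERN = r"^\d{4}-\d{2}-\d{2}$"
CURRENCY_PATTERN = r"^[A-Z]{3}$"


def matches(pattern, value):
    """Return True if value satisfies the pattern.

    JSON Schema "pattern" is a search, not a full match, so anchoring
    depends on the ^ and $ written in the pattern itself.
    """
    return re.search(pattern, value) is not None


# The schema's own examples satisfy their patterns.
assert matches(INVOICE_ID_PATTERN, "INV-000001")
assert matches(INVOICE_ID_PATTERN, "AB-123456")
assert matches(DATE_PATTERN, "2025-01-01")
assert matches(CURRENCY_PATTERN, "USD")

# Values with the wrong shape are rejected.
assert not matches(INVOICE_ID_PATTERN, "inv-1")
assert not matches(DATE_PATTERN, "01/01/2025")

# A pattern checks shape only: this is not a real calendar date, yet it matches.
assert matches(DATE_PATTERN, "2025-13-45")
```

The last assertion shows why the field descriptions still matter: the date pattern constrains the format, not the validity of the value.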

Response

Code  Description

202   Processing request accepted and queued.

{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-09T06:41:31.848Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}

400   Bad Request. Invalid input parameters.

{ "detail": "string" }

401   Unauthorized. Missing or invalid API key.

{ "detail": "string" }

413   Payload Too Large. Submitted payload is larger than the maximum allowable size.

{ "detail": "string" }
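
The response codes listed above can be handled with a small dispatch function. This is a sketch, not part of the API: ApiError and parse_submit_response are hypothetical names, and the function takes a status code and raw body text so that it works with any HTTP client. It relies only on the fields shown above (job_id on 202, detail on errors).

```python
import json


class ApiError(Exception):
    """Raised for any non-202 response to a /process submission."""

    def __init__(self, status_code, detail):
        super().__init__(f"HTTP {status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail


# Fallback messages, taken from the response table, for error bodies
# that carry no "detail" field.
ERROR_MEANINGS = {
    400: "Bad Request. Invalid input parameters.",
    401: "Unauthorized. Missing or invalid API key.",
    413: "Payload Too Large.",
}


def parse_submit_response(status_code, body_text):
    """Return the job_id for a 202 response, otherwise raise ApiError."""
    try:
        body = json.loads(body_text) if body_text else {}
    except json.JSONDecodeError:
        body = {}  # tolerate non-JSON error pages
    if not isinstance(body, dict):
        body = {}
    if status_code == 202:
        return body["job_id"]
    fallback = ERROR_MEANINGS.get(status_code, "Unexpected response.")
    raise ApiError(status_code, body.get("detail", fallback))
```

For example, parse_submit_response(202, '{"job_id": "abc", "status": "queued"}') returns "abc", which can then be used to poll /process/{job_id}.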
GET /process/{job_id}
Get Processing Status

Retrieve the status and result of a processing job using its ID.

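Parameters

Name           Description
job_id (path)  ID of the processing job, as returned in the job_id field when the request was submitted.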

Response

Code  Description

202   Processing request accepted and queued.

{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-09T06:41:31.848Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}

400   Bad Request. Invalid input parameters.

{ "detail": "string" }

401   Unauthorized. Missing or invalid API key.

{ "detail": "string" }

413   Payload Too Large. Submitted payload is larger than the maximum allowable size.

{ "detail": "string" }