Sample cURL to submit a file directly:
curl -X PUT "https://api.talonic.ai/data-extractor/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/your/file.pdf" \
  -F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
  -F "description=Optional description of the file"
Sample cURL to submit a file URL:
curl -X PUT "https://api.talonic.ai/data-extractor/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file_url=https://example.com/path/to/file.pdf" \
  -F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
  -F "description=Optional description of the file"
Sample cURL to poll job status:
curl -X GET "https://api.talonic.ai/data-extractor/process/YOUR_JOB_ID" \
  -H "Authorization: Bearer YOUR_API_KEY"
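The cURL samples above translate fairly directly to other HTTP clients. The following is a minimal sketch using only the Python standard library: it builds, but does not send, the multipart PUT request for a file URL submission and the GET request for polling. The endpoint, method, header, and field names come from the samples; the helper names (`encode_multipart`, `build_submit_request`, `build_status_request`) are illustrative and not part of any official client.

```python
import json
import uuid
import urllib.request

BASE_URL = "https://api.talonic.ai/data-extractor/process"  # from the cURL samples


def encode_multipart(fields, boundary=None):
    """Encode plain text form fields as multipart/form-data (no file parts)."""
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(str(value))
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_submit_request(api_key, file_url, schema, description=None):
    """Build the PUT request that submits a file URL plus a JSON schema."""
    fields = {"file_url": file_url, "json_schema": json.dumps(schema)}
    if description is not None:
        fields["description"] = description
    body, content_type = encode_multipart(fields)
    return urllib.request.Request(
        BASE_URL,
        data=body,
        method="PUT",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )


def build_status_request(api_key, job_id):
    """Build the GET request that polls a job by its ID."""
    return urllib.request.Request(
        f"{BASE_URL}/{job_id}",
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Passing either request to `urllib.request.urlopen` would send it. Uploading a local file would additionally need a file part with a filename and content type, which this sketch leaves out.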
Submit a file or a file URL along with a JSON schema for processing.
No parameters
Request body
One of the following two objects:

#0 object (file upload)
Field | Type | Description |
---|---|---|
file | string (binary) | The file to process. Any of the media types listed under "Supported file formats" below. |
json_schema | string (media type: application/json) | Stringified JSON schema describing the desired result. |
fast_extraction | string | Enables a fast extraction method for simple documents, which significantly increases processing speed but may reduce accuracy. Enum: true, false. Default: false. |
validation | string | Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails the job if max_errors is reached or the result is invalid overall; 'none' disables validation. Enum: "lax", "strict", "none". Default: "lax". |
description | string (≤ 1000 characters) | Optional description of, or context for, the provided file. |

Supported file formats for the file field:
# | Media type | File type |
---|---|---|
0 | application/pdf | .pdf file (Adobe Acrobat) |
1 | text/csv | .csv file (Comma-Separated Values) |
2 | application/msword | .doc file (Microsoft Word) |
3 | application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx file (Microsoft Word) |
4 | application/vnd.ms-excel | .xls file (Microsoft Excel) |
5 | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx file (Microsoft Excel) |
6 | application/vnd.oasis.opendocument.spreadsheet | .ods file (Open Document Sheet) |
7 | application/vnd.oasis.opendocument.text | .odt file (Open Document Text) |
8 | application/vnd.apple.numbers | .numbers file (Apple Numbers) |
9 | application/vnd.apple.pages | .pages file (Apple Pages) |
10 | image/jpeg | .jpg file (JPEG Image) |
11 | image/png | .png file (PNG Image) |
12 | text/plain | .txt file (Plaintext) |
13 | audio/mpeg | .mp3 file (MP3 Audio) |
14 | audio/wav | .wav file (Waveform Audio) |
15 | audio/ogg | .ogg/.oga file (Ogg Audio) |

#1 object (file URL)
Field | Type | Description |
---|---|---|
file_url | string (uri) | Publicly accessible URL of the file to be processed. (See ProcessRequestFile for supported file formats.) |
json_schema | string (media type: application/json) | Stringified JSON schema describing the desired result. |
fast_extraction | string | Enables a fast extraction method for simple documents, which significantly increases processing speed but may reduce accuracy. Enum: true, false. Default: false. |
validation | string | Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails the job if max_errors is reached or the result is invalid overall; 'none' disables validation. Enum: "lax", "strict", "none". Default: "lax". |
description | string (≤ 1000 characters) | Optional description of, or context for, the provided file. |
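The constraints in the request body description can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python, assuming the fields are sent as form strings as in the cURL samples; `build_form_fields` and `media_type_for` are hypothetical helper names, and the extension map simply mirrors the supported formats listed above (it does not add aliases such as `.jpeg` that the list does not mention).

```python
import json
import os

VALIDATION_POLICIES = ("lax", "strict", "none")  # enum from the request body
MAX_DESCRIPTION_LENGTH = 1000                    # documented limit

# File extension -> media type, copied from the supported formats list.
SUPPORTED_MEDIA_TYPES = {
    ".pdf": "application/pdf",
    ".csv": "text/csv",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".ods": "application/vnd.oasis.opendocument.spreadsheet",
    ".odt": "application/vnd.oasis.opendocument.text",
    ".numbers": "application/vnd.apple.numbers",
    ".pages": "application/vnd.apple.pages",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".txt": "text/plain",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".ogg": "audio/ogg",
    ".oga": "audio/ogg",
}


def media_type_for(filename):
    """Return the documented media type for a filename, or None if unlisted."""
    extension = os.path.splitext(filename)[1].lower()
    return SUPPORTED_MEDIA_TYPES.get(extension)


def build_form_fields(schema, *, file_url=None, fast_extraction=False,
                      validation="lax", description=None):
    """Return the text form fields as strings, raising ValueError on bad input."""
    if validation not in VALIDATION_POLICIES:
        raise ValueError(f"validation must be one of {VALIDATION_POLICIES}")
    if description is not None and len(description) > MAX_DESCRIPTION_LENGTH:
        raise ValueError("description must be at most 1000 characters")
    fields = {
        "json_schema": json.dumps(schema),  # the API expects a stringified schema
        "fast_extraction": "true" if fast_extraction else "false",
        "validation": validation,
    }
    if file_url is not None:
        fields["file_url"] = file_url
    if description is not None:
        fields["description"] = description
    return fields
```

For a direct upload, `file_url` would be left as `None` and the binary `file` part attached by the HTTP client, with `media_type_for` used to reject unsupported files early.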
Response
Code | Description | Links |
---|---|---|
202 | Processing request accepted and queued. Returns a ProcessResponse object. | No links |
400 | Bad Request. Invalid input parameters. Returns an ErrorResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
413 | Payload Too Large. Submitted payload is larger than the maximum allowable size. Returns an ErrorResponse object. | No links |
415 | Unsupported Media Type. The server does not support the provided media type. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example 202 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-09T06:41:31.848Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}

ProcessResponse object:
Field | Type | Description |
---|---|---|
correlation_id | string (uuid) | Unique correlation ID for the request. |
job_id | string (uuid) | Unique job ID for polling status. |
status | string | Initial status of the request. Enum: "queued", "processing", "failed", "success", "cancelled". |
start_time | string (date-time) | ISO 8601 timestamp when processing started. |
estimated_time_seconds | integer | Estimated time in seconds for the processing to finish. Only present if status is queued or processing. |
message | string | Informational message about the request. |
filename | string | Original name of the submitted or linked file, including extension. |

Example error response body (400, 401, 413, 415, 429, 500):
{
  "detail": "string"
}

ErrorResponse object:
Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
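A client typically needs only two things from the submit response: the `job_id` on success, and a decision on whether a failure is worth retrying. The following Python sketch shows one way to interpret the status codes listed above. `SubmitError` and `parse_submit_response` are hypothetical names, and the 60-second wait for 429 is an assumption derived from the "wait a minute" wording; the other error codes are treated as non-retryable here purely for illustration.

```python
import json

# Status codes and meanings as listed in the response table.
SUBMIT_ERRORS = {
    400: "Bad Request",
    401: "Unauthorized",
    413: "Payload Too Large",
    415: "Unsupported Media Type",
    429: "Too Many Requests",
    500: "Server error",
}
RETRY_AFTER_SECONDS = 60  # assumption: "wait a minute" for 429


class SubmitError(Exception):
    """Raised when the submit request does not return 202."""

    def __init__(self, status_code, detail, retry_after=None):
        label = SUBMIT_ERRORS.get(status_code, "Unexpected status")
        super().__init__(f"{status_code} {label}: {detail}")
        self.status_code = status_code
        self.detail = detail
        self.retry_after = retry_after  # seconds to wait, or None for no retry


def parse_submit_response(status_code, body_text):
    """Return the job_id from a 202 response, or raise SubmitError."""
    body = json.loads(body_text) if body_text else {}
    if status_code == 202:
        return body["job_id"]
    retry_after = RETRY_AFTER_SECONDS if status_code == 429 else None
    raise SubmitError(status_code, body.get("detail", ""), retry_after)
```

Keeping the retry hint on the exception lets the caller decide how to back off without re-parsing the response.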
Retrieve the status and result of a processing job using its ID.
No parameters
Response
Code | Description | Links |
---|---|---|
200 | Processing status retrieved successfully. Returns a ProcessStatusResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
404 | Not Found. No job found with the provided ID. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example 200 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-10T10:08:56.445Z",
  "estimated_time_seconds": 0,
  "finish_time": "2025-09-10T10:08:56.445Z",
  "message": "string",
  "filename": "string",
  "result": {},
  "json_schema": {},
  "markdown": "string",
  "validation_result": {
    "concerns": [
      {
        "path": "string",
        "text": "string",
        "level": "error",
        "code": "missing_value"
      }
    ],
    "summary": "string"
  }
}

ProcessStatusResponse object:
Field | Type | Description |
---|---|---|
correlation_id | string (uuid) | Correlation ID of the request. |
job_id | string (uuid) | Job ID for polling. |
status | string | Current status of the processing job. Enum: "queued", "processing", "failed", "success", "cancelled". |
start_time | string (date-time) | ISO 8601 timestamp when processing started. |
estimated_time_seconds | integer | Estimated time in seconds for the processing to finish. Only present if status is queued or processing. |
finish_time | string (date-time) or null | ISO 8601 timestamp when processing finished, or null if not finished. |
message | string | Status message, if any. |
filename | string | Original name of the submitted or linked file, including extension. |
result | object or null | Processing result following the provided JSON schema, or null if not finished. |
json_schema | object | JSON schema used to create the result JSON. Only present if status is finished and include-schema is true. |
markdown | string | Markdown representation of the source data. Only present if status is finished and include-markdown is true. |
validation_result | object | Validation result of the extracted JSON. Only present if status is finished and validate is true. |
validation_result.concerns | array | Validation concerns. In the example above, each entry has path, text, level, and code. |

Example error response body (401, 404, 429, 500):
{
  "detail": "string"
}

ErrorResponse object:
Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
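Because submission is asynchronous, a client has to call the status endpoint repeatedly until the job stops changing. The sketch below shows a simple polling loop in Python. It assumes that "failed", "success", and "cancelled" are the terminal values of the status enum (the reference does not say so explicitly), and the 2-second interval and 300-second timeout are arbitrary defaults. `fetch_status` is a caller-supplied function that performs the GET request and returns the decoded response as a dict; `poll_until_done` and `error_concerns` are hypothetical helper names.

```python
import time

# Assumption: these enum values mean the job will not change any further.
TERMINAL_STATUSES = {"failed", "success", "cancelled"}


def poll_until_done(fetch_status, job_id, *, interval=2.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status(job_id) until the job reaches a terminal status.

    Raises TimeoutError if the job is still queued or processing once
    the timeout (in seconds) has elapsed.
    """
    deadline = clock() + timeout
    while True:
        response = fetch_status(job_id)
        if response["status"] in TERMINAL_STATUSES:
            return response
        if clock() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {response['status']} after {timeout}s"
            )
        sleep(interval)


def error_concerns(response):
    """Return validation concerns with level 'error', or [] if none exist."""
    validation = response.get("validation_result") or {}
    return [c for c in validation.get("concerns", []) if c.get("level") == "error"]
```

Injecting `sleep` and `clock` keeps the loop easy to test without real delays. A more careful client might also use `estimated_time_seconds` to choose the first wait.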
Cancel a running job by process_id.
No parameters
Response
Code | Description | Links |
---|---|---|
202 | Cancellation request accepted. | No links |
400 | Bad Request. Returns an ErrorResponse object. | No links |
401 | Unauthorized. Missing or invalid API key. Returns an ErrorResponse object. | No links |
404 | Not Found. No job found with the provided ID. Returns an ErrorResponse object. | No links |
409 | Conflict. Job is already finished or cancelled. Returns an ErrorResponse object. | No links |
429 | Too Many Requests. Wait a minute and try again. Returns an ErrorResponse object. | No links |
500 | Server error. Returns an ErrorResponse object. | No links |

Example error response body (400, 401, 404, 409, 429, 500):
{
  "detail": "string"
}

ErrorResponse object:
Field | Type | Description |
---|---|---|
detail | string | Error message detailing what went wrong. |
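For cancellation, a 409 means the job has already finished or been cancelled, so a client may reasonably treat it as harmless rather than as a failure. The Python sketch below only classifies the documented status codes; it does not issue the request, because the method and URL of the cancel endpoint are not shown in this section. `CancelOutcome` and `classify_cancel_status` are hypothetical names.

```python
from enum import Enum


class CancelOutcome(Enum):
    ACCEPTED = "accepted"          # 202: cancellation request accepted
    ALREADY_DONE = "already_done"  # 409: job already finished or cancelled
    NOT_FOUND = "not_found"        # 404: no job with the provided ID
    RETRY_LATER = "retry_later"    # 429: rate limited, wait and try again
    FAILED = "failed"              # 400, 401, 500, or anything unexpected


def classify_cancel_status(status_code):
    """Map a cancel response status code to a CancelOutcome."""
    outcomes = {
        202: CancelOutcome.ACCEPTED,
        409: CancelOutcome.ALREADY_DONE,
        404: CancelOutcome.NOT_FOUND,
        429: CancelOutcome.RETRY_LATER,
    }
    return outcomes.get(status_code, CancelOutcome.FAILED)
```

Since 202 only signals that the cancellation was accepted, polling the job status afterwards is the way to confirm that it actually reached "cancelled".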