Sample cURL to submit a file directly:
curl -X PUT "https://api.talonic.ai/data-extractor/process" \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@/path/to/your/file.pdf" \
-F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
-F "description=Optional description of the file"
Sample cURL to submit a file URL:
curl -X PUT "https://api.talonic.ai/data-extractor/process" \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file_url=https://example.com/path/to/file.pdf" \
-F 'json_schema={"$schema":"http://json-schema.org/draft-07/schema#","type":"object",...}' \
-F "description=Optional description of the file"
Sample cURL to poll job status:
curl -X GET "https://api.talonic.ai/data-extractor/process/YOUR_JOB_ID" \
-H "Authorization: Bearer YOUR_API_KEY"
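For callers who prefer Python to cURL, the form fields used in the samples above can be assembled and checked before they are sent. The following is a minimal sketch, not an official client: the helper names `build_process_fields` and `to_curl_form_args` are hypothetical, and sending `fast_extraction` as the strings "true" / "false" is an assumption based on its string enum in the request body reference.

```python
import json

# Values taken from the request body reference in this document.
ALLOWED_VALIDATION = ("lax", "strict", "none")
MAX_DESCRIPTION_CHARS = 1000


def build_process_fields(json_schema, file_url=None, fast_extraction=False,
                         validation="lax", description=None):
    """Build the non-file form fields for a process request (hypothetical helper)."""
    if validation not in ALLOWED_VALIDATION:
        raise ValueError(f"validation must be one of {ALLOWED_VALIDATION}")
    if description is not None and len(description) > MAX_DESCRIPTION_CHARS:
        raise ValueError("description must be at most 1000 characters")
    fields = {
        # The API expects the schema as a stringified JSON document.
        "json_schema": json.dumps(json_schema),
        # Assumption: the string enum is sent as "true" / "false".
        "fast_extraction": "true" if fast_extraction else "false",
        "validation": validation,
    }
    if file_url is not None:
        fields["file_url"] = file_url
    if description is not None:
        fields["description"] = description
    return fields


def to_curl_form_args(fields):
    """Render the fields as curl -F arguments, e.g. for subprocess.run."""
    args = []
    for name, value in fields.items():
        args += ["-F", f"{name}={value}"]
    return args
```

When a local file is uploaded instead of a URL, the file itself would be attached as the `file` part by whichever HTTP client is used; this sketch only covers the text fields.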
Submit a file or a file URL along with a JSON schema for processing.
No parameters
Request body
One of (object | object)
#0 object
file string binary
Any of the following media types:
#0 application/pdf: .pdf file (Adobe Acrobat)
#1 text/csv: .csv file (Comma-Separated Values)
#2 application/msword: .doc file (Microsoft Word)
#3 application/vnd.openxmlformats-officedocument.wordprocessingml.document: .docx file (Microsoft Word)
#4 application/vnd.ms-excel: .xls file (Microsoft Excel)
#5 application/vnd.openxmlformats-officedocument.spreadsheetml.sheet: .xlsx file (Microsoft Excel)
#6 application/vnd.oasis.opendocument.spreadsheet: .ods file (Open Document Sheet)
#7 application/vnd.oasis.opendocument.text: .odt file (Open Document Text)
#8 application/vnd.apple.numbers: .numbers file (Apple Numbers)
#9 application/vnd.apple.pages: .pages file (Apple Pages)
#10 image/jpeg: .jpg file (JPEG Image)
#11 image/png: .png file (PNG Image)
#12 text/plain: .txt file (Plaintext)
#13 audio/mpeg: .mp3 file (MP3 Audio)
#14 audio/wav: .wav file (Waveform Audio)
#15 audio/ogg: .ogg/.oga file (Ogg Audio)
json_schema string media type: application/json
Stringified JSON schema describing the desired result.
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
validation string
Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation.
Enum array
#0="lax"
#1="strict"
#2="none"
Default="lax"
description string ≤ 1000 characters
Optional description of or context for the provided file.
#1 object
file_url string uri
Publicly accessible URL to the file to be processed. (See ProcessRequestFile for supported file formats)
json_schema string media type: application/json
Stringified JSON schema describing the desired result.
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
validation string
Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation.
Enum array
#0="lax"
#1="strict"
#2="none"
Default="lax"
description string ≤ 1000 characters
Optional description of or context for the provided file.
Response
| Code | Description | Response body |
|---|---|---|
| 202 | Processing request accepted and queued. | ProcessResponse |
| 400 | Bad Request. Invalid input parameters. | ErrorResponse |
| 401 | Unauthorized. Missing or invalid API key. | ErrorResponse |
| 413 | Payload Too Large. Submitted payload is larger than the maximum allowable size. | ErrorResponse |
| 415 | Unsupported Media Type. The server does not support the provided media type. | ErrorResponse |
| 429 | Too Many Requests. Wait a minute and try again. | ErrorResponse |
| 500 | Server error. | ErrorResponse |

Example 202 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-09T06:41:31.848Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}
ProcessResponse object
correlation_id string uuid
Unique correlation ID for the request.
job_id string uuid
Unique job ID for polling status.
status string
Initial status of the request.
Enum array
#0="queued"
#1="processing"
#2="failed"
#3="success"
#4="cancelled"
start_time string date-time
ISO 8601 timestamp when processing started.
estimated_time_seconds integer
Estimated time in seconds for the processing to finish. Only present if status is queued or processing.
message string
Informational message about the request.
filename string
Original name of the submitted or linked file, including extension.

Example error response body (400, 401, 413, 415, 429, 500):
{
  "detail": "string"
}
ErrorResponse object
detail string
Error message detailing what went wrong.
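The response codes above suggest a simple client-side split between "accepted", "retry later", and "fix the request". The sketch below illustrates one way to handle it; `SubmitError` and `handle_submit_response` are hypothetical names, and treating 500 as retryable alongside 429 is an assumption (the reference only says to wait and retry for 429).

```python
import json

# Assumption: 429 is retryable per the reference; 500 is treated the same here.
RETRYABLE_CODES = {429, 500}


class SubmitError(Exception):
    """Raised when a submission is not accepted (hypothetical helper type)."""

    def __init__(self, status_code, detail, retryable):
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail
        self.retryable = retryable


def handle_submit_response(status_code, body_text):
    """Return the job_id from a 202 response, or raise SubmitError otherwise."""
    payload = json.loads(body_text) if body_text else {}
    if status_code == 202:
        return payload["job_id"]
    # Error responses carry a single "detail" field (ErrorResponse).
    detail = payload.get("detail", "no detail provided")
    raise SubmitError(status_code, detail, status_code in RETRYABLE_CODES)
```

The returned `job_id` is the value to use when polling for status.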
Retrieve the status and result of a processing job using its ID.
| Name | Description |
|---|---|
| job_id* string($uuid) (path) | Unique identifier of the processing job. |
| include-schema string (query) | Include JSON schema in response body. Available values: true, false |
| include-markdown string (query) | Include markdown extracted from source data in response body. Available values: true, false |
Response
| Code | Description | Response body |
|---|---|---|
| 200 | Processing status retrieved successfully. | ProcessStatusResponse |
| 401 | Unauthorized. Missing or invalid API key. | ErrorResponse |
| 404 | Not Found. No job found with the provided ID. | ErrorResponse |
| 429 | Too Many Requests. Wait a minute and try again. | ErrorResponse |
| 500 | Server error. | ErrorResponse |

Example 200 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-10T10:08:56.445Z",
  "estimated_time_seconds": 0,
  "finish_time": "2025-09-10T10:08:56.445Z",
  "message": "string",
  "filename": "string",
  "result": {},
  "json_schema": {},
  "markdown": "string",
  "validation_result": {
    "concerns": [
      {
        "path": "string",
        "text": "string",
        "level": "error",
        "code": "missing_value"
      }
    ],
    "summary": "string"
  }
}
ProcessStatusResponse object
correlation_id string uuid
Correlation ID of the request.
job_id string uuid
Job ID for polling.
status string
Current status of the processing job.
Enum array
#0="queued"
#1="processing"
#2="failed"
#3="success"
#4="cancelled"
start_time string date-time
ISO 8601 timestamp when processing started.
estimated_time_seconds integer
Estimated time in seconds for the processing to finish. Only present if status is queued or processing.
finish_time string | null date-time
ISO 8601 timestamp when processing finished, or null if not finished.
message string
Status message, if any.
filename string
Original name of the submitted or linked file, including extension.
result object | null
Processing result following the provided JSON schema, or null if not finished.
json_schema object
JSON schema used to create the result JSON. Only present if status is finished and include-schema is true.
markdown string
Markdown representation of the source data. Only present if status is finished and include-markdown is true.
validation_result object
Validation result of the extracted JSON. Only present if status is finished and validate is true.
concerns array

Example error response body (401, 404, 429, 500):
{
  "detail": "string"
}
ErrorResponse object
detail string
Error message detailing what went wrong.
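Because jobs are queued and processed asynchronously, a client typically polls the status endpoint until the job leaves the `queued` and `processing` states. The sketch below shows one possible polling loop; `poll_job` is a hypothetical helper, the `fetch_status` callable (which would perform the GET request and return the parsed JSON body) is supplied by the caller, and the interval and attempt limits are illustrative defaults rather than documented values.

```python
import time

# Statuses from the enum above after which the job no longer changes.
TERMINAL_STATUSES = {"success", "failed", "cancelled"}


def poll_job(fetch_status, job_id, interval_seconds=5.0, max_attempts=60,
             sleep=time.sleep):
    """Call fetch_status(job_id) until the job reaches a terminal status."""
    for attempt in range(max_attempts):
        payload = fetch_status(job_id)
        if payload["status"] in TERMINAL_STATUSES:
            return payload
        # Wait between polls, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} still running after {max_attempts} polls")
```

Passing `sleep` as a parameter keeps the loop easy to test; in production the default `time.sleep` applies.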
Cancel a running job by its job_id.
| Name | Description |
|---|---|
| job_id* string($uuid) (path) | Unique identifier of the processing job to cancel. |
Response
| Code | Description | Response body |
|---|---|---|
| 202 | Cancellation request accepted. | None |
| 400 | Bad Request. | ErrorResponse |
| 401 | Unauthorized. Missing or invalid API key. | ErrorResponse |
| 404 | Not Found. No job found with the provided ID. | ErrorResponse |
| 409 | Conflict. Job is already finished or cancelled. | ErrorResponse |
| 429 | Too Many Requests. Wait a minute and try again. | ErrorResponse |
| 500 | Server error. | ErrorResponse |

Example error response body (400, 401, 404, 409, 429, 500):
{
  "detail": "string"
}
ErrorResponse object
detail string
Error message detailing what went wrong.
Submit a file or a file URL to extract markdown only. No conversion or validation is performed.
No parameters
Request body
One of (object | object)
#0 object
file string binary
Any of the following media types:
#0 application/pdf: .pdf file (Adobe Acrobat)
#1 text/csv: .csv file (Comma-Separated Values)
#2 application/msword: .doc file (Microsoft Word)
#3 application/vnd.openxmlformats-officedocument.wordprocessingml.document: .docx file (Microsoft Word)
#4 application/vnd.ms-excel: .xls file (Microsoft Excel)
#5 application/vnd.openxmlformats-officedocument.spreadsheetml.sheet: .xlsx file (Microsoft Excel)
#6 application/vnd.oasis.opendocument.spreadsheet: .ods file (Open Document Sheet)
#7 application/vnd.oasis.opendocument.text: .odt file (Open Document Text)
#8 application/vnd.apple.numbers: .numbers file (Apple Numbers)
#9 application/vnd.apple.pages: .pages file (Apple Pages)
#10 image/jpeg: .jpg file (JPEG Image)
#11 image/png: .png file (PNG Image)
#12 text/plain: .txt file (Plaintext)
#13 audio/mpeg: .mp3 file (MP3 Audio)
#14 audio/wav: .wav file (Waveform Audio)
#15 audio/ogg: .ogg/.oga file (Ogg Audio)
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
description string ≤ 1000 characters
Optional description of or context for the provided file.
#1 object
file_url string uri
Publicly accessible URL to the file to be processed. (See ExtractRequestFile for supported file formats)
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
description string ≤ 1000 characters
Optional description of or context for the provided file.
Response
| Code | Description | Response body |
|---|---|---|
| 202 | Extraction request accepted and queued. | ProcessResponse |
| 400 | Bad Request. Invalid input parameters. | ErrorResponse |
| 401 | Unauthorized. Missing or invalid API key. | ErrorResponse |
| 413 | Payload Too Large. Submitted payload is larger than the maximum allowable size. | ErrorResponse |
| 415 | Unsupported Media Type. The server does not support the provided media type. | ErrorResponse |
| 429 | Too Many Requests. Wait a minute and try again. | ErrorResponse |
| 500 | Server error. | ErrorResponse |

Example 202 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-11T03:48:18.939Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}
ProcessResponse object
correlation_id string uuid
Unique correlation ID for the request.
job_id string uuid
Unique job ID for polling status.
status string
Initial status of the request.
Enum array
#0="queued"
#1="processing"
#2="failed"
#3="success"
#4="cancelled"
start_time string date-time
ISO 8601 timestamp when processing started.
estimated_time_seconds integer
Estimated time in seconds for the processing to finish. Only present if status is queued or processing.
message string
Informational message about the request.
filename string
Original name of the submitted or linked file, including extension.

Example error response body (400, 401, 413, 415, 429, 500):
{
  "detail": "string"
}
ErrorResponse object
detail string
Error message detailing what went wrong.
Submit a file or a file URL to generate a recommended JSON schema only. No conversion or validation is performed.
No parameters
Request body
One of (object | object)
#0 object
file string binary
Any of the following media types:
#0 application/pdf: .pdf file (Adobe Acrobat)
#1 text/csv: .csv file (Comma-Separated Values)
#2 application/msword: .doc file (Microsoft Word)
#3 application/vnd.openxmlformats-officedocument.wordprocessingml.document: .docx file (Microsoft Word)
#4 application/vnd.ms-excel: .xls file (Microsoft Excel)
#5 application/vnd.openxmlformats-officedocument.spreadsheetml.sheet: .xlsx file (Microsoft Excel)
#6 application/vnd.oasis.opendocument.spreadsheet: .ods file (Open Document Sheet)
#7 application/vnd.oasis.opendocument.text: .odt file (Open Document Text)
#8 application/vnd.apple.numbers: .numbers file (Apple Numbers)
#9 application/vnd.apple.pages: .pages file (Apple Pages)
#10 image/jpeg: .jpg file (JPEG Image)
#11 image/png: .png file (PNG Image)
#12 text/plain: .txt file (Plaintext)
#13 audio/mpeg: .mp3 file (MP3 Audio)
#14 audio/wav: .wav file (Waveform Audio)
#15 audio/ogg: .ogg/.oga file (Ogg Audio)
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
description string ≤ 1000 characters
Optional description of or context for the provided file.
#1 object
file_url string uri
Publicly accessible URL to the file to be processed. (See ExtractRequestFile for supported file formats)
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
description string ≤ 1000 characters
Optional description of or context for the provided file.
Response
| Code | Description | Response body |
|---|---|---|
| 202 | Recommendation request accepted and queued. | ProcessResponse |
| 400 | Bad Request. Invalid input parameters. | ErrorResponse |
| 401 | Unauthorized. Missing or invalid API key. | ErrorResponse |
| 413 | Payload Too Large. Submitted payload is larger than the maximum allowable size. | ErrorResponse |
| 415 | Unsupported Media Type. The server does not support the provided media type. | ErrorResponse |
| 429 | Too Many Requests. Wait a minute and try again. | ErrorResponse |
| 500 | Server error. | ErrorResponse |

Example 202 response body:
{
  "correlation_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "job_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "queued",
  "start_time": "2025-09-13T03:41:56.257Z",
  "estimated_time_seconds": 0,
  "message": "string",
  "filename": "string"
}
ProcessResponse object
correlation_id string uuid
job_id string uuid
status string
start_time string date-time
estimated_time_seconds integer
message string
filename string

Example error response body (400, 401, 413, 415, 429, 500):
{
  "detail": "string"
}
ErrorResponse object
detail string
Error message detailing what went wrong.
Check the availability and API version of the service.
Parameters
No parameters
Response
| Code | Description | Response body |
|---|---|---|
| 200 | Server is able to respond to the request. | object (status, version) |
| 429 | Too Many Requests. Wait a minute and try again. | ErrorResponse |
| 500 | Server error. | ErrorResponse |

Example 200 response body:
{
  "status": "OK",
  "version": "1.1.5"
}
object
status string
Current health of the server.
Enum array
#0="OK"
#1="Unstable"
version string
Current API version.
Examples array
#0="1.1.5"

Example error response body (429, 500):
{
  "detail": "string"
}
ErrorResponse object
detail string
Error message detailing what went wrong.
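A client might check this endpoint before submitting work. The following is a small sketch of interpreting the 200 body; `parse_health` is a hypothetical helper, and parsing the version as dot-separated integers is an assumption based solely on the "1.1.5" example.

```python
def parse_health(payload):
    """Return (is_ok, version_tuple) from a parsed health response body."""
    # "OK" and "Unstable" are the two documented status values.
    is_ok = payload.get("status") == "OK"
    # Assumption: the version is a dotted sequence of integers, e.g. "1.1.5".
    version = tuple(int(part) for part in payload.get("version", "0").split("."))
    return is_ok, version
```

Comparing the returned tuple against a minimum version, for example `version >= (1, 1, 0)`, avoids the pitfalls of comparing version strings directly.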
Provide feedback for a finished job in order to help us improve the quality of the results.
| Name | Description |
|---|---|
| job_id* string($uuid) (path) | Unique identifier of the processing job. |
Request body
ProcessFeedback object
rating integer[1, 5]
Overall rating of the result on a 5-star like scale, 1 being the lowest.
data_complete boolean
Indicates whether or not the data returned by the API is complete, i.e. it contains all relevant and expected data points.
schema_correct boolean
Indicates whether or not the auto-generated schema (if any) looks sensible and useful and contains fields for all relevant and expected data points.
data_correct boolean
Indicates whether or not the data returned by the API seems or is correct, meaning that it contains no wrong information.
share_data boolean
Indicates whether or not Talonic may temporarily store and look at the source data and any intermediate and final results to further improve our services. See Privacy Policy.
Default=false
additional_feedback string ≤ 1000 characters
Text containing any additional feedback we should know.
Response
| Code | Description | Response body |
|---|---|---|
| 204 | Feedback accepted. | None |
| 400 | Bad Request. Invalid input parameters. | ErrorResponse |
| 401 | Unauthorized. Missing or invalid API key. | ErrorResponse |
| 422 | Unprocessable Content. Processing is likely not yet finished. | ErrorResponse |
| 429 | Too Many Requests. Wait a minute and try again. | ErrorResponse |
| 500 | Server error. | ErrorResponse |

Example error response body (400, 401, 422, 429, 500):
{
  "detail": "string"
}
ErrorResponse object
detail string
Error message detailing what went wrong.
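The ProcessFeedback constraints (rating from 1 to 5, free-text feedback of at most 1000 characters) can be checked client-side before the request is sent. The sketch below is illustrative only: `build_feedback` is a hypothetical helper, and treating `rating` as the only required field and serializing the body as JSON are assumptions not stated in this reference.

```python
import json

MAX_FEEDBACK_CHARS = 1000


def build_feedback(rating, data_complete=None, schema_correct=None,
                   data_correct=None, share_data=False,
                   additional_feedback=None):
    """Validate feedback values and return them as a JSON string."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rating, bool) or not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError("rating must be an integer from 1 to 5")
    if additional_feedback is not None and len(additional_feedback) > MAX_FEEDBACK_CHARS:
        raise ValueError("additional_feedback must be at most 1000 characters")
    payload = {"rating": rating, "share_data": share_data}
    optional = {
        "data_complete": data_complete,
        "schema_correct": schema_correct,
        "data_correct": data_correct,
        "additional_feedback": additional_feedback,
    }
    # Only include the optional fields the caller actually set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(payload)
```

Since a 422 indicates that processing is likely not finished, feedback would normally be sent only after the job has reached a final status.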
ExtractRequestFile
file string binary
Any of the following media types:
#0 application/pdf: .pdf file (Adobe Acrobat)
#1 text/csv: .csv file (Comma-Separated Values)
#2 application/msword: .doc file (Microsoft Word)
#3 application/vnd.openxmlformats-officedocument.wordprocessingml.document: .docx file (Microsoft Word)
#4 application/vnd.ms-excel: .xls file (Microsoft Excel)
#5 application/vnd.openxmlformats-officedocument.spreadsheetml.sheet: .xlsx file (Microsoft Excel)
#6 application/vnd.oasis.opendocument.spreadsheet: .ods file (Open Document Sheet)
#7 application/vnd.oasis.opendocument.text: .odt file (Open Document Text)
#8 application/vnd.apple.numbers: .numbers file (Apple Numbers)
#9 application/vnd.apple.pages: .pages file (Apple Pages)
#10 image/jpeg: .jpg file (JPEG Image)
#11 image/png: .png file (PNG Image)
#12 text/plain: .txt file (Plaintext)
#13 audio/mpeg: .mp3 file (MP3 Audio)
#14 audio/wav: .wav file (Waveform Audio)
#15 audio/ogg: .ogg/.oga file (Ogg Audio)
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
description string ≤ 1000 characters
Optional description of or context for the provided file.
ExtractRequestFileURL
file_url string uri
Publicly accessible URL to the file to be processed. (See ExtractRequestFile for supported file formats)
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
description string ≤ 1000 characters
Optional description of or context for the provided file.
RecommendRequestFile
file string binary
Any of the following media types:
#0 application/pdf: .pdf file (Adobe Acrobat)
#1 text/csv: .csv file (Comma-Separated Values)
#2 application/msword: .doc file (Microsoft Word)
#3 application/vnd.openxmlformats-officedocument.wordprocessingml.document: .docx file (Microsoft Word)
#4 application/vnd.ms-excel: .xls file (Microsoft Excel)
#5 application/vnd.openxmlformats-officedocument.spreadsheetml.sheet: .xlsx file (Microsoft Excel)
#6 application/vnd.oasis.opendocument.spreadsheet: .ods file (Open Document Sheet)
#7 application/vnd.oasis.opendocument.text: .odt file (Open Document Text)
#8 application/vnd.apple.numbers: .numbers file (Apple Numbers)
#9 application/vnd.apple.pages: .pages file (Apple Pages)
#10 image/jpeg: .jpg file (JPEG Image)
#11 image/png: .png file (PNG Image)
#12 text/plain: .txt file (Plaintext)
#13 audio/mpeg: .mp3 file (MP3 Audio)
#14 audio/wav: .wav file (Waveform Audio)
#15 audio/ogg: .ogg/.oga file (Ogg Audio)
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
description string ≤ 1000 characters
Optional description of or context for the provided file.
RecommendRequestFileURL
file_url string uri
Publicly accessible URL to the file to be processed. (See RecommendRequestFile for supported file formats)
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
description string ≤ 1000 characters
Optional description of or context for the provided file.
ProcessRequestFile
file string binary
Any of the following media types:
#0 application/pdf: .pdf file (Adobe Acrobat)
#1 text/csv: .csv file (Comma-Separated Values)
#2 application/msword: .doc file (Microsoft Word)
#3 application/vnd.openxmlformats-officedocument.wordprocessingml.document: .docx file (Microsoft Word)
#4 application/vnd.ms-excel: .xls file (Microsoft Excel)
#5 application/vnd.openxmlformats-officedocument.spreadsheetml.sheet: .xlsx file (Microsoft Excel)
#6 application/vnd.oasis.opendocument.spreadsheet: .ods file (Open Document Sheet)
#7 application/vnd.oasis.opendocument.text: .odt file (Open Document Text)
#8 application/vnd.apple.numbers: .numbers file (Apple Numbers)
#9 application/vnd.apple.pages: .pages file (Apple Pages)
#10 image/jpeg: .jpg file (JPEG Image)
#11 image/png: .png file (PNG Image)
#12 text/plain: .txt file (Plaintext)
#13 audio/mpeg: .mp3 file (MP3 Audio)
#14 audio/wav: .wav file (Waveform Audio)
#15 audio/ogg: .ogg/.oga file (Ogg Audio)
json_schema string media type: application/json
Stringified JSON schema describing the desired result.
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
validation string
Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation.
Enum array
#0="lax"
#1="strict"
#2="none"
Default="lax"
description string ≤ 1000 characters
Optional description of or context for the provided file.
ProcessRequestFileURL
file_url string uri
Publicly accessible URL to the file to be processed. (See ProcessRequestFile for supported file formats)
json_schema string media type: application/json
Stringified JSON schema describing the desired result.
fast_extraction string
Enable fast extraction method for simple documents, which significantly increases processing speed, but potentially reduces accuracy.
Enum array
#0=true
#1=false
Default=false
validation string
Validation policy to apply. 'lax' collects concerns but does not fail the job; 'strict' fails if max_errors is reached or overall invalid; 'none' disables validation.
Enum array
#0="lax"
#1="strict"
#2="none"
Default="lax"
description string ≤ 1000 characters
Optional description of or context for the provided file.
ProcessResponse
correlation_id string uuid
Unique correlation ID for the request.
job_id string uuid
Unique job ID for polling status.
status string
Initial status of the request.
Enum array
#0="queued"
#1="processing"
#2="failed"
#3="success"
#4="cancelled"
start_time string date-time
ISO 8601 timestamp when processing started.
estimated_time_seconds integer
Estimated time in seconds for the processing to finish. Only present if status is queued or processing.
message string
Informational message about the request.
filename string
Original name of the submitted or linked file, including extension.
ProcessValidationResult
concerns array object
List of concerns with their JSON paths.
Items object
path string
JSON path of the field related to the concern
text string
Human-readable description of the concern
level string
Severity level of the concern
Enum array
#0="error"
#1="warning"
#2="info"
code string
Code of the concern
Enum array
#0="missing_value"
#1="null_value"
#2="additional_value"
#3="format_inconsistent"
#4="numeric_mismatch"
#5="floating_precision_diff"
#6="semantic_conflict"
#7="array_length_mismatch"
#8="type_mismatch"
#9="out_of_range"
#10="duplicate_value"
#11="incomplete_object"
#12="extra_fields"
#13="order_difference"
#14="array_reordered"
#15="array_deduplicated"
#16="numeric_precision_normalized"
#17="date_format_normalized"
#18="optional_field_merged"
#19="llm_value_selected"
summary string
Executive summary of the validation results.
ProcessStatusResponse
correlation_id string uuid
Correlation ID of the request.
job_id string uuid
Job ID for polling.
status string
Current status of the processing job.
Enum array
#0="queued"
#1="processing"
#2="failed"
#3="success"
#4="cancelled"
start_time string date-time
ISO 8601 timestamp when processing started.
estimated_time_seconds integer
Estimated time in seconds for the processing to finish. Only present if status is queued or processing.
finish_time string | null date-time
ISO 8601 timestamp when processing finished, or null if not finished.
message string
Status message, if any.
filename string
Original name of the submitted or linked file, including extension
result object | null
Processing result following the provided JSON schema, or null if not finished.
json_schema object
JSON schema used to create the result JSON. Only present if status is finished and include-schema is true.
markdown string
Markdown representation of the source data. Only present if status is finished and include-markdown is true.
validation_result object
Validation result of the extracted JSON. Only present if status is finished and validate is true.
concerns array object
List of concerns with their JSON paths.
Items object
path string
JSON path of the field related to the concern
text string
Human-readable description of the concern
level string
Severity level of the concern
Enum array
#0="error"
#1="warning"
#2="info"
code string
Code of the concern
Enum array
#0="missing_value"
#1="null_value"
#2="additional_value"
#3="format_inconsistent"
#4="numeric_mismatch"
#5="floating_precision_diff"
#6="semantic_conflict"
#7="array_length_mismatch"
#8="type_mismatch"
#9="out_of_range"
#10="duplicate_value"
#11="incomplete_object"
#12="extra_fields"
#13="order_difference"
#14="array_reordered"
#15="array_deduplicated"
#16="numeric_precision_normalized"
#17="date_format_normalized"
#18="optional_field_merged"
#19="llm_value_selected"
summary string
Executive summary of the validation results.
ProcessFeedback
rating integer[1, 5]
Overall rating of the result on a 5-star like scale, 1 being the lowest.
data_complete boolean
Indicates whether or not the data returned by the API is complete, i.e. it contains all relevant and expected data points.
schema_correct boolean
Indicates whether or not the auto-generated schema (if any) looks sensible and useful and contains fields for all relevant and expected data points.
data_correct boolean
Indicates whether or not the data returned by the API seems or is correct, meaning that it contains no wrong information.
share_data boolean
Indicates whether or not Talonic may temporarily store and look at the source data and any intermediate and final results to further improve our services. See Privacy Policy.
Default=false
additional_feedback string ≤ 1000 characters
Text containing any additional feedback we should know.
ErrorResponse
detail string
Error message detailing what went wrong.
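The `validation_result` object described in ProcessValidationResult lends itself to a quick triage step after a job finishes. The sketch below groups concerns by severity; `summarize_concerns` is a hypothetical helper name, and it relies only on the `concerns`, `level`, and `path` fields documented above.

```python
from collections import Counter


def summarize_concerns(validation_result):
    """Count concerns per severity level and list the paths flagged as errors."""
    concerns = validation_result.get("concerns", [])
    by_level = Counter(concern["level"] for concern in concerns)
    error_paths = [c["path"] for c in concerns if c["level"] == "error"]
    return {"by_level": dict(by_level), "error_paths": error_paths}
```

With the 'lax' validation policy, which collects concerns without failing the job, a summary like this is one way for the caller to decide whether a result needs manual review.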




