jack provides three endpoints for ingesting documents: raw text, file upload, and URL. All three are asynchronous and return a document ID that you use to track processing status.
POST /v1/ingest
The primary ingest endpoint. Send extracted text with metadata. Accepts up to 50 documents per request. Use this when you extract text yourself from your own storage or database.
Request body
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| documents | array | Yes | — | List of documents to ingest |
| documents[].text | string | Yes | — | The document text content |
| documents[].metadata | object | No | {} | Arbitrary key-value metadata (string values only) |
| chunk_size | integer | No | 512 | Token size per chunk (128–4096) |
| chunk_overlap | integer | No | 64 | Overlap tokens between chunks (0–512) |
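The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of any jack SDK: `build_ingest_payloads` and the constant names are hypothetical, and it assumes only the limits stated above (50 documents per request, string-only metadata values, and the documented ranges for chunk_size and chunk_overlap). It also splits a longer list into several request bodies.

```python
from __future__ import annotations

from typing import Any

# Limits taken from the request body table; the names are illustrative.
MAX_DOCS_PER_REQUEST = 50
CHUNK_SIZE_RANGE = (128, 4096)
CHUNK_OVERLAP_RANGE = (0, 512)


def build_ingest_payloads(
    documents: list[dict[str, Any]],
    chunk_size: int = 512,
    chunk_overlap: int = 64,
) -> list[dict[str, Any]]:
    """Validate documents and return request bodies of at most 50 documents each."""
    low, high = CHUNK_SIZE_RANGE
    if not low <= chunk_size <= high:
        raise ValueError(f"chunk_size must be between {low} and {high}")
    low, high = CHUNK_OVERLAP_RANGE
    if not low <= chunk_overlap <= high:
        raise ValueError(f"chunk_overlap must be between {low} and {high}")

    for index, doc in enumerate(documents):
        if not isinstance(doc.get("text"), str):
            raise ValueError(f"documents[{index}].text must be a string")
        metadata = doc.get("metadata", {})
        # The table allows string values only.
        for key, value in metadata.items():
            if not isinstance(value, str):
                raise ValueError(
                    f"documents[{index}].metadata[{key!r}] must be a string"
                )

    # One request body per slice of up to 50 documents.
    return [
        {
            "documents": documents[start:start + MAX_DOCS_PER_REQUEST],
            "chunk_size": chunk_size,
            "chunk_overlap": chunk_overlap,
        }
        for start in range(0, len(documents), MAX_DOCS_PER_REQUEST)
    ]
```

Each returned dictionary can be passed as the `json=` argument of a separate POST /v1/ingest call. Whether the server enforces any further constraints, such as a relationship between chunk_overlap and chunk_size, is not stated here, so the sketch does not check for one.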
Request
python
import httpx

response = httpx.post(
    "https://api.usejack.io/v1/ingest",
    headers={"Authorization": "Bearer jack_xxxxxxxx"},
    json={
        "documents": [
            {
                "text": "Section 4.2 — Remote Work Policy. Employees are permitted to work remotely...",
                "metadata": {
                    "document_type": "policy",
                    "department": "HR",
                    "version": "2024-Q1"
                }
            },
            {
                "text": "Section 7.1 — Annual Leave. All full-time employees are entitled to 21 days...",
                "metadata": {
                    "document_type": "policy",
                    "department": "HR",
                    "version": "2024-Q1"
                }
            }
        ]
    }
)
print(response.json())
Ingest is asynchronous. Save the returned document_ids: you need them to poll processing status via GET /v1/documents/{id} and to delete documents later.
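Because processing happens after the request returns, a client usually polls until the document reaches a final state. The sketch below shows one way to structure that loop. The response shape of GET /v1/documents/{id} is not described in this section, so the status values "completed" and "failed" are assumptions, and `wait_for_document` and `fetch_status` are hypothetical names. The HTTP call is passed in as a function so that the loop itself makes no claim about the response format.

```python
from __future__ import annotations

import time
from typing import Callable

# Assumed terminal status names; check the GET /v1/documents/{id} reference.
ASSUMED_TERMINAL_STATUSES = frozenset({"completed", "failed"})


def wait_for_document(
    fetch_status: Callable[[str], str],
    document_id: str,
    terminal_statuses: frozenset[str] = ASSUMED_TERMINAL_STATUSES,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status(document_id) until it returns a terminal status."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status(document_id)
        if status in terminal_statuses:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"document {document_id} still {status!r} after {timeout_s}s"
            )
        sleep(interval_s)


# A fetch_status built on httpx might look like this (the "status" field
# name is an assumption about the response body):
#
# def fetch_status(document_id: str) -> str:
#     response = httpx.get(
#         f"https://api.usejack.io/v1/documents/{document_id}",
#         headers={"Authorization": "Bearer jack_xxxxxxxx"},
#     )
#     response.raise_for_status()
#     return response.json()["status"]
```

A fixed interval keeps the example short. For large batches, a longer or growing interval reduces the number of status requests.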
POST /v1/ingest/file
Upload a file directly as multipart form data. jack extracts the text internally. Use this when you have files on disk or in memory rather than pre-extracted text.
Supported formats: .pdf, .docx, .txt, .md · Max size: 100 MB
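The format and size limits can be checked locally before uploading, which avoids sending a file the endpoint would reject. This is a minimal sketch under stated assumptions: `check_uploadable` is a hypothetical helper, "100 MB" is read conservatively as 100,000,000 bytes, and extensions are compared case-insensitively, which may or may not match the server's behavior.

```python
from __future__ import annotations

from pathlib import Path

SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md"}
# Assumption: decimal megabytes, the stricter reading of "100 MB".
MAX_FILE_BYTES = 100 * 1000 * 1000


def check_uploadable(path: str | Path) -> Path:
    """Return the path if its extension and size fit the documented limits."""
    file_path = Path(path)
    suffix = file_path.suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported format {suffix!r}")
    size = file_path.stat().st_size
    if size > MAX_FILE_BYTES:
        raise ValueError(f"file is {size} bytes, limit is {MAX_FILE_BYTES}")
    return file_path


# Hypothetical upload call. The multipart field name "file" is an
# assumption; use the field name given in the request body reference.
#
# with check_uploadable("handbook.pdf").open("rb") as f:
#     response = httpx.post(
#         "https://api.usejack.io/v1/ingest/file",
#         headers={"Authorization": "Bearer jack_xxxxxxxx"},
#         files={"file": f},
#     )
```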