Ingest

Three endpoints for getting documents into jack: raw text, file upload, and URL. All three are asynchronous and return document IDs that you use to track processing status.

POST /v1/ingest

The primary ingest endpoint. Send extracted text with metadata. Accepts up to 50 documents per request. Use this when you extract text yourself from your own storage or database.
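Because a single request accepts at most 50 documents, larger collections have to be split client-side. The following is a minimal sketch of that batching step; `batch_documents` and `MAX_DOCS_PER_REQUEST` are hypothetical names used for illustration and are not part of the jack API.

```python
from typing import Iterator

MAX_DOCS_PER_REQUEST = 50  # limit stated for POST /v1/ingest


def batch_documents(
    documents: list[dict], batch_size: int = MAX_DOCS_PER_REQUEST
) -> Iterator[list[dict]]:
    """Yield consecutive slices of `documents` with at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


# Example: 120 documents are split into request bodies of 50, 50, and 20 documents.
docs = [{"text": f"Document {i}", "metadata": {"source": "example"}} for i in range(120)]
payloads = [{"documents": batch} for batch in batch_documents(docs)]
print([len(p["documents"]) for p in payloads])  # [50, 50, 20]
```

Each entry in `payloads` can then be sent as the JSON body of its own POST /v1/ingest request.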

Request body

Field	Type	Required	Default	Description
documents	array	Yes	—	List of documents to ingest
documents[].text	string	Yes	—	The document text content
documents[].metadata	object	No	{}	Arbitrary key-value metadata (string values only)
chunk_size	integer	No	512	Token size per chunk (128–4096)
chunk_overlap	integer	No	64	Overlap tokens between chunks (0–512)
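The limits in the table can be checked before a request is sent, which gives an earlier and clearer error than a rejected call. This is a rough sketch only: `build_ingest_payload` is a hypothetical helper, and how the API itself responds to out-of-range values is not specified here.

```python
CHUNK_SIZE_RANGE = (128, 4096)    # documented range for chunk_size
CHUNK_OVERLAP_RANGE = (0, 512)    # documented range for chunk_overlap


def build_ingest_payload(documents, chunk_size=512, chunk_overlap=64):
    """Return a JSON-serializable body for POST /v1/ingest, or raise ValueError."""
    low, high = CHUNK_SIZE_RANGE
    if not low <= chunk_size <= high:
        raise ValueError(f"chunk_size must be between {low} and {high}")
    low, high = CHUNK_OVERLAP_RANGE
    if not low <= chunk_overlap <= high:
        raise ValueError(f"chunk_overlap must be between {low} and {high}")

    for index, doc in enumerate(documents):
        # documents[].text is required and must be a string.
        if not isinstance(doc.get("text"), str):
            raise ValueError(f"documents[{index}].text must be a string")
        # Metadata is optional, but its values must all be strings.
        for key, value in doc.get("metadata", {}).items():
            if not isinstance(value, str):
                raise ValueError(
                    f"documents[{index}].metadata[{key!r}] must be a string"
                )

    return {
        "documents": documents,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
    }


payload = build_ingest_payload(
    [{"text": "Section 4.2 ...", "metadata": {"department": "HR"}}],
    chunk_size=1024,
    chunk_overlap=128,
)
```

Numeric metadata such as a page number would need to be converted with `str()` before it passes this check.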

Request

python

import httpx

response = httpx.post(
    "https://api.usejack.io/v1/ingest",
    headers={"Authorization": "Bearer jack_xxxxxxxx"},
    json={
        "documents": [
            {
                "text": "Section 4.2 — Remote Work Policy. Employees are permitted to work remotely...",
                "metadata": {
                    "document_type": "policy",
                    "department": "HR",
                    "version": "2024-Q1"
                }
            },
            {
                "text": "Section 7.1 — Annual Leave. All full-time employees are entitled to 21 days...",
                "metadata": {
                    "document_type": "policy",
                    "department": "HR",
                    "version": "2024-Q1"
                }
            }
        ]
    }
)

print(response.json())

bash

curl -X POST https://api.usejack.io/v1/ingest \
  -H "Authorization: Bearer jack_xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {"text": "...", "metadata": {"document_type": "policy", "department": "HR"}}
    ]
  }'

Response — 202 Accepted

json

{
  "status": "accepted",
  "org_id": "your-org-id",
  "queued": 2,
  "document_ids": [
    "f6c5ea1b-4d9b-4fbf-bb18-b10b97680340",
    "a1b2c3d4-5e6f-7890-abcd-ef1234567890"
  ]
}

Ingest is asynchronous: a 202 response means the documents are queued, not yet indexed. Save the returned document_ids. You need them to poll status via GET /v1/documents/{id} and to delete documents later.
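Polling for completion might look like the sketch below. The response shape of GET /v1/documents/{id} is not described in this section, so the `status` field and the terminal values `"completed"` and `"failed"` are assumptions; `wait_for_document` and `fetch_status_http` are hypothetical helpers. Adjust them to match the actual response.

```python
import time
from typing import Callable, Iterable


def wait_for_document(
    document_id: str,
    fetch_status: Callable[[str], str],
    terminal: Iterable[str] = ("completed", "failed"),  # assumed status values
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `fetch_status` until it returns a terminal status or `timeout` elapses."""
    terminal = set(terminal)
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(document_id)
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document {document_id} still {status!r} after {timeout}s")
        sleep(interval)


def fetch_status_http(document_id: str) -> str:
    """Fetch the status over HTTP. The "status" key is an assumption."""
    import httpx  # imported here so the polling loop itself has no dependencies

    response = httpx.get(
        f"https://api.usejack.io/v1/documents/{document_id}",
        headers={"Authorization": "Bearer jack_xxxxxxxx"},
    )
    response.raise_for_status()
    return response.json()["status"]


# Offline demonstration with a stand-in for the HTTP call.
fake_statuses = iter(["processing", "processing", "completed"])
result = wait_for_document("doc-1", lambda _id: next(fake_statuses), sleep=lambda s: None)
print(result)  # completed
```

In real use, pass `fetch_status_http` as the `fetch_status` argument. Keeping the HTTP call separate from the loop makes the timeout and retry behaviour easy to test without network access.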

POST /v1/ingest/file

Upload a file directly as multipart form data. jack extracts the text internally. Use this when you have files on disk or in memory rather than pre-extracted text.

Supported formats: .pdf, .docx, .txt, .md · Max size: 100 MB
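A local check of extension and size avoids uploading a file that will be rejected. The sketch below is illustrative: `upload_problems` is a hypothetical helper, the size limit is read conservatively as 100,000,000 bytes, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md"}
MAX_FILE_BYTES = 100 * 1000 * 1000  # assumption: "100MB" read as decimal megabytes


def upload_problems(filename: str, size_bytes: int) -> list[str]:
    """Return a list of reasons the file would not qualify; empty if it looks fine."""
    problems = []
    extension = Path(filename).suffix.lower()  # assumption: ".PDF" is treated like ".pdf"
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(no extension)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    return problems


def check_file(path: str) -> list[str]:
    """Run the same checks against a file on disk."""
    file_path = Path(path)
    return upload_problems(file_path.name, file_path.stat().st_size)


print(upload_problems("employee-handbook.pdf", 2_500_000))  # []
print(upload_problems("scan.png", 150_000_000))
```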

Request

python

import httpx

with open("employee-handbook.pdf", "rb") as f:
    response = httpx.post(
        "https://api.usejack.io/v1/ingest/file",
        headers={"Authorization": "Bearer jack_xxxxxxxx"},
        files={"file": ("employee-handbook.pdf", f, "application/pdf")},
        data={
            "metadata": '{"document_type": "handbook", "department": "HR"}'
        }
    )

print(response.json()["document_ids"])

bash

curl -X POST https://api.usejack.io/v1/ingest/file \
  -H "Authorization: Bearer jack_xxxxxxxx" \
  -F "file=@employee-handbook.pdf" \
  -F 'metadata={"document_type": "handbook", "department": "HR"}'

Response — 202 Accepted

json

{
  "status": "accepted",
  "org_id": "your-org-id",
  "queued": 1,
  "document_ids": ["f6c5ea1b-4d9b-4fbf-bb18-b10b97680340"]
}

POST /v1/ingest/url

Pass a URL and jack downloads and indexes the file server-side. Works with pre-signed S3, MinIO, and Cloudflare R2 URLs, or any publicly reachable URL.

The URL must be reachable by jack's servers. Pre-signed URLs must remain valid for at least 60 seconds.
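For pre-signed URLs that use AWS Signature Version 4 query parameters (`X-Amz-Date` and `X-Amz-Expires`), the remaining validity can be estimated before submitting. This sketch assumes the 60-second requirement is measured from the moment of the request; `presigned_seconds_remaining` and `url_is_ingestable` are hypothetical helpers, and URLs signed with another scheme are simply passed through.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import parse_qs, urlparse

MIN_VALIDITY_SECONDS = 60  # minimum validity stated for pre-signed URLs


def presigned_seconds_remaining(url: str, now: Optional[datetime] = None) -> Optional[float]:
    """Seconds until a SigV4 pre-signed URL expires, or None if it is not one."""
    query = parse_qs(urlparse(url).query)
    if "X-Amz-Date" not in query or "X-Amz-Expires" not in query:
        return None
    # X-Amz-Date is the signing time in UTC, e.g. 20240101T000000Z.
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    expires_at = signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
    now = now or datetime.now(timezone.utc)
    return (expires_at - now).total_seconds()


def url_is_ingestable(url: str, now: Optional[datetime] = None) -> bool:
    """Check the scheme and, for SigV4 URLs, that enough validity remains."""
    if urlparse(url).scheme not in ("http", "https"):
        return False
    remaining = presigned_seconds_remaining(url, now)
    return remaining is None or remaining >= MIN_VALIDITY_SECONDS


example = (
    "https://your-bucket.s3.amazonaws.com/contracts/msa-001.pdf"
    "?X-Amz-Date=20240101T000000Z&X-Amz-Expires=300&X-Amz-Signature=abc"
)
one_minute_in = datetime(2024, 1, 1, 0, 1, 0, tzinfo=timezone.utc)
print(presigned_seconds_remaining(example, now=one_minute_in))  # 240.0
```

This only inspects the URL. It does not confirm that jack's servers can actually reach the host.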

Request body

Field	Type	Required	Description
url	string	Yes	HTTP/HTTPS URL pointing to a supported file
metadata	object	No	Arbitrary key-value metadata (string values only)

Request

python

import httpx

response = httpx.post(
    "https://api.usejack.io/v1/ingest/url",
    headers={"Authorization": "Bearer jack_xxxxxxxx"},
    json={
        "url": "https://your-bucket.s3.amazonaws.com/contracts/msa-001.pdf?X-Amz-Signature=...",
        "metadata": {
            "document_type": "contract",
            "client_id": "CLIENT-001"
        }
    }
)

print(response.json()["document_ids"])

bash

curl -X POST https://api.usejack.io/v1/ingest/url \
  -H "Authorization: Bearer jack_xxxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://...", "metadata": {"document_type": "contract"}}'

Response — 202 Accepted

json

{
  "status": "accepted",
  "org_id": "your-org-id",
  "queued": 1,
  "document_ids": ["f6c5ea1b-4d9b-4fbf-bb18-b10b97680340"]
}
