Async Jobs (Beta)

Async jobs let you submit long-running endpoint invocations without waiting for the result. The caller receives a job ID immediately and polls for completion. Jobs are processed through a managed queue with priority lanes, automatic retry, and failure handling.

When to Use Async Jobs

Use async invocation when:

The endpoint processes large datasets or performs expensive computation
You need to decouple the caller from the execution timeline
The operation may take longer than the sync timeout (default 30 seconds)
You want built-in retry on failure

Endpoint must opt in

Only endpoints with async_eligible: true accept async invocations. Attempting to submit an async job to a non-eligible endpoint returns 400 not_async_eligible.

Submitting a Job

Submit an async job by calling the /async/ route on the Service Gateway:

Python:

import httpx

# api_url, org_id, service_id, and token are assumed to be defined by the caller.
resp = httpx.post(
    f"{api_url}/api/v1/orgs/{org_id}/services/{service_id}/async/analyze",
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "X-HSL-Lane": "standard",           # optional: priority lane
        "X-Idempotency-Key": "batch-42",    # optional: prevent duplicates
    },
    json={"content": "Long document...", "analysis_type": "entities"},
)
resp.raise_for_status()  # surfaces errors such as 400 not_async_eligible

job = resp.json()
print(f"Job submitted: {job['job_id']}")
print(f"Poll at: {job['poll_url']}")
# Example response body:
# {
#   "job_id": "abc-123",
#   "status": "staged",
#   "poll_url": "/api/v1/orgs/.../services/.../jobs/abc-123"
# }

CLI:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/analyze" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "X-HSL-Lane: standard" \
  -d '{"content": "Long document...", "analysis_type": "entities"}'

Versioned Async Submission

Target a specific service version:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/v2/async/analyze" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"content": "Long document..."}'

Target Method Header

Async submissions always use POST as the HTTP method. If the underlying endpoint uses a different method (e.g., PUT), specify it with the X-HSL-Target-Method header:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/items/123" \
  -H "X-HSL-Target-Method: PUT" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status": "processed"}'
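
The URL and header rules from the three submission variants above can be collected into one helper. This is a minimal sketch, not part of any official SDK: the function name `build_async_request` and its parameters are hypothetical, and it assumes that a version segment and the target-method header can be combined on the same request, which the examples above do not show explicitly.

```python
from typing import Optional


def build_async_request(
    api_url: str,
    org_id: str,
    service_id: str,
    endpoint_path: str,
    token: str,
    version: Optional[str] = None,        # e.g. "v2" for a versioned submission
    target_method: Optional[str] = None,  # e.g. "PUT" when the endpoint is not POST
    lane: Optional[str] = None,
    idempotency_key: Optional[str] = None,
) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for an async submission. The HTTP method is always POST."""
    base = f"{api_url.rstrip('/')}/api/v1/orgs/{org_id}/services/{service_id}"
    if version:
        base = f"{base}/{version}"
    url = f"{base}/async/{endpoint_path.lstrip('/')}"

    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # Only send the override when the underlying endpoint is not already POST.
    if target_method and target_method.upper() != "POST":
        headers["X-HSL-Target-Method"] = target_method.upper()
    if lane:
        headers["X-HSL-Lane"] = lane
    if idempotency_key:
        headers["X-Idempotency-Key"] = idempotency_key
    return url, headers


# Mirrors the PUT example above; the host name is a placeholder.
url, headers = build_async_request(
    "https://api.example.com", "org-1", "svc-1", "items/123",
    token="TOKEN", target_method="put",
)
print(url)
print(headers)
```

The returned pair can be passed to any HTTP client, for example `httpx.post(url, headers=headers, json=payload)`.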

Job Lifecycle

stateDiagram-v2
    [*] --> Staged: Job submitted
    Staged --> Queued: Enqueued
    Queued --> Dispatching: Picked up
    Dispatching --> Processing: Execution started
    Processing --> Completed: Success
    Processing --> Retrying: Transient failure
    Retrying --> Queued: Re-enqueued
    Retrying --> Failed: Max retries exhausted
    Processing --> CancelRequested: Cancel requested
    Staged --> CancelRequested: Cancel requested
    Queued --> CancelRequested: Cancel requested
    CancelRequested --> Cancelled: Acknowledged
    Failed --> [*]
    Completed --> [*]
    Cancelled --> [*]

Job Statuses

Status	Description
`staged`	Job record created, not yet submitted to the queue
`queued`	Submitted to the job queue
`dispatching`	Job picked up, preparing execution
`processing`	Endpoint code is executing
`completed`	Execution finished successfully
`retrying`	Failed, will be retried
`failed`	Execution failed with a non-retryable error, or all retry attempts were exhausted
`cancel_requested`	Cancellation requested, waiting for acknowledgement
`cancelled`	Job cancelled
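
For client-side checks, the lifecycle diagram above can be mirrored as a small transition table. The sketch below is illustrative only: the names `TRANSITIONS`, `is_terminal`, and `can_transition` are hypothetical, and the table encodes only the edges drawn in the diagram (a direct move from `processing` to `failed` on a non-retryable error is implied by the status table but not drawn, so it is left out here).

```python
# Allowed next statuses for each status, copied from the lifecycle diagram.
TRANSITIONS = {
    "staged": {"queued", "cancel_requested"},
    "queued": {"dispatching", "cancel_requested"},
    "dispatching": {"processing"},
    "processing": {"completed", "retrying", "cancel_requested"},
    "retrying": {"queued", "failed"},
    "cancel_requested": {"cancelled"},
    "completed": set(),
    "failed": set(),
    "cancelled": set(),
}

# A status is terminal when the diagram shows no outgoing edge for it.
TERMINAL = {status for status, nxt in TRANSITIONS.items() if not nxt}


def is_terminal(status: str) -> bool:
    """True once polling can stop."""
    return status in TERMINAL


def can_transition(src: str, dst: str) -> bool:
    """True if the diagram has an edge from src to dst."""
    return dst in TRANSITIONS.get(src, set())


print(sorted(TERMINAL))
print(can_transition("retrying", "queued"))
print(can_transition("completed", "queued"))
```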

Polling for Status

Poll the job status endpoint to track progress:

Python:

import time

import httpx

# api_url, org_id, service_id, job_id, and token are assumed to be defined by the caller.
poll_url = f"{api_url}/api/v1/orgs/{org_id}/services/{service_id}/jobs/{job_id}"

while True:
    resp = httpx.get(poll_url, headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
    job = resp.json()

    print(f"Status: {job['status']}")

    # Stop once the job reaches a terminal state
    if job["status"] in ("completed", "failed", "cancelled"):
        break

    time.sleep(2)  # Poll every 2 seconds

if job["status"] == "completed":
    print(f"Result: {job['result_summary']}")
else:
    print(f"Error: {job.get('last_error')}")

CLI:

curl "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/jobs/$JOB_ID" \
  -H "Authorization: Bearer $TOKEN"
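
A fixed two-second loop with no deadline can run forever if a job stalls. One possible refinement, sketched below, adds an overall timeout and exponential backoff. The function `wait_for_job` and its parameters are hypothetical; the HTTP call is injected as `fetch_job` so the sketch runs without network access, and the delay values are arbitrary rather than platform recommendations.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    timeout_s: float = 300.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job is terminal, doubling the delay up to max_delay_s."""
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Give up if the next sleep would run past the deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout_s}s")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)


# Demo with canned responses instead of real HTTP calls.
responses = iter([{"status": "queued"}, {"status": "processing"}, {"status": "completed"}])
delays = []
final = wait_for_job(lambda: next(responses), sleep=delays.append)
print(final["status"], delays)
```

In real use, `fetch_job` would wrap the GET request shown above, for example `lambda: httpx.get(poll_url, headers=auth_headers).json()`.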

Batch Status Check

Check multiple jobs in a single request:

curl "$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/jobs/batch-status?job_ids=$JOB_1,$JOB_2,$JOB_3" \
  -H "Authorization: Bearer $TOKEN"

Job Response Fields

{
  "id": "abc-123",
  "org_id": "...",
  "workspace_id": "...",
  "service_id": "...",
  "version_id": "...",
  "endpoint_id": "...",
  "endpoint_key": "post__analyze",
  "execution_mode": "invocation",
  "lane": "standard",
  "status": "completed",
  "idempotency_key": "batch-42",
  "execution_id": null,
  "attempt_count": 1,
  "result_summary": {
    "result": {"entities": ["..."]},
    "confidence": 0.95
  },
  "last_error": null,
  "pre_dispatch_age_seconds": 0.042,
  "stalled": false,
  "stalled_reason": null,
  "started_at": "2025-03-15T10:00:01Z",
  "completed_at": "2025-03-15T10:00:15Z",
  "expires_at": null,
  "created_at": "2025-03-15T10:00:00Z",
  "updated_at": "2025-03-15T10:00:15Z"
}
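
The timestamp fields make it straightforward to derive how long a job waited and how long it ran. The sketch below assumes that `started_at` minus `created_at` approximates time spent before execution and that `completed_at` minus `started_at` is the execution time; the response above does not define these intervals formally, and the helper names are hypothetical.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def job_timings(job: dict) -> dict[str, Optional[float]]:
    """Return wait and run durations in seconds; None when a timestamp is missing."""
    created = parse_ts(job["created_at"])
    started = parse_ts(job["started_at"]) if job.get("started_at") else None
    completed = parse_ts(job["completed_at"]) if job.get("completed_at") else None
    return {
        "wait_seconds": (started - created).total_seconds() if started else None,
        "run_seconds": (
            (completed - started).total_seconds() if started and completed else None
        ),
    }


# Timestamps taken from the example response above.
example = {
    "created_at": "2025-03-15T10:00:00Z",
    "started_at": "2025-03-15T10:00:01Z",
    "completed_at": "2025-03-15T10:00:15Z",
}
timings = job_timings(example)
print(timings)
```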

Priority Lanes

Jobs are assigned to priority lanes that control processing order and retry behavior. Specify the lane with the X-HSL-Lane header:

Lane	Max Retries	Ack Timeout	Use case
`urgent`	5	30s	Time-sensitive operations
`standard`	3	120s	Normal workloads (default)
`bulk`	3	300s	Batch processing, large data

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/process" \
  -H "X-HSL-Lane: urgent" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"priority": "high", "data": "..."}'

Each lane has independent concurrency and retry settings, so urgent jobs are not blocked behind a backlog of bulk work.
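
A client can validate the lane name before sending a request so that typos fail locally. The sketch below copies the values from the lane table above; `LANES`, `DEFAULT_LANE`, and `lane_header` are hypothetical names, and the platform's own behavior for unknown lane names is not documented here.

```python
# Values copied from the lane table above.
LANES = {
    "urgent": {"max_retries": 5, "ack_timeout_s": 30},
    "standard": {"max_retries": 3, "ack_timeout_s": 120},
    "bulk": {"max_retries": 3, "ack_timeout_s": 300},
}
DEFAULT_LANE = "standard"


def lane_header(lane: str = DEFAULT_LANE) -> dict[str, str]:
    """Return the X-HSL-Lane header, rejecting lane names not in the table."""
    if lane not in LANES:
        raise ValueError(f"unknown lane {lane!r}; expected one of {sorted(LANES)}")
    return {"X-HSL-Lane": lane}


print(lane_header("urgent"))
print(LANES["bulk"]["ack_timeout_s"])
```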

Idempotency

Prevent duplicate job submissions by including an idempotency key:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/process" \
  -H "X-Idempotency-Key: import-batch-2025-03-15" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"batch_id": "2025-03-15"}'

Idempotency keys are scoped per organization. If a job with the same key already exists in the organization, the platform returns the existing job instead of creating a duplicate.
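
When no natural key such as a batch date is available, one option is to derive the key from the request itself so that a retried submission reuses it. This is a sketch of that idea, not a platform requirement: the function `idempotency_key` and the `operation` prefix are hypothetical, and any length or character limits the platform places on keys are not documented here.

```python
import hashlib
import json


def idempotency_key(operation: str, payload: dict) -> str:
    """Derive a stable key from an operation name and the JSON payload."""
    # Sorting keys makes the digest independent of dict insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{operation}-{digest[:32]}"


key_a = idempotency_key("import", {"batch_id": "2025-03-15", "source": "s3"})
key_b = idempotency_key("import", {"source": "s3", "batch_id": "2025-03-15"})
print(key_a == key_b)
```

Because identical payloads always map to the same key, this approach suits operations that should run once per distinct input. It is the wrong choice when the same payload must be processed repeatedly.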

Job Cancellation

Request cancellation of a staged, queued, or processing job:

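Python:

import httpx

# api_url, org_id, service_id, job_id, and token are assumed to be defined by the caller.
resp = httpx.post(
    f"{api_url}/api/v1/orgs/{org_id}/services/{service_id}/jobs/{job_id}/cancel",
    headers={"Authorization": f"Bearer {token}"},
)
print(f"Cancel result ({resp.status_code}): {resp.json()}")

CLI:
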
curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/jobs/$JOB_ID/cancel" \
  -H "Authorization: Bearer $TOKEN"

Cancellation is best-effort

If the job is already in the processing state, the platform requests cancellation, but the running code may complete before the signal is received. A cancel request for a job that is already completed, failed, or cancelled is rejected with 409 Conflict.

Cancellation flow

1. The job status is set to cancel_requested.
2. At the next check point in the dispatcher, the cancellation signal is detected.
3. If the job has a child execution (persistent mode), the child execution is also cancelled.
4. The job transitions to cancelled.
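
Because cancellation is best-effort and a terminal job answers with 409 Conflict, callers usually want to treat 409 as a normal outcome rather than an error. The sketch below shows one way to do that. The function `request_cancel` and its return strings are hypothetical, the HTTP call is injected as `post` (returning only a status code) so the sketch runs offline, and the shape of the success response body is not assumed.

```python
from typing import Callable


def request_cancel(post: Callable[[str], int], cancel_url: str) -> str:
    """Send a cancel request and classify the outcome by HTTP status code."""
    status_code = post(cancel_url)
    if status_code == 409:
        # The job had already reached completed, failed, or cancelled.
        return "already_terminal"
    if 200 <= status_code < 300:
        # Accepted; keep polling until the job reports a terminal status.
        return "cancel_requested"
    raise RuntimeError(f"unexpected status {status_code} from {cancel_url}")


# Demo with stub transports instead of real HTTP calls.
print(request_cancel(lambda url: 200, "/jobs/abc-123/cancel"))
print(request_cancel(lambda url: 409, "/jobs/abc-123/cancel"))
```

After a successful request, the job may still end as completed if the running code finished before the signal arrived, so the final status should come from polling.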

Retry and Failure Handling

When a job fails, the dispatcher follows the lane's retry policy:

First failure -- job status set to retrying, message re-enqueued for retry.
Subsequent failures -- same process, with attempt_count incrementing.
Max retries exhausted -- job moves to failed status and an alert is generated.

The error is captured in the job's last_error field:

{
  "last_error": {
    "code": "max_retries_exhausted",
    "message": "Platform execution failed after retries"
  }
}

Once a job reaches the failed status, the platform does not retry it again. Investigate failed jobs manually through the job listing API or the platform UI.
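
The retry sequence above can be modeled with a short simulation. This is an illustration of the described policy, not the dispatcher's actual code. It assumes that a lane's "Max Retries" counts retries after the first attempt and that `attempt_count` counts every attempt including the first; neither detail is stated explicitly above. The function `simulate_retries` is hypothetical.

```python
def simulate_retries(outcomes: list[bool], max_retries: int) -> tuple[str, int]:
    """Replay attempt outcomes (True = success) and return (status, attempt_count)."""
    retries_used = 0
    attempt_count = 0
    for succeeded in outcomes:
        attempt_count += 1
        if succeeded:
            return "completed", attempt_count
        if retries_used >= max_retries:
            # No retries left: the job moves to failed.
            return "failed", attempt_count
        retries_used += 1  # job goes to retrying and is re-enqueued
    # Outcomes ran out while the job was still eligible for another attempt.
    return "retrying", attempt_count


# Standard lane (3 retries): two transient failures, then success.
print(simulate_retries([False, False, True], max_retries=3))
# Standard lane: every attempt fails.
print(simulate_retries([False] * 4, max_retries=3))
```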

Listing Jobs

List all jobs for a service:

curl "$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/jobs?offset=0&limit=50" \
  -H "Authorization: Bearer $TOKEN"

Response:

{
  "items": [
    {"id": "...", "status": "completed", "lane": "standard", "...": "..."},
    {"id": "...", "status": "processing", "lane": "urgent", "...": "..."}
  ],
  "total": 42
}
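
Since the listing is paginated with offset and limit and the response reports a total, a client can walk all pages with a small generator. The sketch below is a minimal version: `iter_jobs` is a hypothetical name, and the page fetch is injected as `fetch_page(offset, limit)` so the example runs against in-memory data instead of the real endpoint.

```python
from typing import Callable, Iterator


def iter_jobs(
    fetch_page: Callable[[int, int], dict], limit: int = 50
) -> Iterator[dict]:
    """Yield every job by requesting pages until 'total' is reached."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop on an empty page as well, in case 'total' changes between requests.
        if not items or offset >= page["total"]:
            return


# Demo against an in-memory list standing in for the listing endpoint.
all_jobs = [{"id": f"job-{i}", "status": "completed"} for i in range(5)]
requested_offsets = []


def fake_fetch(offset: int, limit: int) -> dict:
    requested_offsets.append(offset)
    return {"items": all_jobs[offset:offset + limit], "total": len(all_jobs)}


ids = [job["id"] for job in iter_jobs(fake_fetch, limit=2)]
print(ids)
print(requested_offsets)
```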

Job Authentication

Async jobs inherit the authentication context of the original submission. When polling or cancelling a job:

Platform auth endpoints -- the same user (by user_id) who submitted the job must poll/cancel it.
Service token endpoints -- the same service token (matching custom claims) must be used.
No-auth endpoints -- anyone can poll the job.

For authenticated endpoints, this ensures that job results are accessible only to the caller who submitted the work. Jobs on no-auth endpoints have no such restriction.
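
The three access rules above can be expressed as a single predicate. This is a conceptual sketch of the rules as described, not the platform's implementation: the `AuthContext` type, its field names, and the mode strings are hypothetical, and "matching custom claims" is assumed here to mean exact equality of the claims dictionaries.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuthContext:
    mode: str  # "platform", "service_token", or "none" (hypothetical labels)
    user_id: Optional[str] = None
    claims: dict = field(default_factory=dict)


def may_access(submitter: AuthContext, caller: AuthContext) -> bool:
    """Return True if the caller may poll a job created under the submitter's context."""
    if submitter.mode == "none":
        return True  # no-auth endpoint: anyone can poll
    if caller.mode != submitter.mode:
        return False
    if submitter.mode == "platform":
        return caller.user_id is not None and caller.user_id == submitter.user_id
    if submitter.mode == "service_token":
        return caller.claims == submitter.claims
    return False


alice = AuthContext("platform", user_id="u-1")
bob = AuthContext("platform", user_id="u-2")
print(may_access(alice, alice), may_access(alice, bob))
```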