Async Jobs (Beta)

Async jobs let you submit long-running endpoint invocations without waiting for the result. The caller receives a job ID immediately and polls for completion. Jobs are processed through a managed queue with priority lanes, automatic retry, and failure handling.

When to Use Async Jobs

Use async invocation when:

  • The endpoint processes large datasets or performs expensive computation
  • You need to decouple the caller from the execution timeline
  • The operation may take longer than the sync timeout (default 30 seconds)
  • You want built-in retry on failure

Endpoint must opt in

Only endpoints with async_eligible: true accept async invocations. Attempting to submit an async job to a non-eligible endpoint returns 400 not_async_eligible.

Submitting a Job

Submit an async job by calling the /async/ route on the Service Gateway:

import httpx

resp = httpx.post(
    f"{api_url}/api/v1/orgs/{org_id}/services/{service_id}/async/analyze",
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "X-HSL-Lane": "standard",           # optional: priority lane
        "X-Idempotency-Key": "batch-42",    # optional: prevent duplicates
    },
    json={"content": "Long document...", "analysis_type": "entities"},
)

job = resp.json()
print(f"Job submitted: {job['job_id']}")
print(f"Poll at: {job['poll_url']}")
# {
#   "job_id": "abc-123",
#   "status": "staged",
#   "poll_url": "/api/v1/orgs/.../services/.../jobs/abc-123"
# }

The same submission with curl:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/analyze" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "X-HSL-Lane: standard" \
  -d '{"content": "Long document...", "analysis_type": "entities"}'

Versioned Async Submission

Target a specific service version:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/v2/async/analyze" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"content": "Long document..."}'

Target Method Header

Async submissions always use POST as the HTTP method. If the underlying endpoint uses a different method (e.g., PUT), specify it with the X-HSL-Target-Method header:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/items/123" \
  -H "X-HSL-Target-Method: PUT" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status": "processed"}'
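The submission variants above (plain, versioned, and target-method override) differ only in the URL and headers. The following Python sketch assembles them in one place. The helper name `build_async_submission` and the example host are hypothetical; only the route shapes and header names are taken from this page.

```python
from typing import Optional


def build_async_submission(
    api_url: str,
    org_id: str,
    service_id: str,
    endpoint_path: str,
    token: str,
    *,
    version: Optional[str] = None,
    target_method: Optional[str] = None,
    lane: Optional[str] = None,
    idempotency_key: Optional[str] = None,
) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a POST to the /async/ route (hypothetical helper)."""
    base = f"{api_url.rstrip('/')}/api/v1/orgs/{org_id}/services/{service_id}"
    if version:
        # Versioned submission: .../services/{service_id}/v2/async/...
        base = f"{base}/{version}"
    url = f"{base}/async/{endpoint_path.lstrip('/')}"

    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # The submission itself is always a POST; the header names the
    # method of the underlying endpoint when it is not POST.
    if target_method and target_method.upper() != "POST":
        headers["X-HSL-Target-Method"] = target_method.upper()
    if lane:
        headers["X-HSL-Lane"] = lane
    if idempotency_key:
        headers["X-Idempotency-Key"] = idempotency_key
    return url, headers


url, headers = build_async_submission(
    "https://api.example.com", "org-1", "svc-1", "items/123", "TOKEN",
    target_method="put",
)
print(url)
print(headers["X-HSL-Target-Method"])
```

The returned pair can be passed directly to `httpx.post(url, headers=headers, json=payload)`.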

Job Lifecycle

stateDiagram-v2
    [*] --> Staged: Job submitted
    Staged --> Queued: Enqueued
    Queued --> Dispatching: Picked up
    Dispatching --> Processing: Execution started
    Processing --> Completed: Success
    Processing --> Retrying: Transient failure
    Retrying --> Queued: Re-enqueued
    Retrying --> Failed: Max retries exhausted
    Processing --> CancelRequested: Cancel requested
    Staged --> CancelRequested: Cancel requested
    Queued --> CancelRequested: Cancel requested
    CancelRequested --> Cancelled: Acknowledged
    Failed --> [*]
    Completed --> [*]
    Cancelled --> [*]

Job Statuses

| Status | Description |
| --- | --- |
| staged | Job record created, not yet submitted to the queue |
| queued | Submitted to the job queue |
| dispatching | Job picked up, preparing execution |
| processing | Endpoint code is executing |
| completed | Execution finished successfully |
| retrying | Attempt failed; the job will be retried |
| failed | Execution failed with a non-retryable error, or all retry attempts were exhausted |
| cancel_requested | Cancellation requested, waiting for acknowledgement |
| cancelled | Job cancelled |
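Client code often needs to know which statuses are terminal and which transitions are legal. The sketch below encodes only the edges drawn in the lifecycle diagram as a lookup table. The names `TRANSITIONS`, `is_terminal`, and `can_transition` are illustrative and not part of the platform API, and any edge the diagram does not draw (for example a direct move from processing to failed) is left out.

```python
# Edges copied from the lifecycle diagram; nothing here is queried from the platform.
TRANSITIONS = {
    "staged": {"queued", "cancel_requested"},
    "queued": {"dispatching", "cancel_requested"},
    "dispatching": {"processing"},
    "processing": {"completed", "retrying", "cancel_requested"},
    "retrying": {"queued", "failed"},
    "cancel_requested": {"cancelled"},
    "completed": set(),
    "failed": set(),
    "cancelled": set(),
}

# A status with no outgoing edges is terminal.
TERMINAL = {status for status, nxt in TRANSITIONS.items() if not nxt}


def is_terminal(status: str) -> bool:
    """True once a job can no longer change state."""
    return status in TERMINAL


def can_transition(current: str, new: str) -> bool:
    """True if the diagram draws an edge from `current` to `new`."""
    return new in TRANSITIONS.get(current, set())


print(sorted(TERMINAL))
print(can_transition("retrying", "queued"))
```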

Polling for Status

Poll the job status endpoint to track progress:

import time

import httpx

poll_url = f"{api_url}/api/v1/orgs/{org_id}/services/{service_id}/jobs/{job_id}"

while True:
    resp = httpx.get(poll_url, headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()  # Stop on HTTP errors instead of parsing an error body
    job = resp.json()

    print(f"Status: {job['status']}")

    # completed, failed, and cancelled are terminal states
    if job["status"] in ("completed", "failed", "cancelled"):
        break

    time.sleep(2)  # Poll every 2 seconds

if job["status"] == "completed":
    print(f"Result: {job['result_summary']}")
else:
    print(f"Error: {job.get('last_error')}")

The same status check with curl:

curl "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/jobs/$JOB_ID" \
  -H "Authorization: Bearer $TOKEN"
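A fixed-interval loop with no deadline polls forever if a job stalls. One possible refinement, sketched here under assumptions, adds an overall timeout and exponential backoff. The function name `wait_for_job`, its parameters, and the default intervals are illustrative choices and not platform recommendations; `fetch_job` is any callable that returns the job JSON as a dict.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    *,
    timeout: float = 300.0,
    initial_delay: float = 1.0,
    max_delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the job reaches a terminal status or the deadline passes."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Give up if the next wait would run past the deadline.
        if clock() + delay > deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout}s")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # back off: 1s, 2s, 4s, ... capped


# Demo with canned responses instead of real HTTP calls.
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "result_summary": {"ok": True}},
])
final = wait_for_job(lambda: next(responses), sleep=lambda _s: None)
print(final["status"])
```

In real use, `fetch_job` would wrap the `httpx.get(poll_url, ...)` call shown above and return `resp.json()`.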

Batch Status Check

Check multiple jobs in a single request:

curl "$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/jobs/batch-status?job_ids=$JOB_1,$JOB_2,$JOB_3" \
  -H "Authorization: Bearer $TOKEN"

Job Response Fields

{
  "id": "abc-123",
  "org_id": "...",
  "workspace_id": "...",
  "service_id": "...",
  "version_id": "...",
  "endpoint_id": "...",
  "endpoint_key": "post__analyze",
  "execution_mode": "invocation",
  "lane": "standard",
  "status": "completed",
  "idempotency_key": "batch-42",
  "execution_id": null,
  "attempt_count": 1,
  "result_summary": {
    "result": {"entities": ["..."]},
    "confidence": 0.95
  },
  "last_error": null,
  "pre_dispatch_age_seconds": 0.042,
  "stalled": false,
  "stalled_reason": null,
  "started_at": "2025-03-15T10:00:01Z",
  "completed_at": "2025-03-15T10:00:15Z",
  "expires_at": null,
  "created_at": "2025-03-15T10:00:00Z",
  "updated_at": "2025-03-15T10:00:15Z"
}
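The timestamps in the response can be turned into simple timing metrics. The sketch below assumes that `created_at` marks submission and `started_at` marks the start of execution, which this page does not state explicitly. The helper names `parse_ts` and `job_timings` are hypothetical.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2025-03-15T10:00:01Z."""
    if value is None:
        return None
    # Replace the trailing Z so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def job_timings(job: dict) -> dict:
    """Return queue wait and run time in seconds, or None when not yet known."""
    created = parse_ts(job.get("created_at"))
    started = parse_ts(job.get("started_at"))
    completed = parse_ts(job.get("completed_at"))
    return {
        "queue_wait_seconds": (
            (started - created).total_seconds() if created and started else None
        ),
        "run_seconds": (
            (completed - started).total_seconds() if started and completed else None
        ),
    }


sample = {
    "created_at": "2025-03-15T10:00:00Z",
    "started_at": "2025-03-15T10:00:01Z",
    "completed_at": "2025-03-15T10:00:15Z",
}
print(job_timings(sample))
```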

Priority Lanes

Jobs are assigned to priority lanes that control processing order and retry behavior. Specify the lane with the X-HSL-Lane header:

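| Lane | Max Retries | Ack Timeout | Use case |
| --- | --- | --- | --- |
| urgent | 5 | 30s | Time-sensitive operations |
| standard | 3 | 120s | Normal workloads (default) |
| bulk | 3 | 300s | Batch processing, large data |

For example, to submit a job to the urgent lane:

curl -X POST \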
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/process" \
  -H "X-HSL-Lane: urgent" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"priority": "high", "data": "..."}'

Each lane has independent concurrency and retry settings, so urgent jobs are not blocked behind a backlog of bulk work.
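A client can validate the lane name before sending a request. The sketch below mirrors the lane table as a local dictionary. The values are copied from this page and are not fetched from the platform, and the names `LANES` and `lane_header` are illustrative.

```python
from typing import Optional

# Local copy of the lane table; update it if the platform settings change.
LANES = {
    "urgent": {"max_retries": 5, "ack_timeout_seconds": 30},
    "standard": {"max_retries": 3, "ack_timeout_seconds": 120},
    "bulk": {"max_retries": 3, "ack_timeout_seconds": 300},
}
DEFAULT_LANE = "standard"


def lane_header(lane: Optional[str] = None) -> dict:
    """Return the X-HSL-Lane header, rejecting unknown lane names early."""
    name = lane or DEFAULT_LANE
    if name not in LANES:
        raise ValueError(f"unknown lane {name!r}; expected one of {sorted(LANES)}")
    return {"X-HSL-Lane": name}


print(lane_header("urgent"))
print(lane_header())
```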

Idempotency

Prevent duplicate job submissions by including an idempotency key:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/process" \
  -H "X-Idempotency-Key: import-batch-2025-03-15" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"batch_id": "2025-03-15"}'

Idempotency keys are scoped per organization: if a job with the same key already exists in the organization, the platform returns the existing job instead of creating a duplicate.
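An idempotency key only prevents duplicates if retries of the same logical request reuse the same key. One way to guarantee that, sketched here as an assumption and not as a platform requirement, is to derive the key from the request payload. The function name `make_idempotency_key` is hypothetical, and any length or character limits the platform places on keys are not covered on this page.

```python
import hashlib
import json


def make_idempotency_key(prefix: str, payload: dict) -> str:
    """Derive a stable key: same payload (in any key order) gives the same key."""
    # Canonical JSON: sorted keys and fixed separators make the hash deterministic.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}-{digest}"


key_a = make_idempotency_key("import-batch", {"batch_id": "2025-03-15", "source": "s3"})
key_b = make_idempotency_key("import-batch", {"source": "s3", "batch_id": "2025-03-15"})
print(key_a == key_b)
```

Because keys are scoped per organization, the prefix should be specific enough that unrelated services in the same organization do not collide.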

Job Cancellation

Request cancellation of a running or queued job:

import httpx

resp = httpx.post(
    f"{api_url}/api/v1/orgs/{org_id}/services/{service_id}/jobs/{job_id}/cancel",
    headers={"Authorization": f"Bearer {token}"},
)
print(f"Cancel result: {resp.json()}")

The same request with curl:

curl -X POST \
  "$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/jobs/$JOB_ID/cancel" \
  -H "Authorization: Bearer $TOKEN"

Cancellation is best-effort

If the job is already in the processing state, the platform requests cancellation but the running code may complete before the signal is received. Jobs in completed, failed, or cancelled states cannot be cancelled and return 409 Conflict.

Cancellation flow

  1. The job status is set to cancel_requested.
  2. At the next checkpoint in the dispatcher, the cancellation signal is detected.
  3. If the job has a child execution (persistent mode), the child execution is also cancelled.
  4. The job transitions to cancelled.
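Because cancellation is best-effort, a caller should confirm which terminal status the job actually reached. The sketch below requests cancellation and then polls. It assumes any 2xx response means the request was accepted (the exact success code is not documented here) and treats 409 as "already finished". The name `cancel_and_confirm` and the injected callables are hypothetical.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def cancel_and_confirm(
    send_cancel: Callable[[], int],
    fetch_job: Callable[[], dict],
    *,
    poll_interval: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Request cancellation and return the terminal status the job ends in."""
    status_code = send_cancel()  # HTTP status of the POST .../cancel call
    # Assumption: any 2xx means accepted; 409 means the job was already terminal.
    if status_code != 409 and not 200 <= status_code < 300:
        raise RuntimeError(f"cancel request failed with HTTP {status_code}")
    for _ in range(max_polls):
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            # May be "completed" if the code finished before the signal arrived.
            return job["status"]
        sleep(poll_interval)
    raise TimeoutError("job did not reach a terminal status")


# Demo with canned responses instead of real HTTP calls.
states = iter([{"status": "cancel_requested"}, {"status": "cancelled"}])
outcome = cancel_and_confirm(lambda: 202, lambda: next(states), sleep=lambda _s: None)
print(outcome)
```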

Retry and Failure Handling

When a job fails, the dispatcher follows the lane's retry policy:

  1. First failure -- job status set to retrying, message re-enqueued for retry.
  2. Subsequent failures -- same process, with attempt_count incrementing.
  3. Max retries exhausted -- job moves to failed status and an alert is generated.

The error is captured in the job's last_error field:

{
  "last_error": {
    "code": "max_retries_exhausted",
    "message": "Platform execution failed after retries"
  }
}

Once a job reaches the failed status, the platform does not retry it again. Investigate failed jobs manually through the job listing API or the platform UI.
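The retry steps above can be modelled with a small simulation. This is a sketch under a stated assumption: a lane's max retries counts re-attempts after the first attempt, so a lane with 3 retries allows up to 4 attempts in total. This page does not confirm that interpretation, and `simulate_attempts` is a hypothetical name.

```python
from typing import Iterable, Tuple


def simulate_attempts(outcomes: Iterable[bool], max_retries: int) -> Tuple[str, int]:
    """Walk through attempt outcomes (True = success) and return (status, attempt_count)."""
    attempt_count = 0
    for succeeded in outcomes:
        attempt_count += 1
        if succeeded:
            return "completed", attempt_count
        # ASSUMPTION: max_retries excludes the first attempt.
        if attempt_count > max_retries:
            return "failed", attempt_count
        # Otherwise the job is marked retrying and re-enqueued.
    # Ran out of recorded outcomes while retries were still available.
    return "retrying", attempt_count


print(simulate_attempts([False, False, True], max_retries=3))
print(simulate_attempts([False] * 10, max_retries=3))
```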

Listing Jobs

List all jobs for a service:

curl "$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/jobs?offset=0&limit=50" \
  -H "Authorization: Bearer $TOKEN"

Response:

{
  "items": [
    {"id": "...", "status": "completed", "lane": "standard", "...": "..."},
    {"id": "...", "status": "processing", "lane": "urgent", "...": "..."}
  ],
  "total": 42
}
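To walk through every job, a client can advance `offset` until it reaches `total`. The generator below is a minimal sketch of that loop. `iter_jobs` and `fetch_page` are hypothetical names, and `fetch_page(offset, limit)` is assumed to return the response body shown above as a dict.

```python
from typing import Callable, Iterator


def iter_jobs(
    fetch_page: Callable[[int, int], dict],
    *,
    page_size: int = 50,
) -> Iterator[dict]:
    """Yield every job by advancing offset until total is reached."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop on an empty page too, in case total changes between requests.
        if not items or offset >= page["total"]:
            break


# Demo with an in-memory list standing in for the listing endpoint.
FAKE_JOBS = [{"id": f"job-{i}", "status": "completed"} for i in range(5)]
calls = []


def fake_fetch(offset: int, limit: int) -> dict:
    calls.append(offset)
    return {"items": FAKE_JOBS[offset:offset + limit], "total": len(FAKE_JOBS)}


job_ids = [job["id"] for job in iter_jobs(fake_fetch, page_size=2)]
print(job_ids)
```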

Job Authentication

Async jobs inherit the authentication context of the original submission. When polling or cancelling a job:

  • Platform auth endpoints -- the same user (by user_id) who submitted the job must poll/cancel it.
  • Service token endpoints -- the same service token (matching custom claims) must be used.
  • No-auth endpoints -- anyone can poll the job.

This ensures that job results are only accessible to the caller who submitted the work.