Async Jobs (Beta)¶
Async jobs let you submit long-running endpoint invocations without waiting for the result. The caller receives a job ID immediately and polls for completion. Jobs are processed through a managed queue with priority lanes, automatic retry, and failure handling.
When to Use Async Jobs¶
Use async invocation when:
- The endpoint processes large datasets or performs expensive computation
- You need to decouple the caller from the execution timeline
- The operation may take longer than the sync timeout (default 30 seconds)
- You want built-in retry on failure
Endpoint must opt in
Only endpoints with async_eligible: true accept async invocations. Attempting to submit an async job to a non-eligible endpoint returns 400 not_async_eligible.
Submitting a Job¶
Submit an async job by calling the /async/ route on the Service Gateway:
import httpx

resp = httpx.post(
    f"{api_url}/api/v1/orgs/{org_id}/services/{service_id}/async/analyze",
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "X-HSL-Lane": "standard",          # optional: priority lane
        "X-Idempotency-Key": "batch-42",   # optional: prevent duplicates
    },
    json={"content": "Long document...", "analysis_type": "entities"},
)
job = resp.json()
print(f"Job submitted: {job['job_id']}")
print(f"Poll at: {job['poll_url']}")
# {
#   "job_id": "abc-123",
#   "status": "staged",
#   "poll_url": "/api/v1/orgs/.../services/.../jobs/abc-123"
# }
Versioned Async Submission¶
Target a specific service version:
curl -X POST \
"$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/v2/async/analyze" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"content": "Long document..."}'
Target Method Header¶
Async submissions always use POST as the HTTP method. If the underlying endpoint uses a different method (e.g., PUT), specify it with the X-HSL-Target-Method header:
curl -X POST \
"$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/items/123" \
-H "X-HSL-Target-Method: PUT" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"status": "processed"}'
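The three submission variants above (plain, versioned, and target-method override) differ only in the URL path and a few headers. As a minimal sketch, a helper like the following could assemble both. The function name `build_async_request` and its parameters are hypothetical and not part of the platform; only the URL shape and header names are taken from the examples above.

```python
from typing import Dict, Optional, Tuple


def build_async_request(
    api_url: str,
    org_id: str,
    service_id: str,
    endpoint_path: str,
    token: str,
    *,
    version: Optional[str] = None,
    lane: Optional[str] = None,
    idempotency_key: Optional[str] = None,
    target_method: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for an async submission (hypothetical helper).

    The HTTP request itself is always sent as a POST.
    """
    base = f"{api_url.rstrip('/')}/api/v1/orgs/{org_id}/services/{service_id}"
    if version:
        base += f"/{version}"  # e.g. ".../services/<id>/v2/async/analyze"
    url = f"{base}/async/{endpoint_path.lstrip('/')}"

    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if lane:
        headers["X-HSL-Lane"] = lane
    if idempotency_key:
        headers["X-Idempotency-Key"] = idempotency_key
    if target_method:
        # Method of the underlying endpoint when it is not POST
        headers["X-HSL-Target-Method"] = target_method.upper()
    return url, headers


# Placeholder values, for illustration only
url, headers = build_async_request(
    "https://api.example.com", "org-1", "svc-1", "items/123", "TOKEN",
    version="v2", target_method="put",
)
print(url)
print(headers["X-HSL-Target-Method"])
```

The returned pair can be passed directly to `httpx.post(url, headers=headers, json=payload)`.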
Job Lifecycle¶
stateDiagram-v2
[*] --> Staged: Job submitted
Staged --> Queued: Enqueued
Queued --> Dispatching: Picked up
Dispatching --> Processing: Execution started
Processing --> Completed: Success
Processing --> Retrying: Transient failure
Retrying --> Queued: Re-enqueued
Retrying --> Failed: Max retries exhausted
Processing --> CancelRequested: Cancel requested
Staged --> CancelRequested: Cancel requested
Queued --> CancelRequested: Cancel requested
CancelRequested --> Cancelled: Acknowledged
Failed --> [*]
Completed --> [*]
Cancelled --> [*]
Job Statuses¶
| Status | Description |
|---|---|
| staged | Job record created, not yet submitted to the queue |
| queued | Submitted to the job queue |
| dispatching | Job picked up, preparing execution |
| processing | Endpoint code is executing |
| completed | Execution finished successfully |
| retrying | Failed, will be retried |
| failed | Execution failed with a non-retryable error, or all retry attempts were exhausted |
| cancel_requested | Cancellation requested, waiting for acknowledgement |
| cancelled | Job cancelled |
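For client-side validation or tests, the lifecycle can be written down as a transition map. The sketch below transcribes only the edges drawn in the lifecycle diagram; the names `TRANSITIONS`, `can_transition`, and `is_terminal` are illustrative and not part of the platform API.

```python
# Allowed transitions, transcribed from the lifecycle diagram (status names
# use the lowercase spellings from the status table).
TRANSITIONS = {
    "staged": {"queued", "cancel_requested"},
    "queued": {"dispatching", "cancel_requested"},
    "dispatching": {"processing"},
    "processing": {"completed", "retrying", "cancel_requested"},
    "retrying": {"queued", "failed"},
    "cancel_requested": {"cancelled"},
    "completed": set(),
    "failed": set(),
    "cancelled": set(),
}

# A status is terminal when it has no outgoing transitions.
TERMINAL = {status for status, nxt in TRANSITIONS.items() if not nxt}


def can_transition(current: str, target: str) -> bool:
    """True if the diagram contains an edge current -> target."""
    return target in TRANSITIONS.get(current, set())


def is_terminal(status: str) -> bool:
    """True for statuses a poller can stop on."""
    return status in TERMINAL


print(sorted(TERMINAL))
print(can_transition("processing", "retrying"))
print(can_transition("completed", "queued"))
```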
Polling for Status¶
Poll the job status endpoint to track progress:
import time

import httpx

poll_url = f"{api_url}/api/v1/orgs/{org_id}/services/{service_id}/jobs/{job_id}"

while True:
    resp = httpx.get(poll_url, headers={"Authorization": f"Bearer {token}"})
    job = resp.json()
    print(f"Status: {job['status']}")
    # Stop once the job reaches a terminal state
    if job["status"] in ("completed", "failed", "cancelled"):
        break
    time.sleep(2)  # Poll every 2 seconds

if job["status"] == "completed":
    print(f"Result: {job['result_summary']}")
else:
    print(f"Error: {job.get('last_error')}")
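A fixed two-second loop polls forever if a job never reaches a terminal state. One possible refinement, shown here as a sketch rather than a platform recommendation, adds an overall timeout and exponential backoff. The function `wait_for_job` and its parameters are hypothetical; the fetch function is injected so the loop can be exercised without a network call.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_job: Callable[[], Dict[str, Any]],
    *,
    timeout: float = 300.0,
    initial_delay: float = 1.0,
    max_delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the job is terminal, doubling the delay up to max_delay."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() + delay > deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout}s")
        sleep(delay)
        delay = min(delay * 2, max_delay)


# Demo with canned responses instead of real HTTP calls
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "result_summary": {"ok": True}},
])
delays = []  # records the requested sleeps instead of actually sleeping
final = wait_for_job(lambda: next(responses), sleep=delays.append)
print(final["status"], delays)
```

Against the real API, `fetch_job` could be `lambda: httpx.get(poll_url, headers=headers).json()`.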
Batch Status Check¶
Check multiple jobs in a single request:
curl "$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/jobs/batch-status?job_ids=$JOB_1,$JOB_2,$JOB_3" \
-H "Authorization: Bearer $TOKEN"
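When the job IDs come from a list, the batch-status URL can be built programmatically. The helper below is a small sketch; `batch_status_url` is a hypothetical name, and the path and the comma-separated `job_ids` parameter are copied from the curl example above.

```python
from typing import Iterable
from urllib.parse import urlencode


def batch_status_url(
    api_url: str, org_id: str, ws_id: str, service_id: str, job_ids: Iterable[str]
) -> str:
    """Build the batch-status URL with job IDs joined by commas."""
    # safe="," keeps the commas literal, matching the curl example
    query = urlencode({"job_ids": ",".join(job_ids)}, safe=",")
    return (
        f"{api_url.rstrip('/')}/api/v1/orgs/{org_id}/workspaces/{ws_id}"
        f"/hosted-services/{service_id}/jobs/batch-status?{query}"
    )


# Placeholder identifiers, for illustration only
example_url = batch_status_url(
    "https://api.example.com", "org-1", "ws-1", "svc-1", ["job-a", "job-b", "job-c"]
)
print(example_url)
```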
Job Response Fields¶
{
"id": "abc-123",
"org_id": "...",
"workspace_id": "...",
"service_id": "...",
"version_id": "...",
"endpoint_id": "...",
"endpoint_key": "post__analyze",
"execution_mode": "invocation",
"lane": "standard",
"status": "completed",
"idempotency_key": "batch-42",
"execution_id": null,
"attempt_count": 1,
"result_summary": {
"result": {"entities": ["..."]},
"confidence": 0.95
},
"last_error": null,
"pre_dispatch_age_seconds": 0.042,
"stalled": false,
"stalled_reason": null,
"started_at": "2025-03-15T10:00:01Z",
"completed_at": "2025-03-15T10:00:15Z",
"expires_at": null,
"created_at": "2025-03-15T10:00:00Z",
"updated_at": "2025-03-15T10:00:15Z"
}
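The timestamp fields make it straightforward to derive timings on the client. The sketch below assumes `started_at` marks the start of execution and `completed_at` its end; the helper names and the output keys `queue_wait_seconds` and `run_seconds` are illustrative, not fields returned by the platform.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as '2025-03-15T10:00:01Z'."""
    if value is None:
        return None
    # Older Python versions do not accept a trailing 'Z' in fromisoformat
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def job_timings(job: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Derive wait and run durations from a job record (client-side only)."""
    created = parse_ts(job.get("created_at"))
    started = parse_ts(job.get("started_at"))
    completed = parse_ts(job.get("completed_at"))
    return {
        "queue_wait_seconds": (started - created).total_seconds()
        if created and started else None,
        "run_seconds": (completed - started).total_seconds()
        if started and completed else None,
    }


# Timestamps copied from the example response above
job = {
    "created_at": "2025-03-15T10:00:00Z",
    "started_at": "2025-03-15T10:00:01Z",
    "completed_at": "2025-03-15T10:00:15Z",
}
timings = job_timings(job)
print(timings)  # {'queue_wait_seconds': 1.0, 'run_seconds': 14.0}
```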
Priority Lanes¶
Jobs are assigned to priority lanes that control processing order and retry behavior. Specify the lane with the X-HSL-Lane header:
| Lane | Max Retries | Ack Timeout | Use case |
|---|---|---|---|
| urgent | 5 | 30s | Time-sensitive operations |
| standard | 3 | 120s | Normal workloads (default) |
| bulk | 3 | 300s | Batch processing, large data |
curl -X POST \
"$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/process" \
-H "X-HSL-Lane: urgent" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"priority": "high", "data": "..."}'
Each lane has independent concurrency and retry settings, so urgent jobs are not blocked behind a backlog of bulk work.
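A client can mirror the lane table locally to validate a lane name before sending the request. This is a sketch only: `LanePolicy`, `LANES`, and `lane_header` are hypothetical names, and the values are copied from the table above rather than read from the platform.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LanePolicy:
    max_retries: int
    ack_timeout_seconds: int


# Values copied from the lane table; the platform remains the source of truth
LANES: Dict[str, LanePolicy] = {
    "urgent": LanePolicy(max_retries=5, ack_timeout_seconds=30),
    "standard": LanePolicy(max_retries=3, ack_timeout_seconds=120),
    "bulk": LanePolicy(max_retries=3, ack_timeout_seconds=300),
}
DEFAULT_LANE = "standard"


def lane_header(lane: Optional[str] = None) -> Dict[str, str]:
    """Return the X-HSL-Lane header, rejecting unknown lane names early."""
    name = lane or DEFAULT_LANE
    if name not in LANES:
        raise ValueError(f"unknown lane {name!r}; expected one of {sorted(LANES)}")
    return {"X-HSL-Lane": name}


print(lane_header("urgent"))
print(lane_header())
```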
Idempotency¶
Prevent duplicate job submissions by including an idempotency key:
curl -X POST \
"$API_URL/api/v1/orgs/$ORG_ID/services/$SERVICE_ID/async/process" \
-H "X-Idempotency-Key: import-batch-2025-03-15" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"batch_id": "2025-03-15"}'
If a job with the same idempotency key already exists for the organization, the platform returns the existing job instead of creating a duplicate. Idempotency keys are scoped per organization.
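The deduplication rule can be illustrated with a small in-memory model. This is not the platform's implementation; `JobStore` is a hypothetical class that shows how a key scoped to `(org_id, idempotency_key)` returns the existing job on a repeat submission.

```python
import uuid
from typing import Any, Dict, Optional, Tuple


class JobStore:
    """Toy model of idempotent job submission, scoped per organization."""

    def __init__(self) -> None:
        # (org_id, idempotency_key) -> job record
        self._by_key: Dict[Tuple[str, str], Dict[str, Any]] = {}

    def submit(
        self, org_id: str, payload: Dict[str, Any], idempotency_key: Optional[str] = None
    ) -> Dict[str, Any]:
        if idempotency_key is not None:
            existing = self._by_key.get((org_id, idempotency_key))
            if existing is not None:
                return existing  # same key, same org: no duplicate is created
        job = {"job_id": str(uuid.uuid4()), "status": "staged", "org_id": org_id}
        if idempotency_key is not None:
            self._by_key[(org_id, idempotency_key)] = job
        return job


store = JobStore()
first = store.submit("org-1", {"batch_id": "2025-03-15"}, "import-batch-2025-03-15")
again = store.submit("org-1", {"batch_id": "2025-03-15"}, "import-batch-2025-03-15")
other = store.submit("org-2", {"batch_id": "2025-03-15"}, "import-batch-2025-03-15")
print(first["job_id"] == again["job_id"])  # True: same org and key
print(first["job_id"] == other["job_id"])  # False: different org
```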
Job Cancellation¶
You can request cancellation of a running or queued job.
Cancellation is best-effort
If the job is already in the processing state, the platform requests cancellation but the running code may complete before the signal is received. Jobs in completed, failed, or cancelled states cannot be cancelled and return 409 Conflict.
Cancellation flow¶
- The job status is set to cancel_requested.
- At the next checkpoint in the dispatcher, the cancellation signal is detected.
- If the job has a child execution (persistent mode), the child execution is also cancelled.
- The job transitions to cancelled.
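The flow above can be modelled in a few lines. The sketch below is a toy state model, not the dispatcher's code: `request_cancel`, `acknowledge_cancel`, and `CancelConflict` are hypothetical names, and treating a non-null `execution_id` as the marker for a child execution is an assumption.

```python
from typing import Any, Callable, Dict, Optional

TERMINAL = {"completed", "failed", "cancelled"}


class CancelConflict(Exception):
    """Stands in for the 409 Conflict response on terminal jobs."""


def request_cancel(job: Dict[str, Any]) -> Dict[str, Any]:
    """Step 1: mark the job as cancel_requested unless it is already terminal."""
    if job["status"] in TERMINAL:
        raise CancelConflict(f"job is already {job['status']}")
    job["status"] = "cancel_requested"
    return job


def acknowledge_cancel(
    job: Dict[str, Any], cancel_child: Optional[Callable[[str], None]] = None
) -> Dict[str, Any]:
    """Steps 2-4: the dispatcher notices the request at its next checkpoint."""
    if job["status"] != "cancel_requested":
        return job
    # Assumption: a non-null execution_id means a child execution exists
    if job.get("execution_id") and cancel_child is not None:
        cancel_child(job["execution_id"])
    job["status"] = "cancelled"
    return job


cancelled_children = []
job = {"id": "abc-123", "status": "queued", "execution_id": "exec-1"}
request_cancel(job)
acknowledge_cancel(job, cancel_child=cancelled_children.append)
print(job["status"], cancelled_children)
```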
Retry and Failure Handling¶
When a job fails, the dispatcher follows the lane's retry policy:
- First failure -- job status set to retrying, message re-enqueued for retry.
- Subsequent failures -- same process, with attempt_count incrementing.
- Max retries exhausted -- job moves to failed status and an alert is generated.
The error is captured in the job's last_error field:
{
"last_error": {
"code": "max_retries_exhausted",
"message": "Platform execution failed after retries"
}
}
Once a job reaches the failed status, the platform does not retry it again. Failed jobs require manual investigation through the job listing API or platform UI.
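The retry policy can be simulated to see how `attempt_count` and the final status relate. This sketch assumes that "max retries" counts re-enqueues after the first attempt, so a lane with 3 retries allows up to 4 attempts; the source does not state this explicitly, and `run_job` is a hypothetical function.

```python
from typing import Any, Callable, Dict


def run_job(attempt: Callable[[], bool], max_retries: int) -> Dict[str, Any]:
    """Run attempts until one succeeds or the retry budget is spent.

    Assumption: max_retries counts retries after the first attempt.
    """
    attempt_count = 0
    while True:
        attempt_count += 1
        if attempt():
            return {"status": "completed", "attempt_count": attempt_count,
                    "last_error": None}
        if attempt_count > max_retries:
            # Retry budget exhausted: the job ends in the failed status
            return {
                "status": "failed",
                "attempt_count": attempt_count,
                "last_error": {
                    "code": "max_retries_exhausted",
                    "message": "Platform execution failed after retries",
                },
            }
        # Otherwise the job would be in "retrying" and re-enqueued here


always_fails = run_job(lambda: False, max_retries=3)
print(always_fails["status"], always_fails["attempt_count"])

outcomes = iter([False, False, True])  # two transient failures, then success
recovers = run_job(lambda: next(outcomes), max_retries=3)
print(recovers["status"], recovers["attempt_count"])
```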
Listing Jobs¶
List all jobs for a service:
curl "$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/jobs?offset=0&limit=50" \
-H "Authorization: Bearer $TOKEN"
Response:
{
"items": [
{"id": "...", "status": "completed", "lane": "standard", "...": "..."},
{"id": "...", "status": "processing", "lane": "urgent", "...": "..."}
],
"total": 42
}
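To walk through every job rather than a single page, a client can advance `offset` until it reaches `total`. The generator below is a sketch; `iter_jobs` and the injected `fetch_page` callable are hypothetical, and only the `offset`, `limit`, `items`, and `total` names come from the example above.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_jobs(
    fetch_page: Callable[[int, int], Dict[str, Any]], limit: int = 50
) -> Iterator[Dict[str, Any]]:
    """Yield jobs page by page until `total` is reached or a page is empty."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        items = page.get("items", [])
        yield from items
        offset += len(items)
        if not items or offset >= page.get("total", 0):
            break


# Demo against an in-memory list instead of the real listing endpoint
all_jobs: List[Dict[str, Any]] = [
    {"id": f"job-{i}", "status": "completed"} for i in range(120)
]
requested_offsets: List[int] = []


def fake_fetch(offset: int, limit: int) -> Dict[str, Any]:
    requested_offsets.append(offset)
    return {"items": all_jobs[offset:offset + limit], "total": len(all_jobs)}


collected = list(iter_jobs(fake_fetch, limit=50))
print(len(collected), requested_offsets)
```

Against the real API, `fetch_page` could issue `httpx.get(list_url, params={"offset": offset, "limit": limit}, headers=headers).json()`.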
Job Authentication¶
Async jobs inherit the authentication context of the original submission. When polling or cancelling a job:
- Platform auth endpoints -- the same user (by user_id) who submitted the job must poll/cancel it.
- Service token endpoints -- the same service token (matching custom claims) must be used.
- No-auth endpoints -- anyone can poll the job.
This ensures that job results are only accessible to the caller who submitted the work.
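The three rules can be expressed as a single check. The sketch below is a simplified model and not the platform's authorization code: the function `may_access_job`, the mode names `"platform"`, `"service_token"`, and `"none"`, and the `claims` dictionary shape are all assumptions made for illustration.

```python
from typing import Any, Dict


def may_access_job(
    auth_mode: str, submitter: Dict[str, Any], caller: Dict[str, Any]
) -> bool:
    """Decide whether `caller` may poll or cancel a job created by `submitter`."""
    if auth_mode == "none":
        return True  # no-auth endpoints: anyone can poll
    if auth_mode == "platform":
        user_id = caller.get("user_id")
        return user_id is not None and user_id == submitter.get("user_id")
    if auth_mode == "service_token":
        claims = caller.get("claims")
        # Assumption: "matching custom claims" means equal claim dictionaries
        return bool(claims) and claims == submitter.get("claims")
    raise ValueError(f"unknown auth mode {auth_mode!r}")


print(may_access_job("platform", {"user_id": "u1"}, {"user_id": "u1"}))
print(may_access_job("platform", {"user_id": "u1"}, {"user_id": "u2"}))
print(may_access_job("none", {}, {}))
```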