Creating Services (Beta)¶
This guide walks through creating a hosted service, configuring it, managing versions, and publishing.
Creating a Service¶
From the UI¶
- Navigate to Hosted Services in the left sidebar.
- Click New Service.
- Fill in the required fields:
  - Name -- human-readable label (e.g., "Document Processor")
  - Slug -- URL-safe identifier, lowercase alphanumeric with hyphens (e.g., document-processor). Immutable after creation.
  - Execution Mode -- invocation (default) or persistent.
- Click Create.
The service starts in draft status with version 1 automatically created.
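Because the slug is immutable after creation, it can be worth checking it locally before submitting. The following is a minimal sketch of such a check. The exact pattern the platform enforces is not stated in this guide, so the regex (in particular the rejection of leading, trailing, and doubled hyphens) and the name `is_valid_slug` are assumptions.

```python
import re

# Assumed pattern: groups of lowercase letters and digits separated by single hyphens.
# The platform's actual validation rule may be stricter or looser.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug: str) -> bool:
    """Return True if the slug is lowercase alphanumeric with hyphens."""
    return SLUG_PATTERN.fullmatch(slug) is not None


if __name__ == "__main__":
    for candidate in ("document-processor", "Document Processor", "-bad-"):
        print(f"{candidate!r}: {is_valid_slug(candidate)}")
```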
From the SDK¶
Coming Soon
The Python SDK for local development is not yet publicly available.
import httpx
from flow_sdk.cli_client import CLIClient
from flow_sdk.config import FlowConfig

config = FlowConfig.load()
client = CLIClient(config)

# Create via the management API
resp = httpx.post(
    f"{config.platform_url}/api/v1/orgs/{config.org_id}/workspaces/{config.workspace_id}/hosted-services",
    headers={"Authorization": f"Bearer {config.access_token}"},
    json={
        "slug": "document-processor",
        "name": "Document Processor",
        "execution_mode": "invocation",
        "ring_id": "uuid-of-ring",  # required for invocation mode; get from Admin > Deployments > Rings
        "description": "Processes and summarizes uploaded documents",
    },
)
resp.raise_for_status()  # fail on a 4xx/5xx response instead of parsing an error body
service = resp.json()
print(f"Service created: {service['id']}")
ring_id is required for invocation mode
ring_id is required when execution_mode is invocation. Get the ring ID from Admin > Deployments > Rings or via the rings API.
# Service creation is managed through the UI or direct API calls.
# Use curl with your platform credentials:
curl -X POST \
"https://api.flow.marut.cloud/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"slug": "document-processor",
"name": "Document Processor",
"execution_mode": "invocation",
"ring_id": "uuid-of-ring"
}'
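The ring_id requirement described above can be caught before the request is sent. This sketch builds the creation payload and fails early when ring_id is missing for invocation mode. The helper name `build_create_payload` is hypothetical, and the field names are the ones used in the examples above.

```python
from typing import Any, Dict, Optional


def build_create_payload(
    slug: str,
    name: str,
    execution_mode: str = "invocation",
    ring_id: Optional[str] = None,
    description: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body for service creation, failing early on a missing ring_id."""
    if execution_mode not in ("invocation", "persistent"):
        raise ValueError(f"unknown execution_mode: {execution_mode!r}")
    if execution_mode == "invocation" and not ring_id:
        raise ValueError("ring_id is required when execution_mode is 'invocation'")

    payload: Dict[str, Any] = {
        "slug": slug,
        "name": name,
        "execution_mode": execution_mode,
    }
    # Optional fields are only included when they are set.
    if ring_id:
        payload["ring_id"] = ring_id
    if description:
        payload["description"] = description
    return payload


if __name__ == "__main__":
    print(build_create_payload("document-processor", "Document Processor", ring_id="uuid-of-ring"))
```

The returned dictionary can be passed as the `json` argument of the request, or serialized for use with curl.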
Service Configuration¶
Execution Mode¶
Choose the execution mode at creation time. It can be changed later on draft services.
| Mode | Best for | Cold start | Scaling |
|---|---|---|---|
| invocation | Stateless, fast operations | None | Automatic per-request |
| persistent | Stateful, ML inference, high throughput | Container startup time | Replica-based autoscaling |
Persistent Mode Settings¶
When using persistent execution mode, you can configure deployment parameters:
{
"slug": "ml-inference",
"name": "ML Inference Service",
"execution_mode": "persistent",
"min_replicas": 1,
"max_replicas": 5,
"concurrency_per_replica": 80,
"scale_threshold": null,
"startup_timeout_seconds": 120,
"base_image": null,
"system_packages": ["libgomp1"]
}
| Field | Description | Default |
|---|---|---|
| min_replicas | Minimum running instances (0 allows scale-to-zero) | 0 |
| max_replicas | Maximum instances under load | 10 |
| concurrency_per_replica | Max concurrent requests per instance | 80 |
| scale_threshold | Custom scaling metric threshold | auto |
| startup_timeout_seconds | How long to wait for container startup | null |
| base_image | Custom Docker base image | Platform default |
| system_packages | OS packages to install in the container | [] |
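The defaults in the table can be captured in a small settings object so that only the overridden fields need to be written out. This is a sketch, not part of any SDK: the class name `PersistentSettings` is hypothetical, `None` is assumed to stand for "auto" and "platform default", and the validation rules are common-sense assumptions rather than documented platform constraints.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class PersistentSettings:
    """Deployment parameters for persistent mode, with the documented defaults."""

    min_replicas: int = 0
    max_replicas: int = 10
    concurrency_per_replica: int = 80
    scale_threshold: Optional[float] = None  # None is assumed to mean "auto"
    startup_timeout_seconds: Optional[int] = None
    base_image: Optional[str] = None  # None is assumed to mean the platform default
    system_packages: List[str] = field(default_factory=list)

    def validate(self) -> None:
        # Assumed sanity checks; the platform may enforce different limits.
        if self.min_replicas < 0:
            raise ValueError("min_replicas must be >= 0")
        if self.max_replicas < max(self.min_replicas, 1):
            raise ValueError("max_replicas must be >= 1 and >= min_replicas")
        if self.concurrency_per_replica < 1:
            raise ValueError("concurrency_per_replica must be >= 1")

    def to_payload(self) -> Dict[str, Any]:
        """Validate the settings and return them as a plain dictionary."""
        self.validate()
        return asdict(self)


if __name__ == "__main__":
    settings = PersistentSettings(
        min_replicas=1,
        max_replicas=5,
        startup_timeout_seconds=120,
        system_packages=["libgomp1"],
    )
    print(settings.to_payload())
```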
Connector Bindings¶
Services can bind to platform connectors for database access, external APIs, and storage:
{
"connector_bindings": {
"primary_db": {
"connector_instance_id": "uuid-of-postgres-instance",
"role": "read_write"
},
"s3_storage": {
"connector_instance_id": "uuid-of-s3-instance",
"role": "read_write"
}
}
}
Bound connectors are available to your endpoint code at runtime through the execution context.
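When several connectors are bound, the `connector_bindings` object can be generated from a simple name-to-instance mapping. The helper below is a hypothetical sketch that reproduces the shape shown above. Only the `read_write` role appears in this guide, so other role values are not assumed here.

```python
from typing import Any, Dict


def build_connector_bindings(instances: Dict[str, str], role: str = "read_write") -> Dict[str, Any]:
    """Map binding names to connector instance IDs in the connector_bindings shape."""
    return {
        "connector_bindings": {
            binding_name: {"connector_instance_id": instance_id, "role": role}
            for binding_name, instance_id in instances.items()
        }
    }


if __name__ == "__main__":
    bindings = build_connector_bindings(
        {
            "primary_db": "uuid-of-postgres-instance",
            "s3_storage": "uuid-of-s3-instance",
        }
    )
    print(bindings)
```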
Environment and Ring Assignment¶
Services can be scoped to a specific environment and deployment ring. The ring is set with the ring_id field, which is required for invocation mode.
Version Management¶
Versions are the unit of deployment for hosted services. Each version is an immutable snapshot of endpoint definitions.
Version Lifecycle¶
stateDiagram-v2
[*] --> Draft: Create version
Draft --> Active: Publish
Active --> Draining: New version published
Draining --> Archived: Traffic drained
Archived --> [*]
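The lifecycle in the diagram is a strictly forward-moving state machine, which can be modelled in a few lines. This is an illustrative sketch only: the enum and function names are hypothetical, and the lowercase string values for `active` and `archived` are assumed by analogy with the `draft` and `draining` statuses mentioned in this guide.

```python
from enum import Enum


class VersionState(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DRAINING = "draining"
    ARCHIVED = "archived"


# Allowed transitions, taken from the diagram above.
TRANSITIONS = {
    VersionState.DRAFT: {VersionState.ACTIVE},       # publish
    VersionState.ACTIVE: {VersionState.DRAINING},    # a newer version is published
    VersionState.DRAINING: {VersionState.ARCHIVED},  # traffic drained
    VersionState.ARCHIVED: set(),                    # terminal state
}


def can_transition(current: VersionState, target: VersionState) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in TRANSITIONS[current]


if __name__ == "__main__":
    print(can_transition(VersionState.DRAFT, VersionState.ACTIVE))
    print(can_transition(VersionState.ACTIVE, VersionState.DRAFT))
```

Note that no transition leads back to draft, which matches the rule that a published version cannot be edited.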
Creating a New Version¶
When you need to update endpoints, create a new version. The new version starts in draft status; you can add, edit, and remove endpoints on a draft version.
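A request for creating a version might look like the sketch below. The endpoint path is an assumption: it is inferred from the publish URL used elsewhere in this guide (`.../versions/{version_id}/publish`) and is not confirmed here, as is the empty JSON body. The function names are hypothetical, and the standard library is used so the sketch has no dependencies.

```python
import json
import urllib.request


def versions_url(platform_url: str, org_id: str, ws_id: str, service_id: str) -> str:
    """Build the assumed versions collection URL for a service."""
    # ASSUMPTION: path inferred from the publish URL; check the API reference.
    return (
        f"{platform_url.rstrip('/')}/api/v1/orgs/{org_id}/workspaces/{ws_id}"
        f"/hosted-services/{service_id}/versions"
    )


def create_version(platform_url: str, org_id: str, ws_id: str, service_id: str, token: str) -> dict:
    """POST to the assumed versions endpoint and return the decoded JSON body."""
    request = urllib.request.Request(
        versions_url(platform_url, org_id, ws_id, service_id),
        data=b"{}",  # ASSUMPTION: no fields are required to create a draft version
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the URL only; no request is sent.
    print(versions_url("https://api.flow.marut.cloud", "ORG_ID", "WS_ID", "SERVICE_ID"))
```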
Publishing a Version¶
Publishing freezes the endpoint definitions into an immutable snapshot and makes the version live:
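import httpx

# config, org_id, ws_id, service_id, version_id, and token are assumed to be set as in the creation example
resp = httpx.post(
    f"{config.platform_url}/api/v1/orgs/{org_id}/workspaces/{ws_id}"
    f"/hosted-services/{service_id}/versions/{version_id}/publish",
    headers={"Authorization": f"Bearer {token}"},
    json={"traffic_percent": 100},  # send all traffic to the new version
)
resp.raise_for_status()  # surface HTTP errors before reading the body
published = resp.json()
print(f"Version {published['version']} is now active")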
Publishing is irreversible
Once a version is published, its endpoint definitions cannot be changed. To make changes, create a new version.
When a version is published, the platform:
- Freezes all endpoint definitions into the version's endpoint_snapshot.
- Generates a skill_snapshot for agent tool discovery.
- Sets the version as the service's active version.
- Marks the previous active version as draining.
Traffic Splitting¶
When publishing, you can control how much traffic the new version receives through the traffic_percent field of the publish request. This allows canary deployments where the new version receives a percentage of traffic while the previous version handles the rest.
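As a sketch of a canary rollout, the helpers below build a publish body with a reduced `traffic_percent` and show the resulting split between versions. The function names are hypothetical, and the accepted range (whole percentages from 0 to 100) is an assumption; this guide only shows the value 100.

```python
def publish_payload(traffic_percent: int = 100) -> dict:
    """Build the publish request body for a full or canary rollout."""
    # ASSUMPTION: valid values are whole percentages from 0 to 100.
    if not 0 <= traffic_percent <= 100:
        raise ValueError("traffic_percent must be between 0 and 100")
    return {"traffic_percent": traffic_percent}


def traffic_shares(traffic_percent: int) -> dict:
    """Return the percentage of traffic handled by the new and previous versions."""
    new_share = publish_payload(traffic_percent)["traffic_percent"]
    return {"new": new_share, "previous": 100 - new_share}


if __name__ == "__main__":
    print(publish_payload(10))
    print(traffic_shares(10))
```

Publishing with a small percentage first and raising it once the new version looks healthy is the usual canary pattern; how the percentage is raised afterwards is not covered in this guide.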
Deploying Persistent Services¶
Persistent-mode services require an explicit build and deploy step after publishing.
Build and Deploy¶
# Build and deploy in one step (default)
curl -X POST \
"$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/deploy" \
-H "Authorization: Bearer $TOKEN"
# Build only (without deploying)
curl -X POST \
"$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/build" \
-H "Authorization: Bearer $TOKEN"
# Deploy a pre-built image (build=false)
curl -X POST \
"$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/deploy?build=false" \
-H "Authorization: Bearer $TOKEN"
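For scripts that call these endpoints from Python, the three curl variants differ only in the final path segment and the optional `build` query parameter. The following sketch builds those URLs; `service_action_url` is a hypothetical helper, and the paths are the ones shown in the curl examples above.

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoints that appear in the curl examples above.
ACTIONS = ("deploy", "build")


def service_action_url(
    api_url: str,
    org_id: str,
    ws_id: str,
    service_id: str,
    action: str,
    build: Optional[bool] = None,
) -> str:
    """Build the URL for a build or deploy request."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    url = (
        f"{api_url.rstrip('/')}/api/v1/orgs/{org_id}/workspaces/{ws_id}"
        f"/hosted-services/{service_id}/{action}"
    )
    # The build flag only appears on the deploy endpoint in this guide.
    if build is not None:
        if action != "deploy":
            raise ValueError("the build parameter applies to the deploy action only")
        url += "?" + urlencode({"build": str(build).lower()})
    return url


if __name__ == "__main__":
    base = ("https://api.flow.marut.cloud", "ORG_ID", "WS_ID", "SERVICE_ID")
    print(service_action_url(*base, "deploy"))
    print(service_action_url(*base, "build"))
    print(service_action_url(*base, "deploy", build=False))
```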
Deployment States¶
Persistent services move through deployment states:
| State | Description |
|---|---|
| building | Container image is being built |
| built | Image built, ready to deploy |
| deploying | Deploying to Cloud Run |
| active | Running and serving traffic |
| failed | Build or deploy failed |
| draining | Being replaced by a new deployment |
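The states in the table can drive a simple polling loop that waits until a deployment settles. This is a sketch under stated assumptions: `fetch_state` is a caller-supplied function that returns the current state string (how to extract it from the status response is not specified in this guide), and treating `active` and `failed` as the states to stop on is a choice made here, not a documented rule.

```python
import time
from typing import Callable, FrozenSet

# ASSUMPTION: a full build-and-deploy settles in one of these states.
DEFAULT_TERMINAL_STATES = frozenset({"active", "failed"})


def wait_for_deployment(
    fetch_state: Callable[[], str],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    terminal_states: FrozenSet[str] = DEFAULT_TERMINAL_STATES,
) -> str:
    """Poll fetch_state until it returns a terminal state, then return that state."""
    for _ in range(max_attempts):
        state = fetch_state()
        if state in terminal_states:
            return state
        time.sleep(interval_seconds)
    raise TimeoutError(f"deployment did not settle after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated sequence of states instead of real API calls.
    simulated = iter(["building", "built", "deploying", "active"])
    print(wait_for_deployment(lambda: next(simulated), interval_seconds=0))
```

For a build-only run, passing `terminal_states=frozenset({"built", "failed"})` stops the loop once the image is ready.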
Checking Deployment Status¶
curl "$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/deployment-status" \
-H "Authorization: Bearer $TOKEN"
Response:
Tearing Down¶
To remove a persistent deployment (stops the running container):
curl -X POST \
"$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/teardown" \
-H "Authorization: Bearer $TOKEN"
Cloud Credentials¶
Persistent services deploy to cloud infrastructure. Credentials can be configured at two levels:
Per-Service Credentials¶
curl -X PUT \
"$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/service-credentials" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"cloud_provider": "gcp",
"cloud_credentials_json": "{...GCP service account key...}",
"cloud_project_id": "my-project",
"cloud_region": "us-central1"
}'
Organization-Level Credentials¶
Shared across all services in the organization:
curl -X PUT \
"$API_URL/api/v1/orgs/$ORG_ID/workspaces/$WS_ID/hosted-services/$SERVICE_ID/credentials" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"cloud_provider": "gcp",
"cloud_credentials_json": "{...}",
"cloud_project_id": "org-project"
}'
Credential precedence
Per-service credentials take priority over organization-level credentials. This lets you run most services on shared infrastructure while isolating specific services to dedicated projects.
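The precedence rule can be expressed as a small lookup. This sketch assumes whole-object precedence, meaning per-service credentials replace the organization-level set entirely rather than being merged field by field; the guide does not say which of the two the platform does. The function name is hypothetical.

```python
from typing import Optional


def resolve_credentials(
    service_credentials: Optional[dict],
    org_credentials: Optional[dict],
) -> dict:
    """Return the credentials a deployment would use under the precedence rule."""
    # ASSUMPTION: no field-level merging between the two levels.
    if service_credentials:
        return service_credentials
    if org_credentials:
        return org_credentials
    raise LookupError("no cloud credentials configured at service or organization level")


if __name__ == "__main__":
    org = {"cloud_provider": "gcp", "cloud_project_id": "org-project"}
    dedicated = {"cloud_provider": "gcp", "cloud_project_id": "my-project", "cloud_region": "us-central1"}
    print(resolve_credentials(None, org)["cloud_project_id"])
    print(resolve_credentials(dedicated, org)["cloud_project_id"])
```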
Generated Client Stubs¶
The platform auto-generates typed client code for your published services. The generated stubs include typed methods for each endpoint, matching your request/response schemas.