Model Components
A model component configures an LLM for use by agents. It specifies the provider, model identifier, inference parameters, fallback chain, cost tracking, and AI Bill of Materials (AIBOM) metadata.
Creating a Model
Coming Soon
The Python SDK for local development is not yet publicly available.
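from flow_sdk.cli_client import CLIClient
# `config` is assumed to be an existing client configuration; this page does not show how it is built.
client = CLIClient(config)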
model = client.models.create({
"name": "GPT-4o",
"slug": "gpt-4o",
"description": "OpenAI GPT-4o for general-purpose agent tasks",
"model": "openai/gpt-4o",
"provider": "openai",
"temperature": 0.7,
"max_tokens": 4096,
"top_p": 1.0,
"streaming": True,
"timeout_seconds": 120,
"tags": ["production", "general"],
})
In the UI, navigate to Components > Models > New. Select the provider and model from the dropdowns, configure temperature, max tokens, and streaming, then click Create.
Configuration
Required Fields
| Field | Type | Description |
|---|---|---|
| name | string | Display name |
| slug | string | URL-safe identifier |
| model | string | Model identifier in provider/model-name format (e.g., openai/gpt-4o) |
| provider | string | Provider name (e.g., openai, anthropic, google, mistral) |
Inference Parameters
| Field | Type | Default | Description |
|---|---|---|---|
| temperature | float | null | Sampling temperature (0.0 - 2.0). Lower = more deterministic |
| max_tokens | int | null | Maximum tokens in the response |
| top_p | float | null | Nucleus sampling threshold (0.0 - 1.0) |
| streaming | bool | false | Enable streaming responses |
| timeout_seconds | int | null | Request timeout (1 - 600 seconds) |
| parameters | dict | {} | Additional provider-specific parameters |
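The documented ranges can be checked on the client side before a configuration is submitted. The following is a minimal sketch, not part of the SDK: `validate_inference_params` is a hypothetical helper, the bounds are assumed to be inclusive, a value of `None` is treated as "unset", and the requirement that `max_tokens` be positive is an assumption the table does not state.

```python
from typing import Any

# Ranges copied from the Inference Parameters table; inclusive bounds are an assumption.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "timeout_seconds": (1, 600),
}


def validate_inference_params(params: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for field, (low, high) in RANGES.items():
        value = params.get(field)
        if value is None:  # unset parameters are skipped
            continue
        if not low <= value <= high:
            problems.append(f"{field}={value} is outside {low} - {high}")
    # Assumption: a token limit, when set, must be a positive number.
    max_tokens = params.get("max_tokens")
    if max_tokens is not None and max_tokens <= 0:
        problems.append(f"max_tokens={max_tokens} must be positive")
    return problems


print(validate_inference_params({"temperature": 0.7, "top_p": 1.0, "timeout_seconds": 120}))
print(validate_inference_params({"temperature": 2.5, "timeout_seconds": 0}))
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range field at once.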
Model Modes
The mode field indicates what kind of task the model supports:
| Mode | Description |
|---|---|
| chat | Conversational text generation (default for most models) |
| embedding | Text embedding / vector generation |
| completion | Text completion (non-chat) |
| image_generation | Image generation from text prompts |
embedding_model = client.models.create({
"name": "Text Embeddings",
"slug": "text-embeddings",
"model": "openai/text-embedding-3-small",
"provider": "openai",
"mode": "embedding",
})
AI Gateway Routing
Models route through the platform's AI Gateway by default, which provides rate limiting, cost tracking, and usage analytics. The gateway uses LiteLLM for multi-provider support.
| Field | Type | Default | Description |
|---|---|---|---|
| use_proxy | bool | true | Route through the AI Gateway |
| fallback_to_direct | bool | true | Fall back to direct API calls if the gateway is unavailable |
| litellm_proxy_url | string | null | Custom LiteLLM proxy URL (overrides platform default) |
| litellm_proxy_key | string | null | Custom LiteLLM proxy API key |
Direct mode
Set use_proxy=false to bypass the AI Gateway and call the provider API directly. This skips platform-level rate limiting and cost tracking, but can be useful for self-hosted models.
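How `use_proxy` and `fallback_to_direct` interact can be illustrated with a small decision function. This is a sketch of the behavior described in the table, not the platform's implementation: `choose_route` and the `gateway_available` flag are hypothetical, and raising an error when the gateway is down and direct fallback is disabled is an assumption.

```python
def choose_route(use_proxy: bool = True,
                 fallback_to_direct: bool = True,
                 gateway_available: bool = True) -> str:
    """Return "gateway" or "direct" for a request, following the documented defaults."""
    if not use_proxy:
        # Direct mode: the AI Gateway is bypassed entirely.
        return "direct"
    if gateway_available:
        return "gateway"
    if fallback_to_direct:
        # Gateway is down, but direct provider calls are allowed.
        return "direct"
    # Assumption: with no permitted route, the request fails.
    raise RuntimeError("AI Gateway unavailable and fallback_to_direct is disabled")


print(choose_route())                         # defaults route through the gateway
print(choose_route(use_proxy=False))          # direct mode
print(choose_route(gateway_available=False))  # falls back to a direct call
```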
Fallback Chains
Define fallback models that the platform uses when the primary model is unavailable or rate-limited:
# Primary model
primary = client.models.create({
"name": "Claude Sonnet",
"slug": "claude-sonnet",
"model": "anthropic/claude-sonnet-4-20250514",
"provider": "anthropic",
})
# Fallback model
fallback = client.models.create({
"name": "GPT-4o Mini (Fallback)",
"slug": "gpt-4o-mini-fallback",
"model": "openai/gpt-4o-mini",
"provider": "openai",
})
# Set up the chain
primary = client.models.update(primary.id, {
"fallback_refs": [fallback.id],
})
The platform tries models in order: primary first, then each fallback. If all models fail, the execution fails with the last error.
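The ordering rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the platform's code: `run_with_fallbacks` and `fake_call` are hypothetical names, and any exception is treated as a reason to move on to the next model.

```python
from typing import Callable, Sequence


def run_with_fallbacks(models: Sequence[str], call: Callable[[str], str]) -> str:
    """Try each model in order and return the first successful result.

    If every model fails, the last error is re-raised.
    """
    if not models:
        raise ValueError("at least one model is required")
    last_error = None
    for model in models:
        try:
            return call(model)
        except Exception as exc:  # assumption: any failure triggers the next fallback
            last_error = exc
    raise last_error


def fake_call(model: str) -> str:
    # Stand-in for a provider call: the primary is pretended to be rate-limited.
    if model == "anthropic/claude-sonnet-4-20250514":
        raise RuntimeError("rate limited")
    return f"response from {model}"


print(run_with_fallbacks(
    ["anthropic/claude-sonnet-4-20250514", "openai/gpt-4o-mini"], fake_call))
```

Here the primary raises, so the result comes from the fallback model.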
Cost Tracking
Track model costs at the component level:
| Field | Type | Description |
|---|---|---|
| cost_per_1k_input_tokens | float | Cost per 1,000 input tokens (USD) |
| cost_per_1k_output_tokens | float | Cost per 1,000 output tokens (USD) |
model = client.models.create({
"name": "GPT-4o",
"slug": "gpt-4o",
"model": "openai/gpt-4o",
"provider": "openai",
"cost_per_1k_input_tokens": 0.0025,
"cost_per_1k_output_tokens": 0.01,
})
These values are used by the platform's observability system to calculate per-execution and per-agent cost breakdowns.
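As a worked example of the arithmetic these fields imply, the sketch below computes the cost of a single execution from token counts. `execution_cost` is a hypothetical helper and the token counts are made up for illustration; how the observability system actually aggregates costs is not documented here.

```python
def execution_cost(input_tokens: int,
                   output_tokens: int,
                   cost_per_1k_input_tokens: float,
                   cost_per_1k_output_tokens: float) -> float:
    """Cost in USD: each token count is divided by 1,000 and multiplied by its rate."""
    input_cost = (input_tokens / 1000) * cost_per_1k_input_tokens
    output_cost = (output_tokens / 1000) * cost_per_1k_output_tokens
    return input_cost + output_cost


# Illustrative token counts with the rates from the example above.
cost = execution_cost(
    input_tokens=12_000,
    output_tokens=3_000,
    cost_per_1k_input_tokens=0.0025,
    cost_per_1k_output_tokens=0.01,
)
print(f"${cost:.4f}")
```

With these numbers, the input side contributes 12 × 0.0025 = 0.03 USD and the output side 3 × 0.01 = 0.03 USD, for a total of 0.06 USD.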
Model Versioning
Models follow the same versioning system as all components. Create a new version when you change the model identifier or parameters:
new_version = client.models.create_version(model.id, {
"version_increment": "minor",
"updates": {
"model": "openai/gpt-4o-2025-01-01",
"temperature": 0.5,
},
"change_summary": "Updated to January 2025 snapshot, lowered temperature",
})
AIBOM Metadata
Model components support AI Bill of Materials (AIBOM) fields for compliance and governance. These fields document the model's provenance, training data, safety evaluations, and environmental impact.
| Field | Description |
|---|---|
| base_model | Parent model this was fine-tuned from |
| model_family | Architecture family (e.g., transformer, diffusion) |
| architecture_type | Specific architecture (e.g., GPT, LLaMA, BERT) |
| model_size_parameters | Parameter count (e.g., 7B, 70B) |
| quantization | Quantization level (e.g., 4-bit, 8-bit, fp16) |
| input_modalities | Accepted inputs: text, image, audio |
| output_modalities | Generated outputs: text, code, image |
| source_url | Model card or HuggingFace link |
| author | Model creator |
| model_license | License (e.g., Apache-2.0, proprietary) |
| training_datasets | Training data sources |
| training_date | When the model was trained |
| intended_use | Intended use cases |
| out_of_scope_uses | Uses not recommended |
| carbon_footprint_kg | Training CO2 emissions |
| energy_consumption_kwh | Training energy consumption |
| safety_evaluations | Safety benchmark results |
| bias_evaluation | Bias evaluation results |
model = client.models.update(model.id, {
"model_family": "transformer",
"architecture_type": "GPT",
"model_size_parameters": "200B",
"input_modalities": ["text", "image"],
"output_modalities": ["text"],
"model_license": "proprietary",
"intended_use": "General-purpose text generation and analysis",
"source_url": "https://platform.openai.com/docs/models/gpt-4o",
})
Compliance use case
AIBOM metadata supports compliance requirements like the EU AI Act. Populate these fields for models used in regulated environments to maintain a complete audit trail of AI model provenance.
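One way to keep track of which AIBOM fields still need to be populated is a simple completeness check over a model's metadata. The sketch below is illustrative only: `missing_aibom_fields` is a hypothetical helper that works on a plain dictionary, and treating `None`, empty strings, and empty lists as "missing" is an assumption.

```python
# Field names copied from the AIBOM table above.
AIBOM_FIELDS = [
    "base_model", "model_family", "architecture_type", "model_size_parameters",
    "quantization", "input_modalities", "output_modalities", "source_url",
    "author", "model_license", "training_datasets", "training_date",
    "intended_use", "out_of_scope_uses", "carbon_footprint_kg",
    "energy_consumption_kwh", "safety_evaluations", "bias_evaluation",
]


def missing_aibom_fields(metadata: dict) -> list[str]:
    """Return AIBOM field names that are absent or empty in the metadata."""
    # Assumption: None, "" and [] all count as "not populated".
    return [name for name in AIBOM_FIELDS if metadata.get(name) in (None, "", [])]


# Metadata matching the update example above.
metadata = {
    "model_family": "transformer",
    "architecture_type": "GPT",
    "model_size_parameters": "200B",
    "input_modalities": ["text", "image"],
    "output_modalities": ["text"],
    "model_license": "proprietary",
    "intended_use": "General-purpose text generation and analysis",
    "source_url": "https://platform.openai.com/docs/models/gpt-4o",
}

for name in missing_aibom_fields(metadata):
    print("missing:", name)
```

A check like this could run before a model is promoted to a regulated environment, so gaps in the audit trail are caught early.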