Validation Components¶
A validation component defines data quality and business logic rules that can be applied to data flowing through agents, workflows, and pipelines. Validations act as gates: they check that data meets the required standards before it proceeds.
Creating a Validation¶
Coming Soon
The Python SDK for local development is not yet publicly available.
from flow_sdk.cli_client import CLIClient

# `config` is assumed to hold your connection settings, defined elsewhere.
client = CLIClient(config)

validation = client.validations.create({
    "name": "Email Format Check",
    "slug": "email-format-check",
    "description": "Validates that email addresses match the expected format",
    "validator_ref": "app.validators.check_email_format",
    "rule_type": "pattern",
    "field": "email",
    "config": {
        "pattern": r"^[\w.-]+@[\w.-]+\.\w+$",
        "message": "Invalid email format",
    },
    "severity": "error",
    "tags": ["data-quality", "email"],
    "examples": [
        {"input": {"email": "alice@example.com"}, "expected": "pass"},
        {"input": {"email": "not-an-email"}, "expected": "fail"},
    ],
})
To create a validation in the UI, navigate to Components > Validations > New, select a rule type, configure the field and severity, and click Create.
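The `validator_ref` in the SDK example points at a function this page does not show. Here is a minimal sketch of what `check_email_format` might look like, assuming validators receive the record as a dict and return the `{"valid": ...}` shape used in the Custom Validators section. The `run_examples` helper is hypothetical and only replays the `examples` entries locally.

```python
import re

# Pattern and message copied from the validation's "config" block.
EMAIL_PATTERN = re.compile(r"^[\w.-]+@[\w.-]+\.\w+$")
MESSAGE = "Invalid email format"


def check_email_format(data: dict) -> dict:
    """Hypothetical validator: assumes the record arrives as a plain dict."""
    value = data.get("email")
    if isinstance(value, str) and EMAIL_PATTERN.match(value):
        return {"valid": True}
    return {"valid": False, "error": MESSAGE}


def run_examples(validator, examples: list) -> list:
    """Replay input/expected pairs; True means the example behaved as declared."""
    outcomes = []
    for example in examples:
        actual = "pass" if validator(example["input"])["valid"] else "fail"
        outcomes.append(actual == example["expected"])
    return outcomes


examples = [
    {"input": {"email": "alice@example.com"}, "expected": "pass"},
    {"input": {"email": "not-an-email"}, "expected": "fail"},
]
print(run_examples(check_email_format, examples))  # [True, True]
```

Replaying the declared examples this way is a cheap check that the pattern and the examples agree before the validation is registered.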
Configuration¶
Required Fields¶
| Field | Type | Description |
|---|---|---|
| `name` | string | Display name |
| `slug` | string | URL-safe identifier |
| `validator_ref` | string | Python validator function path |
Optional Fields¶
| Field | Type | Default | Description |
|---|---|---|---|
| `rule_type` | enum | `custom` | Validation rule type (see below) |
| `field` | string | `null` | Specific field to validate |
| `config` | dict | `{}` | Rule-type-specific configuration |
| `severity` | enum | `error` | Severity level: `error`, `warning`, `info` |
| `threshold` | float | `null` | Pass rate threshold (0.0 - 1.0) for batch validations |
| `examples` | list | `[]` | Input/expected-result examples |
Rule Types¶
The platform provides built-in rule types for common validation patterns:
| Rule Type | Description | Config Fields |
|---|---|---|
| `not_null` | Field must not be null or empty | `field` |
| `unique` | Values must be unique within a dataset | `field`, `scope` |
| `range` | Numeric value must fall within a range | `min`, `max`, `inclusive` |
| `pattern` | String must match a regex pattern | `pattern`, `message` |
| `schema_validation` | Data must conform to a Schema | `schema_ref` |
| `custom` | Custom Python validation logic | Any custom config |
Not Null¶
validation = client.validations.create({
    "name": "Required Fields Check",
    "slug": "required-fields",
    "validator_ref": "app.validators.not_null",
    "rule_type": "not_null",
    "field": "customer_id",
    "severity": "error",
})
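To make "null or empty" concrete, here is a small sketch of the kind of check a `not_null` rule could perform on one record. The function name `is_present` and the exact definition of "empty" (whitespace-only strings and empty containers) are assumptions for illustration; the platform's built-in rule may draw the line differently.

```python
def is_present(record: dict, field: str) -> bool:
    """Return True when the field holds a non-null, non-empty value."""
    value = record.get(field)
    if value is None:
        return False
    # Assumption: whitespace-only strings and empty containers count as "empty".
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) > 0
    # Numbers such as 0 and booleans such as False are real values, not "empty".
    return True


print(is_present({"customer_id": "C-1001"}, "customer_id"))  # True
print(is_present({"customer_id": ""}, "customer_id"))        # False
print(is_present({}, "customer_id"))                         # False
```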
Range¶
validation = client.validations.create({
    "name": "Age Range Check",
    "slug": "age-range",
    "validator_ref": "app.validators.check_range",
    "rule_type": "range",
    "field": "age",
    "config": {
        "min": 0,
        "max": 150,
        "inclusive": True,
    },
    "severity": "error",
})
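The `inclusive` flag decides whether the boundary values themselves pass. The sketch below shows one plausible reading of the three config fields; `in_range` is a hypothetical helper, and treating a missing bound as "unbounded" is an assumption rather than documented behavior.

```python
from typing import Optional


def in_range(
    value: float,
    minimum: Optional[float] = None,
    maximum: Optional[float] = None,
    inclusive: bool = True,
) -> bool:
    """Check a numeric value against optional lower and upper bounds."""
    if minimum is not None:
        if value < minimum or (not inclusive and value == minimum):
            return False
    if maximum is not None:
        if value > maximum or (not inclusive and value == maximum):
            return False
    return True


config = {"min": 0, "max": 150, "inclusive": True}
print(in_range(150, config["min"], config["max"], config["inclusive"]))  # True
print(in_range(150, 0, 150, inclusive=False))                            # False
print(in_range(-1, 0, 150))                                              # False
```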
Pattern¶
validation = client.validations.create({
    "name": "Phone Number Format",
    "slug": "phone-format",
    "validator_ref": "app.validators.check_pattern",
    "rule_type": "pattern",
    "field": "phone",
    "config": {
        "pattern": r"^\+\d{10,15}$",
        "message": "Phone must be in E.164 format (e.g., +15551234567)",
    },
    "severity": "warning",
})
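As a rough illustration of how a generic pattern validator could consume the `field` and `config` values above, the sketch below applies the regex with `re.fullmatch`. The signature of `check_pattern` is an assumption (this page does not document how `field` and `config` reach the validator), and the fallback message is invented for the example.

```python
import re


def check_pattern(data: dict, field: str, config: dict) -> dict:
    """Hypothetical generic validator driven by the rule's config."""
    value = data.get(field)
    # fullmatch requires the whole string to match, so a trailing newline is rejected.
    if isinstance(value, str) and re.fullmatch(config["pattern"], value):
        return {"valid": True}
    message = config.get("message", f"{field} does not match the required pattern")
    return {"valid": False, "error": message}


phone_config = {
    "pattern": r"^\+\d{10,15}$",
    "message": "Phone must be in E.164 format (e.g., +15551234567)",
}
print(check_pattern({"phone": "+15551234567"}, "phone", phone_config)["valid"])  # True
print(check_pattern({"phone": "555-1234"}, "phone", phone_config)["valid"])      # False
```

Note that the pattern only checks shape: a plus sign followed by 10 to 15 digits. It does not verify that the country code or the number itself exists.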
Schema Validation¶
Validate an entire data structure against a Schema component:
# customer_profile_schema is assumed to be a Schema component created earlier.
validation = client.validations.create({
    "name": "Profile Schema Check",
    "slug": "profile-schema-check",
    "validator_ref": "app.validators.validate_schema",
    "rule_type": "schema_validation",
    "config": {
        "schema_ref": str(customer_profile_schema.id),
    },
    "severity": "error",
})
Custom Validators¶
For business logic that goes beyond built-in rule types, write a custom validator function:
# In your codebase: app/validators/business_rules.py
def check_order_total(data: dict) -> dict:
    """Validate that order total matches line item sum."""
    line_items = data.get("line_items", [])
    expected_total = sum(item["price"] * item["quantity"] for item in line_items)
    actual_total = data.get("total", 0)

    if abs(expected_total - actual_total) > 0.01:
        return {
            "valid": False,
            "error": f"Total mismatch: expected {expected_total}, got {actual_total}",
        }
    return {"valid": True}
validation = client.validations.create({
    "name": "Order Total Check",
    "slug": "order-total-check",
    "validator_ref": "app.validators.business_rules.check_order_total",
    "rule_type": "custom",
    "severity": "error",
})
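The 0.01 tolerance in `check_order_total` absorbs floating-point rounding, but it also accepts totals that are off by up to a cent. If exact agreement matters, one possible variant compares `Decimal` amounts instead. This is a sketch, not part of the platform: the name `check_order_total_exact` is hypothetical, and it assumes `quantity` is an integer and that prices arrive as strings or short decimal floats.

```python
from decimal import Decimal


def check_order_total_exact(data: dict) -> dict:
    """Variant of check_order_total that compares exact decimal amounts."""
    line_items = data.get("line_items", [])
    # Going through str() keeps 0.1 as Decimal("0.1") rather than its binary expansion.
    expected_total = sum(
        (Decimal(str(item["price"])) * item["quantity"] for item in line_items),
        Decimal("0"),
    )
    actual_total = Decimal(str(data.get("total", 0)))

    if expected_total != actual_total:
        return {
            "valid": False,
            "error": f"Total mismatch: expected {expected_total}, got {actual_total}",
        }
    return {"valid": True}


order = {"line_items": [{"price": 0.1, "quantity": 3}], "total": 0.3}
print(check_order_total_exact(order))  # {'valid': True}
```

The strict version has no tolerance at all, so it suits ledgers and invoices better than data where small rounding differences are expected.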
Severity Levels¶
| Level | Behavior |
|---|---|
| `error` | Stops execution. The pipeline halts and the failure is reported. |
| `warning` | Logs a warning but allows execution to continue. |
| `info` | Informational only. Recorded in logs but does not affect execution. |
Use error for data integrity issues that would corrupt downstream results. Use warning for anomalies worth investigating but not blocking. Use info for metrics and visibility.
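The three behaviors can be pictured as a small dispatch on the failed result. The sketch below is illustrative only: `apply_severity` and `ValidationHalt` are hypothetical names, and the real platform handles halting and logging internally.

```python
import logging

logger = logging.getLogger("validations")


class ValidationHalt(Exception):
    """Hypothetical exception used here to stop a run."""


def apply_severity(result: dict, severity: str) -> bool:
    """Return True if execution may continue; raise when an 'error' rule fails."""
    if result.get("valid"):
        return True
    message = result.get("error", "validation failed")
    if severity == "error":
        raise ValidationHalt(message)  # halt the pipeline and report the failure
    if severity == "warning":
        logger.warning(message)        # log it, then keep going
        return True
    if severity == "info":
        logger.info(message)           # record only
        return True
    raise ValueError(f"unknown severity: {severity!r}")
```

A caller would wrap each step in `apply_severity(...)` and let `ValidationHalt` propagate to whatever reports failed runs.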
Threshold-Based Validation¶
For batch data processing, set `threshold` to the minimum pass rate, expressed as a fraction between 0.0 and 1.0. The validation passes if at least that fraction of records pass:
validation = client.validations.create({
    "name": "Data Quality Gate",
    "slug": "data-quality-gate",
    "validator_ref": "app.validators.check_completeness",
    "rule_type": "not_null",
    "field": "email",
    "threshold": 0.95,  # 95% of records must have a non-null email
    "severity": "error",
})
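The pass-rate arithmetic behind a threshold is simple enough to sketch. The helper below is hypothetical; in particular, treating an empty batch as passing is an assumption made for the example, not documented behavior.

```python
def passes_threshold(results: list, threshold: float) -> bool:
    """Return True when the share of passing records meets the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if not results:
        return True  # assumption: an empty batch has nothing to fail
    pass_rate = sum(1 for ok in results if ok) / len(results)
    return pass_rate >= threshold


# 96 records with an email and 4 without: pass rate 0.96
records = [{"email": "a@example.com"}] * 96 + [{"email": None}] * 4
results = [record["email"] is not None for record in records]
print(passes_threshold(results, 0.95))  # True
```

With 94 valid records out of 100 the same call returns `False`, and an `error` severity would then halt the pipeline.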
Validation in Workflows¶
Use validation components as workflow nodes to create data quality gates:
graph LR
    Extract["Extract Data"]
    Validate["Validate<br/>(quality gate)"]
    Transform["Transform"]
    Load["Load"]
    Alert["Alert on Failure"]
    Extract --> Validate
    Validate -->|"pass"| Transform
    Validate -->|"fail"| Alert
    Transform --> Load
# extract_block, quality_gate, transform_block, load_block, and alert_block
# are assumed to be components created earlier.
workflow = client.workflows.create({
    "name": "Validated ETL",
    "slug": "validated-etl",
    "trigger_type": "schedule",
    "trigger_config": {"cron_expression": "0 3 * * *"},  # every day at 03:00
    "nodes": [
        {"id": "extract", "type": "codeblock",
         "component_ref": extract_block.id, "label": "Extract"},
        {"id": "validate", "type": "validation",
         "component_ref": quality_gate.id, "label": "Quality Gate"},
        {"id": "transform", "type": "codeblock",
         "component_ref": transform_block.id, "label": "Transform"},
        {"id": "load", "type": "codeblock",
         "component_ref": load_block.id, "label": "Load"},
        {"id": "alert", "type": "codeblock",
         "component_ref": alert_block.id, "label": "Alert"},
    ],
    "edges": [
        {"source": "extract", "target": "validate"},
        {"source": "validate", "target": "transform", "condition": "valid == True"},
        {"source": "validate", "target": "alert", "condition": "valid == False"},
        {"source": "transform", "target": "load"},
    ],
})
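To show how the conditional edges route a run, here is a small local sketch that picks the next nodes from the validation result. It is not the platform's engine: `next_nodes` and `condition_holds` are hypothetical helpers, and the assumption that every condition has the form `name == literal` comes only from the two conditions used above.

```python
import ast
from typing import Optional

EDGES = [
    {"source": "extract", "target": "validate"},
    {"source": "validate", "target": "transform", "condition": "valid == True"},
    {"source": "validate", "target": "alert", "condition": "valid == False"},
    {"source": "transform", "target": "load"},
]


def condition_holds(condition: Optional[str], context: dict) -> bool:
    """Evaluate a 'name == literal' condition; edges without one always apply."""
    if condition is None:
        return True
    name, _, literal = condition.partition("==")
    # literal_eval parses True/False/numbers/strings without executing code.
    return context.get(name.strip()) == ast.literal_eval(literal.strip())


def next_nodes(edges: list, source: str, context: dict) -> list:
    """Return the targets of all outgoing edges whose condition holds."""
    return [
        edge["target"]
        for edge in edges
        if edge["source"] == source and condition_holds(edge.get("condition"), context)
    ]


print(next_nodes(EDGES, "validate", {"valid": True}))   # ['transform']
print(next_nodes(EDGES, "validate", {"valid": False}))  # ['alert']
```

Because the two conditions are mutually exclusive, exactly one branch fires after the quality gate: the transform path on success, the alert path on failure.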