Vector Search¶

Vector search enables semantic queries over dataset records using embeddings. Instead of matching exact keywords, vector search finds records that are similar in meaning to a query -- powering RAG (Retrieval-Augmented Generation) pipelines, document retrieval, recommendation systems, and deduplication.

How Vector Search Works¶

graph LR
    Query["User Query<br/>'How do I reset my password?'"] --> Embed["Embedding Model<br/>text-embedding-3-small"]
    Embed --> Vector["Query Vector<br/>[0.12, -0.34, 0.56, ...]"]
    Vector --> Search["Vector Search<br/>(cosine similarity)"]
    Index["Vector Index<br/>(stored embeddings)"] --> Search
    Search --> Results["Top-K Results<br/>(nearest neighbors)"]

1. The user's query is converted to a vector using an embedding model.
2. The vector is compared against stored embeddings using a distance metric.
3. The most similar records are returned, ranked by distance.
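
The three steps can be sketched locally as a brute-force nearest-neighbor search. This is only an illustration of the idea, not how the platform implements its index: the helper names `cosine_distance` and `top_k` are hypothetical, and the toy vectors have 3 dimensions instead of 1536.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query, index, k):
    """Score every stored vector against the query and keep the k closest."""
    scored = [(cosine_distance(query, vec), record_id) for record_id, vec in index.items()]
    return sorted(scored)[:k]

# Toy stand-ins for stored embeddings (hypothetical values).
index = {
    "password-reset": [0.9, 0.1, 0.0],
    "billing-faq": [0.0, 0.2, 0.9],
    "login-help": [0.8, 0.3, 0.1],
}
query_vector = [1.0, 0.0, 0.0]  # would come from the embedding model

results = top_k(query_vector, index, k=2)
for distance, record_id in results:
    print(f"{record_id} (distance: {distance:.3f})")
```

A production index avoids scanning every vector, but the ranking it approximates is the same: smallest distance first.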

Embeddings¶

An embedding is a fixed-length array of numbers that represents the semantic meaning of text. Similar texts produce similar embeddings, which enables similarity-based retrieval.

Embedding Models¶

Manifest Platform supports embedding generation through the AI Gateway. Common models:

Model	Dimensions	Best For
`text-embedding-3-small`	1536	General-purpose, cost-effective
`text-embedding-3-large`	3072	Higher accuracy, larger index
`text-embedding-ada-002`	1536	Legacy compatibility

Model consistency

Always use the same embedding model for indexing and querying. Mixing models produces incompatible vectors and meaningless similarity scores.
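
One lightweight guard is to check vector length against the model's expected dimensionality before indexing or querying. The sketch below uses the dimensions from the table above; `MODEL_DIMENSIONS` and `check_embedding` are hypothetical names, not part of the SDK.

```python
# Dimensions taken from the model table above.
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def check_embedding(vector, model):
    """Raise ValueError if the vector length does not match the model's dimensions."""
    expected = MODEL_DIMENSIONS[model]
    if len(vector) != expected:
        raise ValueError(
            f"{model} produces {expected}-dimension vectors, got {len(vector)}"
        )
    return vector

ok = check_embedding([0.0] * 1536, "text-embedding-3-small")
```

A length check cannot tell `text-embedding-3-small` and `text-embedding-ada-002` apart, since both produce 1536 dimensions. Recording the model name next to each index or dataset is the more reliable safeguard.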

Creating Vector-Enabled Datasets¶

To use vector search, your dataset needs a column that stores embedding vectors. You can populate this column through connector sources or by generating embeddings from text fields.

Schema with Vector Field¶

Coming Soon

The Python SDK for local development is not yet publicly available.

from flow_sdk.cli_client import CLIClient

client = CLIClient(config)  # config: your SDK configuration object, created beforehand

dataset = client.datasets.create({
    "name": "Knowledge Base",
    "slug": "knowledge-base",
    "description": "Support articles with vector embeddings for semantic search",
    "tags": ["rag", "knowledge-base"],
})

# Note: Schema definitions (e.g., vector fields like "embedding: vector(1536)")
# are registered separately via the dataset schema API, not inline at creation time.

Storing Vectors via Connector Write Jobs¶

Use connector write jobs to insert records with embeddings:

result = sdk.connector_instances.submit_write_job(
    instance_id="uuid-of-postgres-instance",
    scope_type="organization",
    scope_id="uuid-of-org",
    operations=[
        {
            "kind": "vector",
            "index_id": "uuid-of-vector-index",
            "vectors": [
                {
                    "id": "article-001",
                    "values": [0.12, -0.34, 0.56, ...],  # 1536 dimensions
                    "metadata": {
                        "title": "How to Reset Your Password",
                        "category": "account",
                        "content": "To reset your password, go to Settings..."
                    }
                },
                {
                    "id": "article-002",
                    "values": [0.23, -0.45, 0.67, ...],
                    "metadata": {
                        "title": "Billing FAQ",
                        "category": "billing",
                        "content": "We accept all major credit cards..."
                    }
                }
            ]
        }
    ],
)

print(f"Write job submitted: {result.job_id}")
print(f"Poll status at: {result.poll_url}")
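
When records already live in application code, a small helper can shape them into the operation payload shown above. This is a sketch under assumptions: `build_vector_operation` is a hypothetical helper, and it assumes each record is a dict with `id` and `embedding` keys, with every other key treated as metadata.

```python
def build_vector_operation(index_id, records):
    """Shape records into one "vector" operation matching the payload above."""
    vectors = []
    for record in records:
        # Everything except the id and the embedding is carried as metadata.
        metadata = {k: v for k, v in record.items() if k not in ("id", "embedding")}
        vectors.append(
            {"id": record["id"], "values": list(record["embedding"]), "metadata": metadata}
        )
    return {"kind": "vector", "index_id": index_id, "vectors": vectors}

records = [
    {"id": "article-001", "embedding": [0.12, -0.34, 0.56], "category": "account"},
    {"id": "article-002", "embedding": [0.23, -0.45, 0.67], "category": "billing"},
]
operation = build_vector_operation("uuid-of-vector-index", records)
```

The resulting dict can be passed as one element of the `operations` list.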

Querying with Vector Search¶

Semantic Similarity Search¶

Find records semantically similar to a query vector:

result = sdk.datasets.query(
    dataset_id="uuid-of-dataset",
    vector_field="embedding",
    vector_query_embedding=[0.12, -0.34, 0.56, ...],  # query vector
    vector_top_k=10,
    vector_distance_metric="cosine",
)

for row in result.rows:
    print(f"  {row['title']} (distance: {row.get('_distance', 'N/A')})")

Distance Metrics¶

Metric	Description	When to Use
`cosine`	Cosine distance (1 - cosine similarity)	Default. Best for normalized embeddings
`l2`	Euclidean (L2) distance	When magnitude matters
`inner_product`	Dot product	When vector magnitude should influence ranking; for normalized vectors it ranks the same as cosine
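
To see how the three metrics relate, the following sketch computes them by hand for two small vectors. The function names are illustrative; the platform computes these server-side.

```python
import math

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    norms = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / norms

def normalize(v):
    length = math.sqrt(inner_product(v, v))
    return [x / length for x in v]

a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
b_scaled = [10.0 * x for x in b]  # same direction as b, ten times the magnitude

unit_cosine = cosine_distance(a, b)
scaled_cosine = cosine_distance(a, b_scaled)  # unchanged: cosine ignores magnitude
unit_l2 = l2_distance(a, b)
scaled_l2 = l2_distance(a, b_scaled)          # grows: L2 is sensitive to magnitude
```

For unit-length vectors the metrics agree on ranking: the inner product equals 1 minus the cosine distance, and the squared L2 distance equals twice the cosine distance. They diverge only once magnitudes differ.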

Combining Vector and Scalar Filters¶

You can filter results using both vector similarity and traditional field filters:

result = sdk.datasets.query(
    dataset_id="uuid-of-dataset",
    # Vector search
    vector_field="embedding",
    vector_query_embedding=[0.12, -0.34, 0.56, ...],
    vector_top_k=20,
    vector_distance_metric="cosine",
    # Scalar filter -- only search within "billing" category
    filters=[
        {"field": "category", "operator": "eq", "value": "billing"},
    ],
)

This first filters by category = "billing", then ranks the matching records by vector similarity.
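
The order of operations matters: a record outside the filter never appears, even if it is the closest vector overall. A local sketch of filter-then-rank, using hypothetical helper names and toy 2-dimensional vectors:

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def filter_then_rank(records, query, category, top_k):
    """Apply the scalar filter first, then rank the survivors by distance."""
    candidates = [r for r in records if r["category"] == category]
    ranked = sorted(candidates, key=lambda r: cosine_distance(query, r["embedding"]))
    return ranked[:top_k]

records = [
    {"id": "a1", "category": "billing", "embedding": [0.1, 0.9]},
    {"id": "a2", "category": "account", "embedding": [1.0, 0.0]},
    {"id": "a3", "category": "billing", "embedding": [0.9, 0.2]},
]
matches = filter_then_rank(records, query=[1.0, 0.0], category="billing", top_k=20)
```

Here `a2` is an exact match for the query vector but is excluded because it is not in the "billing" category.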

Vector-Based Sorting¶

Sort results by distance to a target vector using the vector_sort parameter:

result = sdk.datasets.query(
    dataset_id="uuid-of-dataset",
    sort_by="embedding",
    sort_order="asc",           # asc = nearest first
    vector_sort={
        "target_vector": [0.12, -0.34, 0.56, ...],
        "distance_metric": "cosine",
    },
    limit=10,
)

Distance-Based Filtering¶

Filter records by their distance from a target vector:

result = sdk.datasets.query(
    dataset_id="uuid-of-dataset",
    filters=[
        {
            "field": "embedding",
            "operator": "lte",
            "value": 0.3,
            "vector_distance": {
                "target_vector": [0.12, -0.34, 0.56, ...],
                "distance_metric": "cosine",
            },
        }
    ],
)

This returns only records within a cosine distance of 0.3 from the target vector.
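
Thresholds are often easier to reason about as similarities. Assuming cosine distance is defined as 1 minus cosine similarity, a distance of 0.3 corresponds to a similarity of 0.7. The helpers below are hypothetical conveniences that build a filter in the shape shown above.

```python
def max_distance_for_similarity(min_similarity):
    """Convert a minimum cosine similarity into a maximum cosine distance."""
    if not -1.0 <= min_similarity <= 1.0:
        raise ValueError("cosine similarity must be between -1 and 1")
    return 1.0 - min_similarity

def distance_filter(field, target_vector, max_distance, metric="cosine"):
    """Build a distance filter dict in the shape used in the example above."""
    return {
        "field": field,
        "operator": "lte",
        "value": max_distance,
        "vector_distance": {
            "target_vector": list(target_vector),
            "distance_metric": metric,
        },
    }

threshold = max_distance_for_similarity(0.7)
# Rounding removes floating-point noise such as 0.30000000000000004.
example_filter = distance_filter("embedding", [0.12, -0.34, 0.56], round(threshold, 6))
```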

Use Cases¶

RAG (Retrieval-Augmented Generation)¶

RAG is the most common vector search use case: an agent retrieves relevant context from a knowledge base before generating a response.

graph LR
    Q["User Question"] --> E1["Embed Query"]
    E1 --> VS["Vector Search<br/>Knowledge Base"]
    VS --> Context["Top-K Documents"]
    Context --> Agent["Agent"]
    Q --> Agent
    Agent --> Answer["Grounded Answer"]

A typical RAG flow:

1. The user asks a question.
2. The question is embedded using the same model as the index.
3. Vector search retrieves the 5-10 most relevant documents.
4. The retrieved documents are included in the agent's context.
5. The agent generates an answer grounded in the retrieved content.
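
The context-assembly step can be sketched as a plain string builder. The function name, the prompt wording, and the `title`/`content` keys are assumptions for illustration; adapt them to however your agent accepts context.

```python
def build_rag_prompt(question, documents, max_documents=5):
    """Number the retrieved documents and place them ahead of the question."""
    blocks = []
    for position, doc in enumerate(documents[:max_documents], start=1):
        blocks.append(f"[{position}] {doc['title']}\n{doc['content']}")
    context = "\n\n".join(blocks)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

retrieved = [
    {"title": "How to Reset Your Password", "content": "To reset your password, go to Settings..."},
    {"title": "Billing FAQ", "content": "We accept all major credit cards..."},
]
prompt = build_rag_prompt("How do I reset my password?", retrieved)
```

Numbering the documents gives the agent a simple way to cite which source supports each part of its answer.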

Semantic Deduplication¶

Find near-duplicate records by searching for vectors with very high similarity:

# For each record, search for similar records
result = sdk.datasets.query(
    dataset_id="uuid-of-dataset",
    vector_field="embedding",
    vector_query_embedding=record_embedding,
    vector_top_k=5,
    vector_distance_metric="cosine",
)

# Records with distance < 0.05 are likely duplicates
duplicates = [r for r in result.rows if r["_distance"] < 0.05 and r["id"] != record_id]
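
For a small batch that fits in memory, the same threshold can be applied locally before anything is written to the index. This sketch compares every pair, so its cost grows quadratically with the number of records; it is a different technique from the per-record index query above, and the helper names are hypothetical.

```python
import math
from itertools import combinations

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def find_duplicate_pairs(embeddings, threshold=0.05):
    """Return (id_a, id_b) pairs whose cosine distance is below the threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        if cosine_distance(vec_a, vec_b) < threshold:
            pairs.append((id_a, id_b))
    return pairs

embeddings = {
    "a": [1.0, 0.0],
    "b": [0.99, 0.05],  # nearly the same direction as "a"
    "c": [0.0, 1.0],
}
duplicate_pairs = find_duplicate_pairs(embeddings)
```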

Recommendation¶

Find items similar to a user's preferences by embedding user behavior and comparing against item embeddings:

# Embed the user's recent interactions
user_vector = embed(user_interaction_history)

# Find similar items
result = sdk.datasets.query(
    dataset_id="product-catalog",
    vector_field="product_embedding",
    vector_query_embedding=user_vector,
    vector_top_k=20,
    filters=[
        {"field": "in_stock", "operator": "eq", "value": True},
    ],
)
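
The `embed` call above is a placeholder. One simple way to build a user vector, shown here as an assumption rather than the platform's method, is to average the embeddings of items the user recently interacted with. `mean_vector` is a hypothetical helper.

```python
def mean_vector(vectors):
    """Average equal-length vectors component by component."""
    if not vectors:
        raise ValueError("at least one vector is required")
    dims = len(vectors[0])
    if any(len(v) != dims for v in vectors):
        raise ValueError("all vectors must have the same number of dimensions")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

# Toy 2-dimensional item embeddings for two recently viewed products.
recent_item_embeddings = [[1.0, 0.0], [0.0, 1.0]]
user_vector = mean_vector(recent_item_embeddings)
```

Because the average lives in the same space as the item embeddings, it can be passed directly as `vector_query_embedding`.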

Vector API¶

The vector index API provides direct REST access for managing indexes and vectors, independent of the dataset query layer. Paths in the tables below are relative to /orgs/{org_id}, so /vector/indexes resolves to /orgs/{org_id}/vector/indexes.

Index Endpoints¶

Method	Path	Description
`POST`	`/vector/indexes`	Create a new vector index
`GET`	`/vector/indexes`	List indexes (paginated)
`GET`	`/vector/indexes/{index_id}`	Get index details and metadata
`DELETE`	`/vector/indexes/{index_id}`	Delete an index and all its vectors

Create Index¶

import httpx

response = httpx.post(
    f"https://api.flow.marut.cloud/api/v1/orgs/{org_id}/vector/indexes",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "name": "knowledge-base",
        "dimensions": 1536,
        "metric": "cosine",
        "workspace_id": "uuid-of-workspace",
        "description": "Support articles for RAG",
        "embedding_model": "text-embedding-3-small",
    },
)
index = response.json()
print(f"Created index: {index['id']}")

Field	Type	Required	Description
`name`	string	Yes	Index name (1-200 chars)
`dimensions`	integer	Yes	Vector dimensionality, must match embedding model (1-4096)
`metric`	string	No	Distance metric: `cosine` (default), `l2`, `inner_product`
`workspace_id`	UUID	No	Scope index to a specific workspace
`description`	string	No	Human-readable description
`embedding_model`	string	No	Embedding model used (e.g., `text-embedding-3-small`)
`metadata_schema`	object	No	JSON schema for vector metadata fields
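
The constraints in the table can be checked client-side before sending the request. This is a sketch: `validate_index_request` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
ALLOWED_METRICS = {"cosine", "l2", "inner_product"}

def validate_index_request(payload):
    """Return a list of problems found in a create-index payload (empty if none)."""
    errors = []
    name = payload.get("name")
    if not isinstance(name, str) or not 1 <= len(name) <= 200:
        errors.append("name must be a string of 1-200 characters")
    dimensions = payload.get("dimensions")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(dimensions, bool) or not isinstance(dimensions, int) or not 1 <= dimensions <= 4096:
        errors.append("dimensions must be an integer between 1 and 4096")
    if payload.get("metric", "cosine") not in ALLOWED_METRICS:
        errors.append("metric must be one of: cosine, l2, inner_product")
    return errors

good = validate_index_request({"name": "knowledge-base", "dimensions": 1536})
bad = validate_index_request({"name": "", "dimensions": 8192, "metric": "manhattan"})
```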

Vector Endpoints¶

Method	Path	Description
`POST`	`/vector/indexes/{index_id}/upsert`	Insert or update vectors
`POST`	`/vector/indexes/{index_id}/search`	Semantic similarity search
`POST`	`/vector/indexes/{index_id}/delete`	Delete vectors by ID

Upsert Vectors¶

response = httpx.post(
    f"https://api.flow.marut.cloud/api/v1/orgs/{org_id}/vector/indexes/{index_id}/upsert",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "vectors": [
            {
                "id": "article-001",
                "embedding": [0.12, -0.34, 0.56, ...],  # 1536 floats
                "content": "To reset your password, go to Settings...",
                "metadata": {"category": "account", "title": "Password Reset"},
            },
        ]
    },
)
print(response.json())  # {"upserted": 1}

Field	Type	Required	Description
`vectors`	array	Yes	List of vector objects
`vectors[].id`	string	No	Stable identifier; auto-generated if omitted
`vectors[].embedding`	float[]	Yes	The embedding vector (must match index dimensions)
`vectors[].content`	string	No	Source text associated with the vector
`vectors[].metadata`	object	No	Arbitrary key-value metadata for filtering
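
When indexing many vectors, splitting them across several upsert requests keeps each request body manageable. The source does not state a per-request limit, so the batch size below is an arbitrary assumption, and `batched` is a hypothetical helper.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Seven toy vectors (3 dimensions here; a real index expects its configured size).
vectors = [
    {"id": f"article-{i:03d}", "embedding": [0.0, 0.0, 0.0]}
    for i in range(1, 8)
]

# Each payload has the same shape as the upsert request body above.
payloads = [{"vectors": batch} for batch in batched(vectors, batch_size=3)]
```

Each payload would then be posted to the upsert endpoint in turn.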

Search Vectors¶

response = httpx.post(
    f"https://api.flow.marut.cloud/api/v1/orgs/{org_id}/vector/indexes/{index_id}/search",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "query_embedding": [0.12, -0.34, 0.56, ...],
        "top_k": 5,
        "filter_metadata": {"category": "account"},
    },
)
for result in response.json()["results"]:
    print(f"  {result['id']} (distance: {result['distance']})")

Field	Type	Required	Description
`query_embedding`	float[]	Yes	The query vector to search against
`top_k`	integer	No	Number of nearest neighbors to return (default: 10, max: 1000)
`filter_metadata`	object	No	Metadata key/value pairs to pre-filter candidates before ranking

Delete Vectors¶

httpx.post(
    f"https://api.flow.marut.cloud/api/v1/orgs/{org_id}/vector/indexes/{index_id}/delete",
    headers={"Authorization": f"Bearer {token}"},
    json={"vector_ids": ["article-001", "article-002"]},
)

Best Practices¶

Embedding Quality¶

Chunk text appropriately -- For long documents, split into paragraphs or sections (300-500 tokens each). Embedding an entire 10-page document into one vector loses detail.
Include metadata in chunks -- Prepend the document title or section header to each chunk before embedding for better retrieval.
Normalize consistently -- Use the same preprocessing (lowercasing, whitespace normalization) during both indexing and querying.
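
The three practices can be combined in one preprocessing step. This sketch makes simplifying assumptions: it counts whitespace-separated words as a rough stand-in for tokens, and it splits at fixed word counts rather than at paragraph boundaries. Both function names are hypothetical.

```python
def normalize(text):
    """Lowercase and collapse runs of whitespace into single spaces."""
    return " ".join(text.lower().split())

def chunk_document(title, text, max_words=300):
    """Split text into chunks of at most max_words, each prefixed with the title."""
    words = normalize(text).split()
    heading = normalize(title)
    chunks = []
    for start in range(0, len(words), max_words):
        body = " ".join(words[start:start + max_words])
        chunks.append(f"{heading}\n\n{body}")
    return chunks

chunks = chunk_document("Password Reset", "Word " * 650, max_words=300)
```

Apply the same `normalize` function to queries before embedding them so that indexed text and query text are preprocessed identically.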

Index Performance¶

Choose dimensions wisely -- text-embedding-3-small (1536 dims) is a good default. Only use text-embedding-3-large (3072 dims) if you need higher accuracy and can afford the storage/latency cost.
Use top_k judiciously -- Retrieving 5-10 results is usually sufficient for RAG. Larger top_k values increase latency without proportional quality improvement.
Combine with scalar filters -- Pre-filter on metadata fields (category, date, status) before vector ranking to reduce the search space and improve relevance.

Cost Management¶

Cache embeddings -- Embedding the same text repeatedly wastes API calls. Store computed embeddings alongside the source text.
Batch embedding requests -- When indexing many documents, batch them into single API calls (most providers support this).
Monitor index size -- Each vector consumes storage proportional to its dimensionality. Assuming 4-byte floats, a million 1536-dimension vectors use roughly 6 GB of raw storage (1,000,000 x 1536 x 4 bytes is about 6.1 GB), before any index overhead.
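
Caching and batching work well together: look up each text by a content hash, then send only the misses to the embedding provider in a single call. In this sketch `EmbeddingCache` is a hypothetical in-memory helper, and `fake_embed_batch` stands in for a real provider call.

```python
import hashlib

class EmbeddingCache:
    """In-memory cache keyed by a SHA-256 hash of the text."""

    def __init__(self, embed_batch):
        self._embed_batch = embed_batch  # callable: list of texts -> list of vectors
        self._store = {}

    @staticmethod
    def _key(text):
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def get_many(self, texts):
        # Collect texts that are not cached yet, without repeating duplicates.
        missing, seen = [], set()
        for text in texts:
            key = self._key(text)
            if key not in self._store and key not in seen:
                seen.add(key)
                missing.append(text)
        if missing:
            # One batched call covers every cache miss.
            for text, vector in zip(missing, self._embed_batch(missing)):
                self._store[self._key(text)] = vector
        return [self._store[self._key(text)] for text in texts]

calls = []

def fake_embed_batch(texts):
    """Stand-in for a provider call; records each batch it receives."""
    calls.append(list(texts))
    return [[float(len(t))] for t in texts]

cache = EmbeddingCache(fake_embed_batch)
first = cache.get_many(["a", "bb", "a"])
second = cache.get_many(["bb", "ccc"])
```

A persistent store, such as a column next to the source text, would replace the dictionary in practice.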

Vector dimensions must match

All vectors in an index must have the same number of dimensions. You cannot mix embeddings from different models in the same vector field. If you switch embedding models, you must re-embed all existing records.