Infrastructure Configuration

Detailed reference for every orchid.yml property — runtime, auth, LLM, RAG, storage, MCP, tracing.

Index

agents

llm

auth

startup

rag

cli_rag (optional — CLI-specific RAG override)

upload

storage

config_storage

mcp_auth

checkpointer

tracing


This page documents every configuration property in depth. For a quick, searchable table of all keys, see the Configuration Atlas.

Each section below covers one config domain. Properties are grouped by file: orchid.yml for infrastructure, agents.yaml for agent behavior. Every entry includes a short description, a detailed explanation, and the default value. Where applicable, it also lists the environment variable, available values, caveats that affect performance or security, and examples in YAML and Markdown formats.


orchid.yml — Infrastructure Configuration

Runtime and infrastructure keys live in orchid.yml. In Markdown mode these are the frontmatter of orchid.md. Every key maps to a flat environment variable via YAML_TO_ENV. When the same variable is set in both the config file and the environment, the environment wins.
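
The precedence rule above can be sketched as follows. This is a minimal illustration, not Orchid's loader: the resolve helper is hypothetical, and the mapping is a three-entry excerpt assembled from env var names listed on this page rather than the real YAML_TO_ENV table.

```python
import os

# Hypothetical excerpt of a YAML_TO_ENV-style mapping; the real table may differ.
YAML_TO_ENV = {
    "llm.model": "LITELLM_MODEL",
    "rag.qdrant_url": "QDRANT_URL",
    "upload.max_size_mb": "UPLOAD_MAX_SIZE_MB",
}


def resolve(key, file_config, environ=None):
    """Return the effective value for a dotted key; the environment wins over the file."""
    environ = os.environ if environ is None else environ
    env_name = YAML_TO_ENV.get(key)
    if env_name and env_name in environ:
        return environ[env_name]
    # Fall back to the nested file config, walking it with the dotted key.
    node = file_config
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


file_config = {"llm": {"model": "ollama/llama3.2"}}
print(resolve("llm.model", file_config, environ={}))
print(resolve("llm.model", file_config, environ={"LITELLM_MODEL": "openai/gpt-4o"}))
```

Values read from the environment arrive as strings, so a real loader would also need to coerce types for numeric or boolean keys.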

agents.config_path

Short: Path to the agents configuration file or directory.
Detailed: Controls where Orchid loads agent definitions from. In YAML mode this is agents.yaml. In hybrid mode (YAML infrastructure + Markdown agents) this points to a directory like agents/ containing *.md files.
Default: agents.yaml
Available values: Any file path (agents.yaml, agents.yml) or directory path (agents/)
Env var: AGENTS_CONFIG_PATH
# orchid.yml
agents:
  config_path: agents.yaml
---
# orchid.md frontmatter
agents:
  config_path: agents/
---

llm.model

Short: Default LLM model for all completions.
Detailed: The model string in LiteLLM provider/model-name format. This is the primary model used for agent reasoning, summarisation, and routing when no per-agent or per-supervisor override is set.
Default: ollama/llama3.2
Available values: Any LiteLLM-compatible string: gemini/gemini-2.5-flash, openai/gpt-4o, anthropic/claude-sonnet-4-20250514, groq/llama-3.3-70b-versatile, ollama/llama3.2
Env var: LITELLM_MODEL

Production deployments

Always pair a cloud-hosted primary with a fallback_model so outages degrade gracefully instead of surfacing errors to users.

# orchid.yml
llm:
  model: gemini/gemini-2.5-flash
---
# orchid.md frontmatter
llm:
  model: openai/gpt-4o
---

llm.ollama_api_base

Short: Base URL for the local Ollama server.
Detailed: The HTTP endpoint where Ollama serves models. Only consulted when the model string starts with ollama/.
Default: null (uses the Ollama default, http://localhost:11434)
Available values: Any HTTP URL
Env var: OLLAMA_API_BASE
# orchid.yml
llm:
  ollama_api_base: http://host.docker.internal:11434
---
# orchid.md frontmatter
llm:
  ollama_api_base: http://ollama:11434
---

llm.openai_api_key

Short: API key for OpenAI models.
Detailed: Required when using OpenAI-hosted models or OpenAI embedding models. Supply it through the environment variable and never store the literal key in a config file.
Default: null
Env var: OPENAI_API_KEY
llm:
  openai_api_key: "${OPENAI_API_KEY}"

llm.anthropic_api_key

Short: API key for Anthropic Claude models.
Detailed: Required when using Claude models via LiteLLM.
Default: null
Env var: ANTHROPIC_API_KEY

llm.gemini_api_key

Short: API key for Google Gemini models.
Detailed: Required when using Gemini models or Gemini embeddings.
Default: null
Env var: GEMINI_API_KEY

llm.groq_api_key

Short: API key for Groq-hosted models.
Detailed: Required when routing through Groq's inference API.
Default: null
Env var: GROQ_API_KEY

auth.dev_bypass

Short: Skip all authentication. For local development only.
Detailed: When enabled, the identity resolver is bypassed entirely. Every request is treated as authenticated with a default identity. This removes all authorization checks.
Default: false
Available values: true, false
Env var: DEV_AUTH_BYPASS

Security

Never set this in production, staging, or any deployed environment. Any caller can impersonate any user. Only use on a trusted local development machine.

auth:
  dev_bypass: false

auth.identity_resolver_class

Short: Dotted import path to an OrchidIdentityResolver subclass.
Detailed: The resolver extracts OrchidAuthContext (tenant key, user ID, bearer token) from incoming requests. The default resolver validates JWTs against the configured auth.domain. Custom implementations can integrate with any identity provider.
Default: null
Available values: Any dotted Python path to an OrchidIdentityResolver subclass
Env var: IDENTITY_RESOLVER_CLASS
# orchid.yml
auth:
  identity_resolver_class: myapp.auth.CustomIdentityResolver
---
# orchid.md frontmatter
auth:
  identity_resolver_class: myapp.auth.CustomIdentityResolver
---
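
Several keys on this page take a dotted import path to a class or function. A minimal sketch of how such a path could be resolved is shown here, assuming the usual importlib approach; load_class is a hypothetical helper, not Orchid's actual loader, and a standard-library class stands in for myapp.auth.CustomIdentityResolver.

```python
import importlib


def load_class(dotted_path):
    """Import 'package.module.Name' and return the attribute called Name."""
    module_path, _, attr = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"not a dotted path: {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, attr)


# collections.OrderedDict stands in for a user-supplied resolver class.
cls = load_class("collections.OrderedDict")
print(cls.__name__)
```

With this approach, a typo in the configured path surfaces as an ImportError or AttributeError at load time.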

auth.auth_config_provider_class

Short: Dotted import path to an OrchidAuthConfigProvider subclass.
Detailed: Enables the GET /auth-info endpoint, which exposes OAuth client metadata, scopes, and endpoints to the frontend. The provider generates the JSON payload dynamically from runtime configuration.
Default: null
Env var: AUTH_CONFIG_PROVIDER_CLASS

auth.auth_exchange_client_class

Short: Dotted import path to an OrchidAuthExchangeClient subclass.
Detailed: Enables the POST /auth/exchange-code endpoint for OAuth authorization-code exchange. Used when the frontend needs to trade an auth code for tokens via the Orchid API rather than directly against the IdP.
Default: null
Env var: AUTH_EXCHANGE_CLIENT_CLASS

auth.domain

Short: Default domain used for identity resolution.
Detailed: The OAuth or identity-provider domain. Used by the default identity resolver to construct authorization URLs and validate tokens.
Default: null
Env var: AUTH_DOMAIN

auth.oauth_client_id_env

Short: Name of the environment variable holding the public OAuth client_id.
Detailed: Rather than hardcoding the client ID, the config references an env var name. This keeps the client ID out of version control while still making it discoverable via GET /auth-info.
Default: null
Env var: AUTH_OAUTH_CLIENT_ID_ENV
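
The indirection described above (the config holds a variable name, not the value) can be illustrated with a short sketch. The helper resolve_client_id and the variable name MYAPP_OAUTH_CLIENT_ID are hypothetical and used only for illustration; how Orchid performs the lookup internally is not documented here.

```python
import os


def resolve_client_id(oauth_client_id_env, environ=None):
    """Look up the client ID in the env var whose name the config provides."""
    environ = os.environ if environ is None else environ
    if not oauth_client_id_env:
        return None  # key unset: there is no client ID to advertise
    return environ.get(oauth_client_id_env)


# The config stores only the variable name; the value lives in the environment.
fake_env = {"MYAPP_OAUTH_CLIENT_ID": "abc123"}
print(resolve_client_id("MYAPP_OAUTH_CLIENT_ID", environ=fake_env))
```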

auth.oauth_scope

Short: Advertised OAuth scope for downstream clients.
Detailed: The scope string returned by GET /auth-info so the frontend knows what permissions to request during the OAuth flow.
Default: null
Env var: AUTH_OAUTH_SCOPE

startup.hook

Short: Dotted import path to a startup hook function.
Detailed: Called once after the LangGraph is initialised and before the API starts serving requests. Use this for one-time setup: warming caches, registering custom tools, or running database migrations. The function signature must be async def hook(app_context) -> None.
Default: null
Available values: Any dotted Python path to an async callable
Env var: STARTUP_HOOK
startup:
  hook: myapp.bootstrap.on_startup
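
A hook matching the documented signature might look like the sketch below. The attributes of app_context are not described on this page, so a plain dict stands in for it and the cache_warmed flag is purely illustrative. The asyncio.run call only simulates the framework awaiting the hook once.

```python
import asyncio


async def on_startup(app_context) -> None:
    """One-time setup that runs before the API starts serving requests."""
    # Assumption: a dict stands in for the real app_context object.
    app_context["cache_warmed"] = True


# Simulate the framework awaiting the hook once at startup.
context = {}
asyncio.run(on_startup(context))
print(context)
```

Placed in myapp/bootstrap.py, this function would be referenced as myapp.bootstrap.on_startup, matching the example above.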

rag.vector_backend

Short: Vector database backend type.
Detailed: The persistence layer for embeddings and vector search. Qdrant is the only backend listed for this key; the cli_rag section on this page additionally uses chroma for orchid-cli. Setting this to null disables vector storage entirely.
Default: qdrant
Available values: qdrant, null
Env var: VECTOR_BACKEND
rag:
  vector_backend: qdrant

rag.qdrant_url

Short: Qdrant server URL.
Detailed: The HTTP/gRPC endpoint for the Qdrant vector database. The default assumes a Docker Compose network where Qdrant runs in a service named qdrant.
Default: http://qdrant:6333
Env var: QDRANT_URL

rag.embedding_model

Short: Embedding model for document vectorisation.
Detailed: The model string in LiteLLM format. Determines the vector dimensionality and quality of semantic search. Changing this model requires re-indexing all documents because vector dimensions differ between models.
Default: text-embedding-3-small
Available values: text-embedding-3-small (1536-d), nomic-embed-text (768-d), gemini-embedding-001 (3072-d), any LiteLLM embedding model
Env var: EMBEDDING_MODEL

Switching embedding models

Changing the embedding model changes the vector dimensionality. Existing Qdrant collections must be dropped and all documents re-ingested. Do not change this on a live production instance without planning a migration window.

rag:
  embedding_model: nomic-embed-text
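
A pre-flight check for the dimensionality problem described above could look like the following sketch. The dimension_mismatch helper is hypothetical, and the table only restates the vector sizes listed under Available values. Note that a matching size does not make vectors from different models comparable, so documents still need re-ingesting after any model change.

```python
# Vector sizes as listed under "Available values" for rag.embedding_model.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "nomic-embed-text": 768,
    "gemini-embedding-001": 3072,
}


def dimension_mismatch(collection_dim, embedding_model):
    """Return True when the model's vector size differs from the collection's."""
    model_dim = EMBEDDING_DIMS.get(embedding_model)
    if model_dim is None:
        raise KeyError(f"unknown embedding model: {embedding_model!r}")
    return model_dim != collection_dim


# A collection built with text-embedding-3-small cannot accept nomic-embed-text vectors.
print(dimension_mismatch(1536, "nomic-embed-text"))
```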

rag.openai_api_key

Short: OpenAI API key used by the embedding model.
Detailed: Required when embedding_model is an OpenAI model (e.g. text-embedding-3-small). Can be the same as llm.openai_api_key.
Default: null
Env var: OPENAI_API_KEY

rag.gemini_api_key

Short: Google AI API key used by the embedding model.
Detailed: Required when embedding_model is a Gemini embedding model.
Default: null
Env var: GEMINI_API_KEY

cli_rag (optional)

CLI-specific RAG override. When present, orchid-cli uses these values instead of rag:. The API ignores this section entirely. This allows Docker-based examples (with rag.vector_backend: qdrant) to run locally via the CLI without requiring Qdrant infrastructure.

cli_rag accepts the same keys as rag:

| Key | Type | Default | Env var |
| --- | --- | --- | --- |
| cli_rag.vector_backend | string | (inherits from rag:) | VECTOR_BACKEND |
| cli_rag.qdrant_url | string | (inherits from rag:) | QDRANT_URL |
| cli_rag.embedding_model | string | (inherits from rag:) | EMBEDDING_MODEL |
| cli_rag.openai_api_key | string | (inherits from rag:) | OPENAI_API_KEY |
| cli_rag.gemini_api_key | string | (inherits from rag:) | GEMINI_API_KEY |

rag:
  vector_backend: qdrant           # used by orchid-api (Docker)
  qdrant_url: http://qdrant:6333
  embedding_model: gemini/gemini-embedding-001

cli_rag:
  vector_backend: chroma           # used by orchid-cli (local, on-disk)
  embedding_model: ollama/nomic-embed-text

Precedence

CLI args > env vars > cli_rag: (if present) > rag: > CLI defaults (chroma).
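
The precedence chain can be read as a first-match lookup across layers. The sketch below is an illustration under that assumption; resolve_rag_setting and the layer dictionaries are hypothetical names, not part of orchid-cli.

```python
# Env var names taken from the cli_rag table on this page.
ENV_NAMES = {
    "vector_backend": "VECTOR_BACKEND",
    "qdrant_url": "QDRANT_URL",
    "embedding_model": "EMBEDDING_MODEL",
}


def resolve_rag_setting(key, cli_args, environ, cli_rag, rag, cli_defaults):
    """Return the first value found, checking layers from highest to lowest precedence."""
    env_name = ENV_NAMES.get(key)
    layers = [
        cli_args.get(key),
        environ.get(env_name) if env_name else None,
        (cli_rag or {}).get(key),  # cli_rag is optional
        rag.get(key),
        cli_defaults.get(key),
    ]
    for value in layers:
        if value is not None:
            return value
    return None


rag = {
    "vector_backend": "qdrant",
    "qdrant_url": "http://qdrant:6333",
    "embedding_model": "gemini/gemini-embedding-001",
}
cli_rag = {"vector_backend": "chroma", "embedding_model": "ollama/nomic-embed-text"}
cli_defaults = {"vector_backend": "chroma"}

print(resolve_rag_setting("vector_backend", {}, {}, cli_rag, rag, cli_defaults))
print(resolve_rag_setting("qdrant_url", {}, {}, cli_rag, rag, cli_defaults))
```

In this sketch, a key missing from cli_rag falls through to rag, which mirrors the "inherits from rag:" default in the table.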


upload.vision_model

Short: Vision model for image and PDF OCR.
Detailed: The model used to extract text from images and PDF pages. If unset, image uploads are rejected. Must be a vision-capable model in LiteLLM format.
Default: null
Available values: Any vision-capable LiteLLM model: ollama/minicpm-v, gemini/gemini-2.5-flash, openai/gpt-4o
Env var: VISION_MODEL
upload:
  vision_model: ollama/minicpm-v

upload.namespace

Short: Qdrant namespace for uploaded documents.
Detailed: The collection or namespace where uploaded files are indexed. Kept separate from agent-specific RAG namespaces so uploads do not collide with programmatic ingestion.
Default: uploads
Env var: UPLOAD_NAMESPACE

upload.max_size_mb

Short: Maximum upload file size in megabytes.
Detailed: Files larger than this are rejected at the API layer before parsing begins.
Default: 20
Env var: UPLOAD_MAX_SIZE_MB

upload.chunk_size

Short: Default text chunk size in characters.
Detailed: Documents are split into chunks of this size before embedding. Larger chunks preserve more context per chunk but reduce granularity in retrieval.
Default: 1000
Env var: CHUNK_SIZE

upload.chunk_overlap

Short: Character overlap between consecutive chunks.
Detailed: Overlap repeats the tail of each chunk at the start of the next, which reduces the chance that a sentence or concept at a chunk boundary is cut off in both chunks. A good rule of thumb is 10–20% of chunk size.
Default: 200
Env var: CHUNK_OVERLAP

Chunk sizing

Larger chunk_size (2000–4000) improves retrieval coherence for long-form documents but increases embedding cost and storage. Smaller sizes (500–1000) improve precision for keyword-sparse queries but may fragment related concepts. Always pair chunk size with chunk_overlap of 10–20%.
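
To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal character-based splitter using the documented defaults. It is a sketch of the general sliding-window technique, not Orchid's actual chunker, and chunk_text is a hypothetical name.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character chunks that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2500))
pieces = chunk_text(sample)
print([len(p) for p in pieces])
```

With the defaults, each window advances 800 characters, so the last 200 characters of one chunk reappear at the start of the next.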


storage.class

Short: Dotted import path to an OrchidChatStorage subclass.
Detailed: The persistence backend for chat sessions and messages. The built-in SQLite backend is sufficient for single-process deployments. For multi-replica API deployments, switch to PostgreSQL or a custom backend so all instances share state.
Default: orchid_ai.persistence.sqlite.OrchidSQLiteChatStorage
Available values: orchid_ai.persistence.sqlite.OrchidSQLiteChatStorage, orchid_ai.persistence.postgres.OrchidPostgresChatStorage, or any dotted path to a custom subclass
Env var: CHAT_STORAGE_CLASS

Single-process vs multi-replica

SQLite works for demos and CLI tools where only one process accesses the database. PostgreSQL (or any shared backend) is mandatory for horizontally-scaled API deployments. Mixing SQLite across multiple API replicas will cause data inconsistency and lost messages.

# orchid.yml
storage:
  class: orchid_ai.persistence.postgres.OrchidPostgresChatStorage
  dsn: postgresql://orchid:orchid@postgres:5432/orchid
---
# orchid.md frontmatter
storage:
  class: orchid_ai.persistence.sqlite.OrchidSQLiteChatStorage
  dsn: ~/.orchid/chats.db
---

storage.dsn

Short: Database connection string or file path.
Detailed: For SQLite this is a file path (supports ~ expansion). For PostgreSQL this is a postgresql:// URI.
Default: ~/.orchid/chats.db
Env var: CHAT_DB_DSN
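
The two DSN forms can be told apart by their prefix. The sketch below shows one way to do that and to apply ~ expansion to file paths. normalize_dsn is a hypothetical helper, and Orchid's own handling may differ.

```python
import os


def normalize_dsn(dsn):
    """Classify a DSN and expand ~ for file paths; returns (kind, normalized_dsn)."""
    if dsn.startswith("postgresql://"):
        return ("postgres", dsn)
    # Anything else is treated as a SQLite file path in this sketch.
    return ("sqlite", os.path.expanduser(dsn))


print(normalize_dsn("postgresql://orchid:orchid@postgres:5432/orchid")[0])
print(normalize_dsn("~/.orchid/chats.db")[0])
```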

storage.extra_migrations_package

Short: Dotted package path for consumer-supplied migrations.
Detailed: If your custom storage backend has its own Alembic migrations, reference the package here so they run alongside Orchid's built-in migrations on startup.
Default: null
Env var: CHAT_EXTRA_MIGRATIONS_PACKAGE

config_storage.enabled

Short: Enable database-backed agent configuration store.
Detailed: When enabled, Orchid loads agent configurations from a PostgreSQL database at startup and merges them into the YAML-loaded config. This allows runtime CRUD management of agent definitions without editing YAML files. The store is controlled declaratively; no constructor parameters are needed.
Default: false
Available values: true, false

Zero overhead when disabled

When enabled: false (the default), Orchid skips config storage entirely. No database connections are opened, no queries run, and no memory is allocated for the store.

config_storage:
  enabled: true
  class: orchid_ai.persistence.config_postgres.OrchidPostgresConfigStorage
  dsn: postgresql://orchid:orchid@postgres:5432/orchid

config_storage.class

Short: Dotted import path to an OrchidConfigStorage subclass.
Detailed: The persistence backend for agent configurations. The built-in PostgreSQL implementation (OrchidPostgresConfigStorage) provides full CRUD: list_configs, get_config, upsert_config, patch_config, delete_config. Custom implementations can target any database by subclassing OrchidConfigStorage.
Default: "" (empty; ignored when enabled: false)
Available values: orchid_ai.persistence.config_postgres.OrchidPostgresConfigStorage, or any dotted path to a custom subclass

config_storage.dsn

Short: Database connection string for agent config storage.
Detailed: For PostgreSQL this is a postgresql:// URI. The agent_configs table is created automatically via the shared migration system when init_db() runs.
Default: "" (empty; ignored when enabled: false)

YAML/DB collision

By default (strict=True), an agent name that exists in both YAML and the database causes a startup error. This prevents silent configuration conflicts. Set strict=False on merge_from_db() for deep-merge semantics where DB entries overlay YAML.
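
The difference between the two modes can be sketched as below. This is an illustration of the described semantics only: merge_agents and deep_merge are hypothetical stand-ins, not the real merge_from_db() implementation, and the agent fields are invented for the example.

```python
import copy


def deep_merge(base, overlay):
    """Return a copy of base with overlay applied recursively; overlay wins on conflicts."""
    result = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def merge_agents(yaml_agents, db_agents, strict=True):
    """Merge DB agent configs into YAML ones; in strict mode a shared name is an error."""
    collisions = sorted(set(yaml_agents) & set(db_agents))
    if strict and collisions:
        raise ValueError(f"agents defined in both YAML and DB: {collisions}")
    return deep_merge(yaml_agents, db_agents)


yaml_agents = {"support": {"model": "ollama/llama3.2", "tools": {"search": True}}}
db_agents = {"support": {"tools": {"search": False}}}
print(merge_agents(yaml_agents, db_agents, strict=False))
```

In non-strict mode the DB entry changes only the fields it specifies, and the YAML-defined model is kept.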


mcp_auth.token_store_class

Short: Dotted import path to an OrchidMCPTokenStore subclass.
Detailed: Stores per-user OAuth tokens for MCP servers configured with auth.mode: oauth. The SQLite backend shares the same database file as chat storage by default.
Default: orchid_ai.persistence.mcp_token_sqlite.OrchidSQLiteMCPTokenStore
Env var: MCP_TOKEN_STORE_CLASS

mcp_auth.token_store_dsn

Short: Database DSN for per-user MCP OAuth tokens.
Detailed: Can share the same SQLite file as chat storage or use a separate connection.
Default: ~/.orchid/chats.db
Env var: MCP_TOKEN_STORE_DSN

mcp_auth.client_registration_store_class

Short: Dotted import path to an OrchidMCPClientRegistrationStore subclass.
Detailed: Stores per-server OAuth endpoint metadata and dynamic client registration (DCR) credentials. Required when using MCP servers with auth.mode: oauth.
Default: orchid_ai.persistence.mcp_client_registration_sqlite.OrchidSQLiteMCPClientRegistrationStore
Env var: MCP_CLIENT_REGISTRATION_STORE_CLASS

mcp_auth.client_registration_store_dsn

Short: Database DSN for MCP client registration data.
Default: ~/.orchid/chats.db
Env var: MCP_CLIENT_REGISTRATION_STORE_DSN

checkpointer.type

Short: LangGraph state persistence backend.
Detailed: Stores LangGraph graph state so that conversations can survive process restarts. Without a checkpointer, in-flight conversations lose their graph state when the process restarts. memory keeps state in RAM only, so it is also lost on restart; sqlite and postgres provide durable persistence. A dotted class path enables custom backends.
Default: null (disabled)
Available values: memory, sqlite, postgres, or any dotted Python path to a BaseCheckpointSaver subclass
Env var: CHECKPOINTER_TYPE
checkpointer:
  type: sqlite
  dsn: ~/.orchid/checkpoints.db

checkpointer.dsn

Short: Connection string or file path for the checkpointer.
Detailed: For sqlite this is a file path. For postgres this is a postgresql:// URI.
Default: null
Env var: CHECKPOINTER_DSN

State loss without checkpointer

If checkpointer.type is null and the API process restarts, all in-progress conversations lose their graph state. Users will see errors or stale responses. Always configure a durable checkpointer (sqlite minimum) for production deployments.


tracing.langsmith_tracing

Short: Enable LangSmith tracing.
Detailed: Sends LangGraph execution traces to LangSmith for debugging, latency analysis, and prompt inspection.
Default: false
Available values: true, false
Env var: LANGSMITH_TRACING

tracing.langsmith_api_key

Short: LangSmith API key.
Detailed: Required when langsmith_tracing is enabled. Injected via environment variable.
Default: null
Env var: LANGSMITH_API_KEY

tracing.langsmith_project

Short: LangSmith project name.
Detailed: Groups traces under a named project in the LangSmith dashboard.
Default: agents
Env var: LANGSMITH_PROJECT