Configuration Atlas
Searchable table of every configuration key across orchid.yml and agents.yaml — filter by name, type, or description.
This page provides a searchable, filterable table of every configuration key across both config formats. Click any key to jump to its detailed reference page — Infrastructure for orchid.yml keys, Agents for agents.yaml keys.
Orchid supports two configuration formats with identical capabilities. Choose the one that fits your workflow — or mix both in hybrid mode.
YAML uses two files: orchid.yml for infrastructure and agents.yaml for agent behavior. Every orchid.yml key maps to an environment variable, and a variable set in the real environment always takes precedence over the file. This is the original format, well suited to programmatic generation and CI pipelines.
Markdown uses orchid.md for infrastructure and agents/*.md for per-agent configs. Each agent file has YAML frontmatter for structured fields and a Markdown body that becomes the system prompt directly — no YAML string escaping or multi-line block scalars. An on-demand SHA-256 watcher detects file changes and rebuilds the graph without restarting.
Both formats produce identical OrchidAgentsConfig output. The tables below cover all keys regardless of format.
Infrastructure Config
Runtime and infrastructure keys. In YAML these live in orchid.yml; in Markdown they are the frontmatter of orchid.md. Every key maps to a flat environment variable via ORCHID_CONFIG → YAML_TO_ENV. When the same variable is set in both the config file and the environment, the environment wins.
Use config files for deployment, env vars for secrets
Check your config file into version control for structure and non-secret defaults (model names, URLs, feature flags). Inject secrets (*_API_KEY, database DSNs with passwords) exclusively via environment variables or a secrets manager — never store them in YAML or frontmatter.
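As a rough illustration of that split, an orchid.yml checked into version control might carry only non-secret defaults. This sketch assumes the dotted key names in the table below correspond to nested YAML mappings; the values are the documented defaults, shown for orientation rather than as a recommended setup.

```yaml
# orchid.yml: structure and non-secret defaults only (illustrative sketch)
llm:
  model: ollama/llama3.2            # documented default, LiteLLM format
rag:
  vector_backend: qdrant
  qdrant_url: http://qdrant:6333
  embedding_model: text-embedding-3-small
storage:
  dsn: ~/.orchid/chats.db           # SQLite path; contains no credentials
tracing:
  langsmith_tracing: false
# Deliberately absent: llm.*_api_key, tracing.langsmith_api_key, and any
# DSN that embeds a password. Supply those through the environment or a
# secrets manager at deploy time.
```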
auth.dev_bypass is for local development only
Setting auth.dev_bypass: true removes all authentication and authorization. Any caller can impersonate any user. Never deploy with this enabled.
| Key | Type | Req. | Default | Description | Examples |
|---|---|---|---|---|---|
| agents.config_path | string | no | agents.yaml | Path to the agents.yaml configuration file. | |
| auth.auth_config_provider_class | string | no | — | Dotted import path to an OrchidAuthConfigProvider subclass. | |
| auth.auth_exchange_client_class | string | no | — | Dotted import path to an OrchidAuthExchangeClient subclass. | |
| auth.dev_bypass | boolean | no | false | Bypass authentication. For development only; never set in production. ✓ Best practice: Only use in local development. Setting this in a deployed environment removes all authentication and authorization, so any caller can impersonate any user. | |
| auth.domain | string | no | — | Default domain used for identity resolution. | |
| auth.identity_resolver_class | string | no | — | Dotted import path to an OrchidIdentityResolver subclass. | |
| auth.oauth_client_id_env | string | no | — | Name of the env var holding the public OAuth client_id. | |
| auth.oauth_scope | string | no | — | Advertised OAuth scope for downstream clients. | |
| checkpointer.dsn | string | no | — | Connection string or file path for the LangGraph checkpointer. | |
| checkpointer.type | string | no | — | LangGraph state persistence backend ('memory', 'sqlite', 'postgres', or class path). | |
| cli_rag.embedding_model | string | no | — | Maps to environment variable EMBEDDING_MODEL. | |
| cli_rag.gemini_api_key | string | no | — | Maps to environment variable GEMINI_API_KEY. | |
| cli_rag.openai_api_key | string | no | — | Maps to environment variable OPENAI_API_KEY. | |
| cli_rag.qdrant_url | string | no | — | Maps to environment variable QDRANT_URL. | |
| cli_rag.vector_backend | string | no | — | Maps to environment variable VECTOR_BACKEND. | |
| llm.anthropic_api_key | string | no | — | Anthropic API key for Claude models. | |
| llm.gemini_api_key | string | no | — | Google AI (Gemini) API key. | |
| llm.groq_api_key | string | no | — | Groq API key for Groq-hosted models. | |
| llm.model | string | no | ollama/llama3.2 | Default LLM model in LiteLLM format (e.g. 'ollama/llama3.2', 'gemini/gemini-2.5-flash'). | |
| llm.ollama_api_base | string | no | — | Ollama server base URL for local model serving. | |
| llm.openai_api_key | string | no | — | OpenAI API key. | |
| mcp_auth.client_registration_store_class | string | no | orchid_ai.persistence.mcp_client_registration_sqlite.OrchidSQLiteMCPClientRegistrationStore | Dotted import path to an OrchidMCPClientRegistrationStore subclass. | |
| mcp_auth.client_registration_store_dsn | string | no | ~/.orchid/chats.db | Database DSN for per-server MCP OAuth endpoints and DCR credentials. | |
| mcp_auth.token_store_class | string | no | orchid_ai.persistence.mcp_token_sqlite.OrchidSQLiteMCPTokenStore | Dotted import path to an OrchidMCPTokenStore subclass. | |
| mcp_auth.token_store_dsn | string | no | ~/.orchid/chats.db | Database DSN for per-user outbound MCP OAuth tokens. | |
| rag.embedding_model | string | no | text-embedding-3-small | Embedding model in LiteLLM format (e.g. 'ollama/nomic-embed-text'). | |
| rag.gemini_api_key | string | no | — | Google AI (Gemini) API key used by the embedding model. | |
| rag.openai_api_key | string | no | — | OpenAI API key used by the embedding model. | |
| rag.qdrant_url | string | no | http://qdrant:6333 | Qdrant server URL. | |
| rag.vector_backend | string | no | qdrant | Vector database backend — 'qdrant' is the only supported backend. | |
| startup.hook | string | no | — | Dotted import path to a startup hook function called after graph init. | |
| storage.class | string | no | orchid_ai.persistence.sqlite.OrchidSQLiteChatStorage | Dotted import path to an OrchidChatStorage subclass. ✓ Best practice: The built-in `OrchidSQLiteChatStorage` is sufficient for single-process deployments (demos, CLI tools). Switch to `OrchidPostgresChatStorage` (or a custom subclass) for multi-replica API deployments where all instances must share the same chat history. | |
| storage.dsn | string | no | ~/.orchid/chats.db | Database connection string for chat persistence (SQLite path or Postgres URL). | |
| storage.extra_migrations_package | string | no | — | Dotted package path for consumer-supplied DB migrations. | |
| tracing.langsmith_api_key | string | no | — | LangSmith API key. | |
| tracing.langsmith_project | string | no | agents | LangSmith project name. | |
| tracing.langsmith_tracing | boolean | no | false | Enable LangSmith tracing for debugging and observability. | |
| upload.chunk_overlap | int | no | 200 | Character overlap between consecutive chunks. | |
| upload.chunk_size | int | no | 1000 | Text chunk size in characters for document ingestion. | |
| upload.max_size_mb | int | no | 20 | Maximum upload size in megabytes. | |
| upload.namespace | string | no | uploads | Qdrant namespace for uploaded documents. | |
| upload.vision_model | string | no | — | Vision model for PDF/image OCR (e.g. 'ollama/minicpm-v'). | |
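To make the upload keys concrete, here is a minimal sketch of an `upload` section using the documented defaults. It assumes dotted keys nest as YAML mappings. The comment about stride assumes a plain sliding-window chunker, which the table does not confirm.

```yaml
# orchid.yml fragment (illustrative sketch)
upload:
  max_size_mb: 20
  chunk_size: 1000        # characters per chunk
  chunk_overlap: 200      # characters shared by consecutive chunks
  namespace: uploads
  vision_model: ollama/minicpm-v   # example value from the table; unset by default
# Under a simple sliding-window assumption, each chunk would start
# chunk_size - chunk_overlap = 800 characters after the previous one.
```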
Agent Behavior Config
Agent behavior and orchestration configuration. In YAML these live in agents.yaml; in Markdown they are individual agents/<name>.md files with YAML frontmatter + Markdown body. Validated against OrchidAgentsConfig at startup — unknown keys raise an error, so typos surface immediately. Defaults cascade: values set under defaults: apply to every agent unless overridden per-agent.
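The cascade described above could look roughly like this in agents.yaml. This is a sketch under the assumption that dotted keys nest as YAML mappings; the agent names, descriptions, and prompts are invented for illustration.

```yaml
# agents.yaml (illustrative sketch; agent names and prompts are made up)
version: "1"
defaults:
  llm:
    model: gemini/gemini-2.5-flash
    temperature: 0.2
  rag:
    k: 5
agents:
  catalog:
    description: Answers questions about the product catalog.
    prompt: You help users find products in the catalog.
    rag:
      namespace: catalog
      k: 10              # overrides defaults.rag.k for this agent only
  support:
    description: Handles general support questions.
    prompt: You are a concise support assistant.
    # No llm or rag overrides here, so the defaults above apply.
```

Because unknown keys raise an error at startup, a misspelling such as `temprature` under `defaults.llm` would be rejected rather than silently ignored.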
supervisor.history_summary_enabled
Enable sliding-window summarization for long-running chat sessions with token-priced LLMs. When enabled, the supervisor compresses older turns via a cheap LLM call and keeps only the most recent history_summary_recent_turns exchanges verbatim. Disable for short-form workflows or real-time pipelines where the summarization call adds latency with no benefit.
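A minimal sketch of the corresponding supervisor settings, assuming nested YAML mappings. The summary model shown is a hypothetical choice of a cheap model, not a documented default.

```yaml
# agents.yaml fragment (illustrative sketch)
supervisor:
  history_summary_enabled: true
  history_summary_recent_turns: 10           # newest exchanges kept verbatim
  history_summary_model: ollama/llama3.2     # hypothetical choice; unset means the supervisor model
```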
agents[].mcp_servers[].auth.mode
Default to none for local MCP servers that share the process or a private network — no credentials needed. Use passthrough when the MCP server trusts your bearer token (same identity provider). Use oauth when each user must independently authorize the MCP server; Orchid discovers everything from the server's 401 response via RFC 9728 and RFC 7591 — no client_id or client_secret lives in config.
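The three modes might be declared side by side as follows. This is only a sketch: the agent, server names, URLs, and the `CRM_MCP_URL` variable are hypothetical, and nested mappings for dotted keys are assumed.

```yaml
# agents.yaml fragment (illustrative sketch; names and URLs are made up)
agents:
  research:
    description: Gathers information from internal and external systems.
    prompt: You collect and summarize relevant records.
    mcp_servers:
      - name: local_files
        url: http://localhost:8001/mcp
        auth:
          mode: none               # local server, no credentials
      - name: internal_crm
        type: remote
        url: "${CRM_MCP_URL}"      # env var interpolation
        auth:
          mode: passthrough        # server trusts the caller's bearer token
      - name: third_party_calendar
        type: remote
        url: https://calendar.example.com/mcp
        auth:
          mode: oauth              # per-user authorization; no client_id or client_secret in config
```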
defaults.llm.fallback_model
Always configure a fallback model in production environments. Pairing a cloud-hosted primary (e.g. gemini/gemini-2.5-flash) with a local Ollama model as a fallback ensures the service degrades gracefully during provider outages or rate-limit events rather than returning errors to users.
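A sketch of that pairing, assuming nested YAML mappings. The retry count is an arbitrary illustrative value, and how retries interact with the fallback is not specified on this page.

```yaml
# agents.yaml fragment (illustrative sketch)
defaults:
  llm:
    model: gemini/gemini-2.5-flash      # cloud-hosted primary
    fallback_model: ollama/llama3.2     # local model tried when the primary fails
    retry_attempts: 2                   # illustrative; the default is 0 (disabled)
```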
| Key | Type | Req. | Default | Description | Examples |
|---|---|---|---|---|---|
| version | string | no | 1 | Schema version — only '1' is currently supported. | |
| defaults.llm.model | string | no | gemini/gemini-2.5-flash | Default LLM model for all agents (LiteLLM format). | |
| defaults.llm.temperature | float | no | 0.2 | Sampling temperature (0.0–1.0). Lower = more deterministic. | |
| defaults.llm.fallback_model | string | no | — | Model tried automatically when the primary model fails. ✓ Best practice: Always set a fallback model in production to survive provider outages. Pair a cloud-hosted primary (e.g. `gemini/gemini-2.5-flash`) with a local Ollama model as fallback so the service degrades gracefully rather than returning errors. | |
| defaults.llm.retry_attempts | int | no | 0 | Retry count on transient LLM errors (0 = disabled). | |
| defaults.rag.k | int | no | 5 | Number of chunks retrieved per RAG query. ✓ Best practice: Raise `k` for knowledge-base agents that must surface multiple relevant documents per query (e.g. catalog search, document Q&A). Lower it for precision-focused agents to keep prompts concise and reduce hallucination from noisy context. | |
| defaults.rag.enabled | boolean | no | true | Enable RAG context retrieval for all agents. | |
| defaults.rag.rag_ttl | int | no | 0 | Cache TTL for RAG results in seconds (0 = no cache). | |
| defaults.rag.max_context_chars | int | no | 3000 | Maximum characters of RAG context injected into the system prompt. | |
| defaults.rag.ingestion.strategy | string | no | — | Default chunking strategy: 'recursive', 'semantic', 'hierarchical', 'headered'. | |
| defaults.rag.ingestion.chunk_size | int | no | 1000 | Default text chunk size in characters. | |
| defaults.rag.ingestion.chunk_overlap | int | no | 200 | Default character overlap between chunks. | |
| defaults.rag.ingestion.parent_chunk_size | int | no | 0 | Parent chunk size for hierarchical chunking (0 = disabled). | |
| defaults.rag.ingestion.parent_chunk_overlap | int | no | 200 | Overlap for parent chunks in hierarchical chunking. | |
| defaults.rag.ingestion.post_processors | list[string] | no | [] | Post-processors applied after chunking (e.g. 'contextual_headers', 'entity_extraction'). | |
| defaults.rag.retrieval.strategy | string | no | — | Default retrieval strategy: 'simple', 'multi_query', 'hyde', 'hybrid', 'graph_rag'. | |
| defaults.rag.retrieval.query_transformers | list[string] | no | — | Ordered list of query transformers (e.g. 'multi_query', 'hyde', 'reformulate'). | |
| defaults.rag.retrieval.metadata_filters | dict[string, any] | no | {} | Metadata filter expressions applied to all retrievals. | |
| defaults.rag.retrieval.exclude_dynamic | boolean | no | false | Exclude dynamically-injected tool output from retrieval results. | |
| defaults.rag.retrieval.hyde.n_hypothetical | int | no | 1 | Number of hypothetical answers generated per HyDE query. | |
| defaults.rag.retrieval.hybrid.sparse_encoder | string | no | bm25 | Sparse encoder for hybrid retrieval: 'bm25' or 'splade'. | |
| defaults.rag.retrieval.hybrid.sparse_weight | float | no | 0.4 | Weight of the sparse score in linear fusion (0.0–1.0). | |
| defaults.rag.retrieval.hybrid.fusion | "rrf" \| "linear" | no | rrf | Fusion method for hybrid retrieval: 'rrf' (Reciprocal Rank Fusion) or 'linear'. | |
| defaults.rag.retrieval.hybrid.rrf_k | int | no | 60 | RRF constant k (default 60, per Cormack et al.). | |
| defaults.rag.retrieval.graph.enabled | boolean | no | false | Enable graph entity extraction and traversal for retrieval. | |
| defaults.rag.retrieval.graph.max_hops | int | no | 2 | Maximum BFS depth from seed entities during graph traversal. | |
| defaults.rag.retrieval.graph.fuse_with_vectors | boolean | no | true | Merge graph context with vector hits in the response. | |
| defaults.rag.retrieval.graph.relation_filter | list[string] | no | [] | Restrict graph traversal to these edge labels (empty = all). | |
| defaults.rag.retrieval.transformer_prompts.multi_query | string | no | — | Override prompt for the multi-query query transformer. | |
| defaults.rag.retrieval.transformer_prompts.hyde.single | string | no | — | HyDE prompt for a single hypothetical answer (n_hypothetical=1). | |
| defaults.rag.retrieval.transformer_prompts.hyde.multi | string | no | — | HyDE prompt for multiple hypothetical answers (uses {n} placeholder). | |
| defaults.rag.retrieval.transformer_prompts.decompose | string | no | — | Override prompt for the decompose query transformer. | |
| defaults.rag.retrieval.transformer_prompts.reformulate | string | no | — | Override prompt for the reformulate query transformer. | |
| defaults.cache_enabled | boolean | no | false | Activate a global in-memory LLM response cache (LangChain InMemoryCache). Identical prompts return cached results. | |
| tools | dict[string, object] | no | {} | Global built-in tool declarations, keyed by tool name. | |
| tools[].class | string | no | — | — | |
| tools[].handler | string | no | — | Dotted import path to the Python function implementing this tool. | |
| tools[].description | string | no | "" | Tool description shown to the LLM for invocation decisions. | |
| tools[].parameters | dict[string, object] | no | {} | Parameter declarations for this tool (auto-extracted from Python signature when omitted). | |
| tools[].parameters[].type | string | no | string | Parameter type: 'string', 'int', 'float', or 'bool'. | |
| tools[].parameters[].description | string | no | "" | Human-readable parameter description. | |
| tools[].parameters[].required | boolean | no | true | Whether this parameter must be provided. | |
| tools[].parameters[].default | any | no | — | Default value when the parameter is not provided. | |
| tools[].inject_to_rag | boolean | no | false | Store this tool's results in the RAG context store. | |
| tools[].rag_ttl | int | no | — | Cache TTL for RAG-stored tool results (None = agent default). | |
| tools[].requires_approval | boolean | no | false | Pause and prompt for human approval before executing (HITL). | |
| tools[].parallel_safe | boolean | no | — | Declare the tool safe for parallel dispatch. | |
| tools[].rag.namespace | string | no | "" | Qdrant namespace for this tool's RAG data. | |
| tools[].rag.k | int | no | 5 | Number of chunks retrieved for this tool. | |
| tools[].rag.enabled | boolean | no | true | Enable RAG for this tool. | |
| tools[].rag.rag_ttl | int | no | 0 | RAG cache TTL for this tool in seconds. | |
| tools[].rag.max_context_chars | int | no | — | Max RAG context characters for this tool. | |
| tools[].rag.ingestion.strategy | string | no | — | Chunking strategy for this tool. | |
| tools[].rag.ingestion.chunk_size | int | no | 1000 | Chunk size for this tool. | |
| tools[].rag.ingestion.chunk_overlap | int | no | 200 | Chunk overlap for this tool. | |
| tools[].rag.ingestion.parent_chunk_size | int | no | 0 | Parent chunk size for hierarchical chunking. | |
| tools[].rag.ingestion.parent_chunk_overlap | int | no | 200 | Parent chunk overlap. | |
| tools[].rag.ingestion.post_processors | list[string] | no | [] | Post-processors applied after chunking. | |
| tools[].rag.retrieval.strategy | string | no | — | Retrieval strategy for this tool. | |
| tools[].rag.retrieval.query_transformers | list[string] | no | — | Query transformer chain for this tool. | |
| tools[].rag.retrieval.metadata_filters | dict[string, any] | no | {} | Metadata filter expressions for this tool. | |
| tools[].rag.retrieval.exclude_dynamic | boolean | no | false | Exclude dynamically-injected output from retrieval. | |
| tools[].rag.retrieval.hyde.n_hypothetical | int | no | 1 | Number of hypothetical answers for HyDE queries. | |
| tools[].rag.retrieval.hybrid.sparse_encoder | string | no | bm25 | Sparse encoder for hybrid retrieval. | |
| tools[].rag.retrieval.hybrid.sparse_weight | float | no | 0.4 | Weight of sparse score in linear fusion. | |
| tools[].rag.retrieval.hybrid.fusion | "rrf" \| "linear" | no | rrf | Fusion method for hybrid retrieval. | |
| tools[].rag.retrieval.hybrid.rrf_k | int | no | 60 | RRF constant k. | |
| tools[].rag.retrieval.graph.enabled | boolean | no | false | Enable graph-based retrieval. | |
| tools[].rag.retrieval.graph.max_hops | int | no | 2 | Maximum BFS depth for graph traversal. | |
| tools[].rag.retrieval.graph.fuse_with_vectors | boolean | no | true | Merge graph context with vector hits. | |
| tools[].rag.retrieval.graph.relation_filter | list[string] | no | [] | Restrict graph traversal to these edge labels. | |
| tools[].rag.retrieval.transformer_prompts.multi_query | string | no | — | Override multi-query prompt for this tool. | |
| tools[].rag.retrieval.transformer_prompts.hyde.single | string | no | — | HyDE single-answer prompt for this tool. | |
| tools[].rag.retrieval.transformer_prompts.hyde.multi | string | no | — | HyDE multi-answer prompt for this tool. | |
| tools[].rag.retrieval.transformer_prompts.decompose | string | no | — | Override decompose prompt for this tool. | |
| tools[].rag.retrieval.transformer_prompts.reformulate | string | no | — | Override reformulate prompt for this tool. | |
| tools[].rag.payload_indexes | dict[string, string] | no | {} | Explicit Qdrant payload index declarations for this tool. | |
| skills | dict[string, object] | no | {} | Orchestrator-level cross-agent skill definitions. | |
| skills[].description | string | no | "" | Human-readable skill purpose. | |
| skills[].steps | list[object] | yes | — | Ordered list of agent invocations forming this cross-agent skill. | |
| skills[].steps[].agent | string | yes | — | Agent name to invoke in this step. | |
| skills[].steps[].instruction | string | no | "" | Hint passed to the invoked agent. | |
| supervisor.assistant_name | string | no | AI assistant | Display name for the orchestrating supervisor. | |
| supervisor.fallback_model | string | no | — | Fallback LLM for the supervisor (overrides defaults.llm.fallback_model). | |
| supervisor.streaming_enabled | boolean | no | true | Enable server-sent events (SSE) streaming for responses. | |
| supervisor.routing_system_prompt | string | no | — | Custom system prompt for the supervisor's routing phase. | |
| supervisor.synthesis_system_prompt | string | no | — | Custom system prompt for the synthesis phase. | |
| supervisor.sequential_advance_prompt | string | no | — | Custom handoff prompt for sequential multi-agent flows. | |
| supervisor.history_max_turns | int | no | 20 | Maximum conversation exchange pairs retained in context. | |
| supervisor.history_max_chars | int | no | 1000 | Maximum characters per message before truncation. | |
| supervisor.routing_model | string | no | — | Cheaper model for routing and advance phases only. | |
| supervisor.history_summary_enabled | boolean | no | true | Enable sliding-window summarization to compress older turns. ✓ Best practice: Enable for long-running chats with token-priced LLMs where context accumulates over many turns. Disable for short-form workflows where keeping the full verbatim history is cheaper than the summarization LLM call. | |
| supervisor.history_summary_model | string | no | — | LLM model used for context summarization (None = supervisor model). | |
| supervisor.history_summary_recent_turns | int | no | 10 | Number of recent turns kept verbatim during summarization. | |
| supervisor.skip_synthesis_when_single_agent | boolean | no | true | Return a single agent's response directly without a synthesis LLM call. ✓ Best practice: Leave enabled (default) to save 5–15 s and one LLM call on every single-agent turn. Disable only if the supervisor must always rewrite or augment the agent's raw output regardless of routing. | |
| supervisor.memory.strategy | "none" \| "running_summary" \| "rag_augmented" | no | none | — | |
| supervisor.memory.summary_recent_turns | int | no | 10 | — | |
| supervisor.memory.summary_model | string | no | — | — | |
| supervisor.memory.summary_prompt | string | no | — | — | |
| supervisor.memory.persist_summary | boolean | no | true | — | |
| supervisor.memory.structured_output | boolean | no | true | — | |
| supervisor.memory.rag_namespace | string | no | __memory__ | — | |
| supervisor.memory.rag_k | int | no | 5 | — | |
| supervisor.memory.rag_similarity_threshold | float | no | 0.5 | — | |
| supervisor.memory.store_turns | boolean | no | true | — | |
| supervisor.memory.truncation_strategy | "hard" \| "middle" \| "llm" \| "semantic" | no | hard | — | |
| supervisor.memory.truncation_max_chars | int | no | 1000 | — | |
| guardrails.input | list[object] | no | [] | Guardrail rules applied to user input before any agent processes it. | |
| guardrails.input[].type | string | yes | — | Guardrail type name (e.g. 'content_safety', 'pii_detection', 'prompt_injection'). | |
| guardrails.input[].fail_action | "block" \| "warn" \| "redact" \| "log" | no | block | Action on guardrail failure: 'block', 'warn', 'redact', or 'log'. | |
| guardrails.input[].config | dict[string, any] | no | {} | Keyword arguments passed to the guardrail constructor. | |
| guardrails.output | list[object] | no | [] | Guardrail rules applied to agent responses before delivery. | |
| guardrails.output[].type | string | yes | — | Guardrail type name. | |
| guardrails.output[].fail_action | "block" \| "warn" \| "redact" \| "log" | no | block | Action on guardrail failure. | |
| guardrails.output[].config | dict[string, any] | no | {} | Keyword arguments passed to the guardrail constructor. | |
| mcp_gateway.tools | dict[string, object] | no | {} | Tool title/description overrides keyed by MCP tool name. | |
| mcp_gateway.tools[].title | string | no | — | Override title for the MCP tool shown to host LLMs. | |
| mcp_gateway.tools[].description | string | no | — | Override description for the MCP tool shown to host LLMs. | |
| mcp_gateway.prompts | list[object] | no | [] | Pre-canned MCP Prompt templates exposed by the gateway. | |
| mcp_gateway.prompts[].name | string | yes | — | Unique prompt handle (must match ^[a-zA-Z_][a-zA-Z0-9_-]*$). | |
| mcp_gateway.prompts[].title | string | no | — | Display title for the prompt. | |
| mcp_gateway.prompts[].description | string | no | — | Description of what this prompt does. | |
| mcp_gateway.prompts[].arguments | list[object] | no | [] | Arguments accepted by this prompt template. | |
| mcp_gateway.prompts[].arguments[].name | string | yes | — | Argument name (must match ^[a-zA-Z_][a-zA-Z0-9_-]*$). | |
| mcp_gateway.prompts[].arguments[].description | string | no | — | Description of this argument. | |
| mcp_gateway.prompts[].arguments[].required | boolean | no | false | Whether this argument must be provided. | |
| mcp_gateway.prompts[].template | string | yes | — | Prompt body with {{arg_name}} placeholders. | |
| agents | dict[string, object] | no | {} | Map of agent name → agent configuration. Every key becomes a routable agent. | |
| agents[].name | string | no | "" | Agent name (set automatically from the YAML dict key). | |
| agents[].description | string | yes | — | Human-readable purpose shown to the supervisor for routing. | |
| agents[].prompt | string | yes | — | System prompt injected into the agent's agentic loop. | |
| agents[].class | string | no | — | Dotted import path to a custom OrchidAgent subclass. | |
| agents[].rag.namespace | string | no | "" | Qdrant collection namespace for this agent. | |
| agents[].rag.k | int | no | 5 | Number of chunks retrieved per RAG query for this agent. | |
| agents[].rag.enabled | boolean | no | true | Enable RAG for this agent. | |
| agents[].rag.rag_ttl | int | no | 0 | RAG cache TTL for this agent in seconds. | |
| agents[].rag.max_context_chars | int | no | — | Maximum RAG context characters for this agent. | |
| agents[].rag.ingestion.strategy | string | no | — | Chunking strategy for this agent. | |
| agents[].rag.ingestion.chunk_size | int | no | 1000 | Chunk size for this agent. | |
| agents[].rag.ingestion.chunk_overlap | int | no | 200 | Chunk overlap for this agent. | |
| agents[].rag.ingestion.parent_chunk_size | int | no | 0 | Parent chunk size for hierarchical chunking (0 = disabled). | |
| agents[].rag.ingestion.parent_chunk_overlap | int | no | 200 | Overlap for parent chunks. | |
| agents[].rag.ingestion.post_processors | list[string] | no | [] | Post-processors applied after chunking. | |
| agents[].rag.retrieval.strategy | string | no | — | Retrieval strategy for this agent. ✓ Best practice: Start with `simple` (cosine similarity). Switch to `multi_query` when single-query retrieval misses paraphrased or ambiguous questions. Use `hyde` for domains where hypothetical answers improve recall (dense technical knowledge). Use `hybrid` to combine sparse and dense rankings when exact keyword matches matter. | |
| agents[].rag.retrieval.query_transformers | list[string] | no | — | Query transformer chain for this agent. | |
| agents[].rag.retrieval.metadata_filters | dict[string, any] | no | {} | Metadata filter expressions for this agent's retrievals. | |
| agents[].rag.retrieval.exclude_dynamic | boolean | no | false | Exclude dynamically-injected tool output from retrieval. | |
| agents[].rag.retrieval.hyde.n_hypothetical | int | no | 1 | Number of hypothetical answers for HyDE queries. | |
| agents[].rag.retrieval.hybrid.sparse_encoder | string | no | bm25 | Sparse encoder: 'bm25' or 'splade'. | |
| agents[].rag.retrieval.hybrid.sparse_weight | float | no | 0.4 | Weight of sparse score in linear fusion. | |
| agents[].rag.retrieval.hybrid.fusion | "rrf" \| "linear" | no | rrf | Fusion strategy: 'rrf' or 'linear'. | |
| agents[].rag.retrieval.hybrid.rrf_k | int | no | 60 | RRF constant k. | |
| agents[].rag.retrieval.graph.enabled | boolean | no | false | Enable graph-based retrieval. | |
| agents[].rag.retrieval.graph.max_hops | int | no | 2 | Maximum BFS depth for graph traversal. | |
| agents[].rag.retrieval.graph.fuse_with_vectors | boolean | no | true | Merge graph context with vector hits. | |
| agents[].rag.retrieval.graph.relation_filter | list[string] | no | [] | Restrict graph traversal to these edge labels. | |
| agents[].rag.retrieval.transformer_prompts.multi_query | string | no | — | Override prompt for the multi-query transformer for this agent. | |
| agents[].rag.retrieval.transformer_prompts.hyde.single | string | no | — | HyDE single-answer prompt for this agent. | |
| agents[].rag.retrieval.transformer_prompts.hyde.multi | string | no | — | HyDE multi-answer prompt for this agent. | |
| agents[].rag.retrieval.transformer_prompts.decompose | string | no | — | Override prompt for the decompose transformer for this agent. | |
| agents[].rag.retrieval.transformer_prompts.reformulate | string | no | — | Override prompt for the reformulate transformer for this agent. | |
| agents[].rag.payload_indexes | dict[string, string] | no | {} | Explicit Qdrant payload index declarations (field → schema type). | |
| agents[].mcp_servers | list[object] | no | [] | MCP servers this agent can call tools on. | |
| agents[].mcp_servers[].name | string | yes | — | Unique identifier for the MCP server. | |
| agents[].mcp_servers[].type | "local" \| "remote" | no | local | Server type: 'local' (same host) or 'remote'. | |
| agents[].mcp_servers[].transport | "streamable_http" \| "sse" | no | streamable_http | Transport protocol: 'streamable_http' or 'sse'. | |
| agents[].mcp_servers[].url | string | yes | — | MCP server URL. Supports ${ENV_VAR} interpolation. | |
| agents[].mcp_servers[].auth.mode | "none" \| "passthrough" \| "oauth" | no | none | Auth mode: 'none', 'passthrough', or 'oauth'. ✓ Best practice: Default to `none` for local MCP servers that need no credentials. Use `passthrough` when the MCP server shares the same identity provider as your API; it forwards the graph bearer token unchanged. Use `oauth` when each user must independently authorize the MCP server; Orchid handles RFC 9728 discovery and RFC 7591 dynamic client registration automatically. | |
| agents[].mcp_servers[].tools | list[object] | no | [] | MCP tools this agent is allowed to call on this server. | |
| agents[].mcp_servers[].tools[].name | string | yes | — | MCP tool name to allow. | |
| agents[].mcp_servers[].tools[].arguments | dict[string, any] | no | {} | Default arguments passed to this tool. | |
| agents[].mcp_servers[].tools[].inject_to_rag | boolean | no | false | Store this tool's results in the RAG context store. | |
| agents[].mcp_servers[].tools[].rag_ttl | int | no | — | Per-tool RAG cache TTL (None = agent default). | |
| agents[].mcp_servers[].tools[].requires_approval | boolean | no | false | Require human approval before this tool executes (HITL). | |
| agents[].mcp_servers[].tools[].parallel_safe | boolean | no | — | Override parallel-safety for this tool. | |
| agents[].mcp_servers[].tools[].rag.namespace | string | no | "" | Qdrant namespace for this tool's RAG data. | |
| agents[].mcp_servers[].tools[].rag.k | int | no | 5 | Number of chunks retrieved for this tool. | |
| agents[].mcp_servers[].tools[].rag.enabled | boolean | no | true | Enable RAG for this tool. | |
| agents[].mcp_servers[].tools[].rag.rag_ttl | int | no | 0 | RAG cache TTL for this tool. | |
| agents[].mcp_servers[].tools[].rag.max_context_chars | int | no | — | Max RAG context characters for this tool. | |
| agents[].mcp_servers[].tools[].rag.ingestion.strategy | string | no | — | Chunking strategy for this tool. | |
| agents[].mcp_servers[].tools[].rag.ingestion.chunk_size | int | no | 1000 | Chunk size for this tool. | |
| agents[].mcp_servers[].tools[].rag.ingestion.chunk_overlap | int | no | 200 | Chunk overlap for this tool. | |
| agents[].mcp_servers[].tools[].rag.ingestion.parent_chunk_size | int | no | 0 | Parent chunk size for this tool. | |
| agents[].mcp_servers[].tools[].rag.ingestion.parent_chunk_overlap | int | no | 200 | Parent chunk overlap for this tool. | |
| agents[].mcp_servers[].tools[].rag.ingestion.post_processors | list[string] | no | [] | Post-processors for this tool. | |
| agents[].mcp_servers[].tools[].rag.retrieval.strategy | string | no | — | Retrieval strategy for this tool. | |
| agents[].mcp_servers[].tools[].rag.retrieval.query_transformers | list[string] | no | — | Query transformers for this tool. | |
| agents[].mcp_servers[].tools[].rag.retrieval.metadata_filters | dict[string, any] | no | {} | Metadata filters for this tool. | |
| agents[].mcp_servers[].tools[].rag.retrieval.exclude_dynamic | boolean | no | false | Exclude dynamic output from retrieval. | |
| agents[].mcp_servers[].tools[].rag.retrieval.hyde.n_hypothetical | int | no | 1 | HyDE hypothetical answer count. | |
| agents[].mcp_servers[].tools[].rag.retrieval.hybrid.sparse_encoder | string | no | bm25 | Sparse encoder for hybrid retrieval. | |
| agents[].mcp_servers[].tools[].rag.retrieval.hybrid.sparse_weight | float | no | 0.4 | Sparse weight for linear fusion. | |
| agents[].mcp_servers[].tools[].rag.retrieval.hybrid.fusion | "rrf" \| "linear" | no | rrf | Fusion method. | |
| agents[].mcp_servers[].tools[].rag.retrieval.hybrid.rrf_k | int | no | 60 | RRF constant k. | |
| agents[].mcp_servers[].tools[].rag.retrieval.graph.enabled | boolean | no | false | Enable graph retrieval. | |
| agents[].mcp_servers[].tools[].rag.retrieval.graph.max_hops | int | no | 2 | Graph traversal depth. | |
| agents[].mcp_servers[].tools[].rag.retrieval.graph.fuse_with_vectors | boolean | no | true | Merge graph with vector hits. | |
| agents[].mcp_servers[].tools[].rag.retrieval.graph.relation_filter | list[string] | no | [] | Edge label filter. | |
| agents[].mcp_servers[].tools[].rag.retrieval.transformer_prompts.multi_query | string | no | — | Override multi-query prompt. | |
| agents[].mcp_servers[].tools[].rag.retrieval.transformer_prompts.decompose | string | no | — | Override decompose prompt. | |
| agents[].mcp_servers[].tools[].rag.retrieval.transformer_prompts.reformulate | string | no | — | Override reformulate prompt. | |
| agents[].mcp_servers[].tools[].rag.payload_indexes | dict[string, string] | no | {} | Qdrant payload index declarations. | |
| agents[].mcp_servers[].prompts | list[string] | no | [] | MCP prompt names to load ('*' = discover all). | |
| agents[].mcp_servers[].resources | list[string] | no | [] | MCP resource URIs to load ('*' = discover all). | |
| agents[].mcp_servers[].tool_call_strategy | string | no | all | How tools are dispatched: 'all', 'sequential', 'llm_decides'. | |
| agents[].mcp_servers[].discover_all_tools | boolean | no | false | Discover all tools from the server at runtime. | |
| agents[].mcp_servers[].discover_all_prompts | boolean | no | false | Discover all prompts from the server at runtime. | |
| agents[].mcp_servers[].discover_all_resources | boolean | no | false | Discover all resources from the server at runtime. | |
| agents[].llm.model | string | no | gemini/gemini-2.5-flash | Per-agent LLM model override. | |
| agents[].llm.temperature | float | no | 0.2 | Per-agent sampling temperature. | |
| agents[].llm.fallback_model | string | no | — | Per-agent fallback model. | |
| agents[].llm.retry_attempts | int | no | 0 | Per-agent retry count on transient errors. | |
| agents[].execution_hints.parallel_safe | boolean | no | true | Mark this agent safe to run in parallel with other agents. | |
| agents[].tools | list[string] | no | [] | Built-in tool names available to this agent (must match keys in top-level tools:). | |
| agents[].skills | dict[string, object] | no | {} | Per-agent skill definitions (multi-step workflows within this agent). | |
| agents[].skills[].description | string | no | "" | Human-readable skill description. | |
| agents[].skills[].steps | list[object] | yes | — | Ordered steps: tool calls or agent invocations. | |
| agents[].skills[].steps[].tool | string | no | — | Tool name (MCP tool or built-in) for this step. | |
| agents[].skills[].steps[].source | string | no | — | MCP server name, 'builtin', or None (= builtin). | |
| agents[].skills[].steps[].arguments | dict[string, any] | no | {} | Static arguments for this tool call. | |
| agents[].skills[].steps[].agent | string | no | — | Agent name to invoke directly (bypasses supervisor). | |
| agents[].skills[].steps[].instruction | string | no | "" | Query/instruction sent to the invoked agent. | |
| agents[].guardrails.input | list[object] | no | [] | Per-agent input guardrail rules. | |
| agents[].guardrails.input[].type | string | yes | — | Guardrail type name. | |
| agents[].guardrails.input[].fail_action | "block" \| "warn" \| "redact" \| "log" | no | block | Action on guardrail failure. | |
| agents[].guardrails.input[].config | dict[string, any] | no | {} | Guardrail constructor kwargs. | |
| agents[].guardrails.output | list[object] | no | [] | Per-agent output guardrail rules. | |
| agents[].guardrails.output[].type | string | yes | — | Guardrail type name. | |
| agents[].guardrails.output[].fail_action | "block" \| "warn" \| "redact" \| "log" | no | block | Action on guardrail failure. | |
| agents[].guardrails.output[].config | dict[string, any] | no | {} | Guardrail constructor kwargs. | |
| agents[].children | dict[string, object] | no | — | Sub-agent configurations nested under this agent. | |
| agents[].children[].name | string | no | "" | Agent name (set automatically from the YAML dict key). | |
| agents[].children[].description | string | yes | — | Human-readable purpose shown to the supervisor for routing. | |
| agents[].children[].prompt | string | yes | — | System prompt injected into the agent's agentic loop. | |
| agents[].children[].class | string | no | — | Dotted import path to a custom OrchidAgent subclass. | |
| agents[].children[].rag.namespace | string | no | "" | Qdrant collection namespace for this agent. | |
| agents[].children[].rag.k | int | no | 5 | Number of chunks retrieved per RAG query for this agent. | |
| agents[].children[].rag.enabled | boolean | no | true | Enable RAG for this agent. | |
| agents[].children[].rag.rag_ttl | int | no | 0 | RAG cache TTL for this agent in seconds. | |
| agents[].children[].rag.max_context_chars | int | no | — | Maximum RAG context characters for this agent. | |
| agents[].children[].rag.ingestion.strategy | string | no | — | Chunking strategy for this agent. | |
| agents[].children[].rag.ingestion.chunk_size | int | no | 1000 | Chunk size for this agent. | |
| agents[].children[].rag.ingestion.chunk_overlap | int | no | 200 | Chunk overlap for this agent. | |
| agents[].children[].rag.ingestion.parent_chunk_size | int | no | 0 | Parent chunk size for hierarchical chunking (0 = disabled). | |
| agents[].children[].rag.ingestion.parent_chunk_overlap | int | no | 200 | Overlap for parent chunks. | |
| agents[].children[].rag.ingestion.post_processors | list[string] | no | [] | Post-processors applied after chunking. | |
| agents[].children[].rag.retrieval.strategy | string | no | — | Retrieval strategy for this agent. | |
| agents[].children[].rag.retrieval.query_transformers | list[string] | no | — | Query transformer chain for this agent. | |
| agents[].children[].rag.retrieval.metadata_filters | dict[string, any] | no | {} | Metadata filter expressions for this agent's retrievals. | |
| agents[].children[].rag.retrieval.exclude_dynamic | boolean | no | false | Exclude dynamically-injected tool output from retrieval. | |
| agents[].children[].rag.retrieval.hyde.n_hypothetical | int | no | 1 | Number of hypothetical answers for HyDE queries. | |
| agents[].children[].rag.retrieval.hybrid.sparse_encoder | string | no | bm25 | Sparse encoder: 'bm25' or 'splade'. | |
| agents[].children[].rag.retrieval.hybrid.sparse_weight | float | no | 0.4 | Weight of sparse score in linear fusion. | |
| agents[].children[].rag.retrieval.hybrid.fusion | "rrf" \| "linear" | no | rrf | Fusion strategy: 'rrf' or 'linear'. | |
| agents[].children[].rag.retrieval.hybrid.rrf_k | int | no | 60 | RRF constant k. | |
| agents[].children[].rag.retrieval.graph.enabled | boolean | no | false | Enable graph-based retrieval. | |
| agents[].children[].rag.retrieval.graph.max_hops | int | no | 2 | Maximum BFS depth for graph traversal. | |
| agents[].children[].rag.retrieval.graph.fuse_with_vectors | boolean | no | true | Merge graph context with vector hits. | |
| agents[].children[].rag.retrieval.graph.relation_filter | list[string] | no | [] | Restrict graph traversal to these edge labels. | |
| agents[].children[].rag.retrieval.transformer_prompts.multi_query | string | no | — | Override prompt for the multi-query transformer for this agent. | |
| agents[].children[].rag.retrieval.transformer_prompts.hyde.single | string | no | — | HyDE single-answer prompt for this agent. | |
| agents[].children[].rag.retrieval.transformer_prompts.hyde.multi | string | no | — | HyDE multi-answer prompt for this agent. | |
| agents[].children[].rag.retrieval.transformer_prompts.decompose | string | no | — | Override prompt for the decompose transformer for this agent. | |
| agents[].children[].rag.retrieval.transformer_prompts.reformulate | string | no | — | Override prompt for the reformulate transformer for this agent. | |
| agents[].children[].rag.payload_indexes | dict[string, string] | no | {} | Explicit Qdrant payload index declarations (field → schema type). | |
| agents[].children[].mcp_servers | list[object] | no | [] | MCP servers this agent can call tools on. | |
| agents[].children[].mcp_servers[].name | string | yes | — | Unique identifier for the MCP server. | |
| agents[].children[].mcp_servers[].type | "local" \| "remote" | no | local | Server type: 'local' (same host) or 'remote'. | |
| agents[].children[].mcp_servers[].transport | "streamable_http" \| "sse" | no | streamable_http | Transport protocol: 'streamable_http' or 'sse'. | |
| agents[].children[].mcp_servers[].url | string | yes | — | MCP server URL. Supports ${ENV_VAR} interpolation. | |
| agents[].children[].mcp_servers[].auth.mode | "none" \| "passthrough" \| "oauth" | no | none | Auth mode: 'none', 'passthrough', or 'oauth'. | |
| agents[].children[].mcp_servers[].tools | list[object] | no | [] | MCP tools this agent is allowed to call on this server. | |
| agents[].children[].mcp_servers[].tools[].name | string | yes | — | MCP tool name to allow. | |
| agents[].children[].mcp_servers[].tools[].arguments | dict[string, any] | no | {} | Default arguments passed to this tool. | |
| agents[].children[].mcp_servers[].tools[].inject_to_rag | boolean | no | false | Store this tool's results in the RAG context store. | |
| agents[].children[].mcp_servers[].tools[].rag_ttl | int | no | — | Per-tool RAG cache TTL (None = agent default). | |
| agents[].children[].mcp_servers[].tools[].requires_approval | boolean | no | false | Require human approval before this tool executes (HITL). | |
| agents[].children[].mcp_servers[].tools[].parallel_safe | boolean | no | — | Override parallel-safety for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.namespace | string | no | "" | Qdrant namespace for this tool's RAG data. | |
| agents[].children[].mcp_servers[].tools[].rag.k | int | no | 5 | Number of chunks retrieved for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.enabled | boolean | no | true | Enable RAG for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.rag_ttl | int | no | 0 | RAG cache TTL for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.max_context_chars | int | no | — | Max RAG context characters for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.ingestion.strategy | string | no | — | Chunking strategy for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.ingestion.chunk_size | int | no | 1000 | Chunk size for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.ingestion.chunk_overlap | int | no | 200 | Chunk overlap for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.ingestion.parent_chunk_size | int | no | 0 | Parent chunk size for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.ingestion.parent_chunk_overlap | int | no | 200 | Parent chunk overlap for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.ingestion.post_processors | list[string] | no | [] | Post-processors for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.retrieval.strategy | string | no | — | Retrieval strategy for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.retrieval.query_transformers | list[string] | no | — | Query transformers for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.retrieval.metadata_filters | dict[string, any] | no | {} | Metadata filters for this tool. | |
| agents[].children[].mcp_servers[].tools[].rag.retrieval.exclude_dynamic | boolean | no | false | Exclude dynamic output from retrieval. | |
| agents[].children[].mcp_servers[].tools[].rag.payload_indexes | dict[string, string] | no | {} | Qdrant payload index declarations. | |
| agents[].children[].mcp_servers[].prompts | list[string] | no | [] | MCP prompt names to load ('*' = discover all). | |
| agents[].children[].mcp_servers[].resources | list[string] | no | [] | MCP resource URIs to load ('*' = discover all). | |
| agents[].children[].mcp_servers[].tool_call_strategy | string | no | all | How tools are dispatched: 'all', 'sequential', 'llm_decides'. | |
| agents[].children[].mcp_servers[].discover_all_tools | boolean | no | false | Discover all tools from the server at runtime. | |
| agents[].children[].mcp_servers[].discover_all_prompts | boolean | no | false | Discover all prompts from the server at runtime. | |
| agents[].children[].mcp_servers[].discover_all_resources | boolean | no | false | Discover all resources from the server at runtime. | |
| agents[].children[].llm.model | string | no | gemini/gemini-2.5-flash | Per-agent LLM model override. | |
| agents[].children[].llm.temperature | float | no | 0.2 | Per-agent sampling temperature. | |
| agents[].children[].llm.fallback_model | string | no | — | Per-agent fallback model. | |
| agents[].children[].llm.retry_attempts | int | no | 0 | Per-agent retry count on transient errors. | |
| agents[].children[].execution_hints.parallel_safe | boolean | no | true | Mark this agent safe to run in parallel with other agents. | |
| agents[].children[].tools | list[string] | no | [] | Built-in tool names available to this agent (must match keys in top-level tools:). | |
| agents[].children[].skills | dict[string, object] | no | {} | Per-agent skill definitions (multi-step workflows within this agent). | |
| agents[].children[].skills[].description | string | no | "" | Human-readable skill description. | |
| agents[].children[].skills[].steps | list[object] | yes | — | Ordered steps: tool calls or agent invocations. | |
| agents[].children[].skills[].steps[].tool | string | no | — | Tool name (MCP tool or built-in) for this step. | |
| agents[].children[].skills[].steps[].source | string | no | — | MCP server name, 'builtin', or None (= builtin). | |
| agents[].children[].skills[].steps[].arguments | dict[string, any] | no | {} | Static arguments for this tool call. | |
| agents[].children[].skills[].steps[].agent | string | no | — | Agent name to invoke directly (bypasses supervisor). | |
| agents[].children[].skills[].steps[].instruction | string | no | "" | Query/instruction sent to the invoked agent. | |
| agents[].children[].guardrails.input | list[object] | no | [] | Per-agent input guardrail rules. | |
| agents[].children[].guardrails.input[].type | string | yes | — | Guardrail type name. | |
| agents[].children[].guardrails.input[].fail_action | "block" \| "warn" \| "redact" \| "log" | no | block | Action on guardrail failure. | |
| agents[].children[].guardrails.input[].config | dict[string, any] | no | {} | Guardrail constructor kwargs. | |
| agents[].children[].guardrails.output | list[object] | no | [] | Per-agent output guardrail rules. | |
| agents[].children[].guardrails.output[].type | string | yes | — | Guardrail type name. | |
| agents[].children[].guardrails.output[].fail_action | "block" \| "warn" \| "redact" \| "log" | no | block | Action on guardrail failure. | |
| agents[].children[].guardrails.output[].config | dict[string, any] | no | {} | Guardrail constructor kwargs. | |
| agents[].children[].parallel_tools | boolean | no | false | Dispatch independent read-only tool calls in parallel within one turn. | |
| agents[].children[].max_tool_rounds | int | no | 15 | — | |
| agents[].children[].max_consecutive_dupes | int | no | 2 | — | |
| agents[].children[].max_skill_depth | int | no | 3 | — | |
| agents[].children[].mini_agent.enabled | boolean | no | false | Enable the mini-agent (self-clone) decomposition pattern. | |
| agents[].children[].mini_agent.max_count | int | no | 3 | Maximum number of parallel mini-agents (2–8). | |
| agents[].children[].mini_agent.decomposer_model | string | no | — | LLM model for the mini-agent decomposer (None = agent model). | |
| agents[].children[].mini_agent.timeout_seconds | int | no | 60 | Per-mini-agent timeout in seconds. | |
| agents[].children[].mini_agent.tool_allowlist_mode | "strict" \| "parent_full" \| "inferred" | no | strict | Tool exposure mode: 'strict', 'parent_full', or 'inferred'. | |
| agents[].children[].mini_agent.stream_inner_tokens | boolean | no | false | Stream individual mini-agent tokens to the SSE endpoint. | |
| agents[].children[].mini_agent.decomposer_prompt | string | no | — | Custom prompt for the decomposer step. | |
| agents[].children[].mini_agent.aggregator_prompt | string | no | — | Custom prompt for the aggregator step. | |
| agents[].children[].mini_agent.system_prompt_template | string | no | — | Template for each mini's system prompt ({parent_prompt}, {instruction}, {tool_list}). | |
| agents[].children[].prompt_sections.prior_results_header | string | no | --- Previous Tool Results (from prior turns) --- | Header shown before the prior tool-results JSON block. | |
| agents[].children[].prompt_sections.mcp_prompt_template | string | no | --- MCP Prompt: {name} --- {text} | Template for rendered MCP prompts. | |
| agents[].children[].prompt_sections.skipped_prompt_template | string | no | [Available prompt: {name}] {description} (requires: {required_args}) | Template for MCP prompts that require arguments (shown as available, not rendered). | |
| agents[].children[].prompt_sections.resources_header | string | no | --- Available Resources --- | Header shown before the MCP resources block. | |
| agents[].children[].prompt_sections.resource_template | string | no | [{name}] {content} | Template for each MCP resource body. | |
| agents[].children[].prompt_sections.rag_header | string | no | --- Background Knowledge (RAG) --- | Header shown before the RAG context block. | |
| agents[].children[].prompt_sections.prior_results_max_chars | int | no | 4000 | Character cap on the prior tool-results JSON block. | |
| agents[].children[].prompt_sections.resource_max_chars | int | no | 2000 | Character cap per MCP resource body. | |
| agents[].children[].prompt_sections.summarise_history_reminder | string | no | IMPORTANT: The conversation history below shows prior exchanges. Always focus on the user's LATEST message and its relationship to the most recent topic. Do NOT change topic or introduce unrelated content unless the user explicitly asks for something new. | Reminder block appended to the summarise system prompt when conversation history is present. | |
| agents[].children[].prompt_sections.summarise_prior_results_header | string | no | --- Previous Tool Results (from prior turns) --- | Header for prior-turn tool results in the summarise system prompt. | |
| agents[].children[].prompt_sections.summarise_rag_section_header | string | no | Background knowledge (from RAG): | Header for the RAG block in the summarise user message. | |
| agents[].children[].prompt_sections.summarise_user_template | string | no | User query: {query} {rag_section}Live data (from API): {mcp_data} | User-content template for the summarise call ({query}, {rag_section}, {mcp_data}). | |
| agents[].children[].prompt_sections.summarise_prior_results_max_chars | int | no | 4000 | Max characters of prior-tool-results JSON in the summarise prompt. | |
| agents[].children[].prompt_sections.summary_compression_system_prompt | string | no | You are a conversation summarizer that produces structured summaries. Output ONLY valid JSON with this schema: { "topics": ["topic1", "topic2"], "entities": [ {"name": "entity_name", "type": "person\|product\|concept\|other", "details": "key information"} ], "actions_taken": ["action1", "action2"], "decisions": ["decision1"], "open_questions": ["question1"], "user_preferences": ["preference1"], "narrative": "A brief prose summary of the conversation flow (2-3 sentences)", "covered_turns": 5 } Be factual and concise. Extract all entities, topics, and decisions mentioned. | — | |
| agents[].children[].prompt_sections.summary_compression_user_prompt | string | no | Summarise the following conversation excerpt in structured JSON format. Focus on: (1) the key topics discussed, (2) any entities or names mentioned, (3) actions taken or decisions made, (4) any outstanding questions. Be factual and concise. {transcript} | — | |
| agents[].children[].prompt_sections.summary_extension_system_prompt | string | no | You are a conversation summarizer that produces structured summaries. You have an existing summary and new messages to incorporate. Update the summary to reflect new information, remove contradicted facts, and merge duplicate entities. Output ONLY valid JSON with this schema: { "topics": ["topic1", "topic2"], "entities": [ {"name": "entity_name", "type": "person\|product\|concept\|other", "details": "key information"} ], "actions_taken": ["action1", "action2"], "decisions": ["decision1"], "open_questions": ["question1"], "user_preferences": ["preference1"], "narrative": "A brief prose summary of the conversation flow (2-3 sentences)", "covered_turns": 5 } | — | |
| agents[].children[].prompt_sections.summary_extension_user_prompt | string | no | Given the existing summary below and the new conversation messages, produce an updated summary that incorporates all new information. Existing summary: {existing_summary} New messages: {new_messages} | — | |
| agents[].children[].prompt_sections.summary_narrative_fallback_prompt | string | no | Summarise the following conversation excerpt in 2-4 sentences. Focus on: (1) the key topics discussed, (2) any entities or names mentioned, (3) actions taken or decisions made, (4) any outstanding questions. Be factual and concise. {transcript} | — | |
| agents[].parallel_tools | boolean | no | false | Dispatch independent read-only tool calls in parallel within one turn. ✓ Best practice: Enable when an agent consistently makes multiple independent read-only tool calls per turn (e.g. fetching several data sources in one round). Keep disabled for write operations or any tool chain where order guarantees matter, because parallel dispatch removes sequencing. | |
| agents[].max_tool_rounds | int | no | 15 | — | |
| agents[].max_consecutive_dupes | int | no | 2 | — | |
| agents[].max_skill_depth | int | no | 3 | — | |
| agents[].mini_agent.enabled | boolean | no | false | Enable the mini-agent (self-clone) decomposition pattern. ✓ Best practice: Enable when a single complex user request can be decomposed into independent sub-tasks that do not share state. The decomposer adds one extra LLM call per turn; only opt in when the parallelism speedup outweighs that cost. Nesting is not supported: only top-level agents can enable mini-agents. | |
| agents[].mini_agent.max_count | int | no | 3 | Maximum number of parallel mini-agents (2–8). | |
| agents[].mini_agent.decomposer_model | string | no | — | LLM model for the mini-agent decomposer (None = agent model). | |
| agents[].mini_agent.timeout_seconds | int | no | 60 | Per-mini-agent timeout in seconds. | |
| agents[].mini_agent.tool_allowlist_mode | "strict" \| "parent_full" \| "inferred" | no | strict | Tool exposure mode: 'strict', 'parent_full', or 'inferred'. | |
| agents[].mini_agent.stream_inner_tokens | boolean | no | false | Stream individual mini-agent tokens to the SSE endpoint. | |
| agents[].mini_agent.decomposer_prompt | string | no | — | Custom prompt for the decomposer step. | |
| agents[].mini_agent.aggregator_prompt | string | no | — | Custom prompt for the aggregator step. | |
| agents[].mini_agent.system_prompt_template | string | no | — | Template for each mini's system prompt ({parent_prompt}, {instruction}, {tool_list}). | |
| agents[].prompt_sections.prior_results_header | string | no | --- Previous Tool Results (from prior turns) --- | Header shown before the prior tool-results JSON block. | |
| agents[].prompt_sections.mcp_prompt_template | string | no | --- MCP Prompt: {name} --- {text} | Template for rendered MCP prompts. | |
| agents[].prompt_sections.skipped_prompt_template | string | no | [Available prompt: {name}] {description} (requires: {required_args}) | Template for MCP prompts that require arguments (shown as available, not rendered). | |
| agents[].prompt_sections.resources_header | string | no | --- Available Resources --- | Header shown before the MCP resources block. | |
| agents[].prompt_sections.resource_template | string | no | [{name}] {content} | Template for each MCP resource body. | |
| agents[].prompt_sections.rag_header | string | no | --- Background Knowledge (RAG) --- | Header shown before the RAG context block. | |
| agents[].prompt_sections.prior_results_max_chars | int | no | 4000 | Character cap on the prior tool-results JSON block. | |
| agents[].prompt_sections.resource_max_chars | int | no | 2000 | Character cap per MCP resource body. | |
| agents[].prompt_sections.summarise_history_reminder | string | no | IMPORTANT: The conversation history below shows prior exchanges. Always focus on the user's LATEST message and its relationship to the most recent topic. Do NOT change topic or introduce unrelated content unless the user explicitly asks for something new. | Reminder block appended to the summarise system prompt when conversation history is present. | |
| agents[].prompt_sections.summarise_prior_results_header | string | no | --- Previous Tool Results (from prior turns) --- | Header for prior-turn tool results in the summarise system prompt. | |
| agents[].prompt_sections.summarise_rag_section_header | string | no | Background knowledge (from RAG): | Header for the RAG block in the summarise user message. | |
| agents[].prompt_sections.summarise_user_template | string | no | User query: {query} {rag_section}Live data (from API): {mcp_data} | User-content template for the summarise call ({query}, {rag_section}, {mcp_data}). | |
| agents[].prompt_sections.summarise_prior_results_max_chars | int | no | 4000 | Max characters of prior-tool-results JSON in the summarise prompt. | |
| agents[].prompt_sections.summary_compression_system_prompt | string | no | You are a conversation summarizer that produces structured summaries. Output ONLY valid JSON with this schema: { "topics": ["topic1", "topic2"], "entities": [ {"name": "entity_name", "type": "person\|product\|concept\|other", "details": "key information"} ], "actions_taken": ["action1", "action2"], "decisions": ["decision1"], "open_questions": ["question1"], "user_preferences": ["preference1"], "narrative": "A brief prose summary of the conversation flow (2-3 sentences)", "covered_turns": 5 } Be factual and concise. Extract all entities, topics, and decisions mentioned. | — | |
| agents[].prompt_sections.summary_compression_user_prompt | string | no | Summarise the following conversation excerpt in structured JSON format. Focus on: (1) the key topics discussed, (2) any entities or names mentioned, (3) actions taken or decisions made, (4) any outstanding questions. Be factual and concise. {transcript} | — | |
| agents[].prompt_sections.summary_extension_system_prompt | string | no | You are a conversation summarizer that produces structured summaries. You have an existing summary and new messages to incorporate. Update the summary to reflect new information, remove contradicted facts, and merge duplicate entities. Output ONLY valid JSON with this schema: { "topics": ["topic1", "topic2"], "entities": [ {"name": "entity_name", "type": "person\|product\|concept\|other", "details": "key information"} ], "actions_taken": ["action1", "action2"], "decisions": ["decision1"], "open_questions": ["question1"], "user_preferences": ["preference1"], "narrative": "A brief prose summary of the conversation flow (2-3 sentences)", "covered_turns": 5 } | — | |
| agents[].prompt_sections.summary_extension_user_prompt | string | no | Given the existing summary below and the new conversation messages, produce an updated summary that incorporates all new information. Existing summary: {existing_summary} New messages: {new_messages} | — | |
| agents[].prompt_sections.summary_narrative_fallback_prompt | string | no | Summarise the following conversation excerpt in 2-4 sentences. Focus on: (1) the key topics discussed, (2) any entities or names mentioned, (3) actions taken or decisions made, (4) any outstanding questions. Be factual and concise. {transcript} | — | |
| allowed_passthrough_hosts | list[string] | no | [] | — | |
| events.enabled | boolean | no | false | Enable the Pollen + Bloom event-driven activation layer. | |
| events.store.class | string | yes | — | Dotted import path for the event store backend. | |
| events.store.extra_args | dict | no | {} | Additional keyword arguments for the store constructor. | |
| events.queue.class | string | yes | — | Dotted import path for the queue backend. | |
| events.queue.notify_enabled | boolean | no | true | Enable queue notifications. | |
| events.queue.poll_interval_ms | int | no | 200 | Queue poll interval in milliseconds (min 10). | |
| events.queue.lease_seconds | int | no | 30 | Message lease duration in seconds (min 1). | |
| events.queue.max_attempts | int | no | 5 | Maximum processing attempts per message (min 1). | |
| events.queue.dead_letter_table | string | no | signal_queue_dead_letter | Database table name for dead-letter messages. | |
| events.scheduler.class | string | yes | — | Dotted import path for the scheduler backend. | |
| events.scheduler.extra_args | dict | no | {} | Additional keyword arguments for the scheduler constructor. | |
| events.producers | list[object] | no | [] | List of signal producer configurations. | |
| events.producers[].class | string | yes | — | Dotted import path for the producer. | |
| events.producers[].extra_args | dict | no | {} | Additional keyword arguments for the producer constructor. | |
| events.processors | list[object] | no | [] | List of signal processor / worker-pool configurations. | |
| events.processors[].class | string | yes | — | Dotted import path for the processor. | |
| events.processors[].concurrency | int | no | 4 | Number of concurrent worker tasks (min 1). | |
| events.processors[].poll_interval_ms | int | no | 200 | Processor poll interval in milliseconds (min 10). | |
| events.processors[].lease_seconds | int | no | 30 | Message lease duration in seconds (min 1). | |
| events.processors[].max_attempts | int | no | 5 | Maximum processing attempts (min 1). | |
| events.processors[].drain_timeout_seconds | float | no | 10 | Seconds to wait for in-flight messages during shutdown. | |
| events.middleware | list[object] | no | [] | Processing middleware applied to signals. | |
| events.middleware[].class | string | yes | — | Dotted import path for the middleware. | |
| events.middleware[].extra_args | dict | no | {} | Additional keyword arguments for the middleware constructor. | |
| events.ingestion.sources | list[object] | no | [] | Registered webhook sources. | |
| events.ingestion.sources[].id | string | yes | — | Unique source identifier. | |
| events.ingestion.sources[].validator.class | string | yes | — | Dotted import path for the validator class (HMAC, bearer, mTLS, etc.). | |
| events.ingestion.sources[].validator.secret_ref | string | no | — | Secret reference for the validator (e.g. HMAC key). | |
| events.ingestion.sources[].validator.extra_args | dict | no | {} | Additional keyword arguments for the validator. | |
| events.ingestion.sources[].allowed_types | list[string] | no | [] | Signal types accepted from this source. | |
| events.schedules | list[object] | no | [] | Cron/interval schedule definitions. | |
| events.schedules[].id | string | yes | — | Unique schedule identifier. | |
| events.schedules[].cron | string | no | — | Cron expression (e.g. '0 7 * * 1-5' for weekday 07:00 UTC). Mutually exclusive with interval_seconds. | |
| events.schedules[].interval_seconds | int | no | — | Interval between runs in seconds. Mutually exclusive with cron. | |
| events.schedules[].trigger_id | string | yes | — | ID of the trigger this schedule fires (must reference a trigger with signal: cron). | |
| events.schedules[].identity | object \| object \| object | yes | — | Identity claim for the scheduled run. | |
| events.schedules[].enabled | boolean | no | true | Whether this schedule is active. | |
| events.triggers | list[object] | no | [] | Trigger definitions that map signals to agent activations. | |
| events.triggers[].id | string | yes | — | Unique trigger identifier. | |
| events.triggers[].on.signal | string | yes | — | Signal name to match ('cron' reserved for time-driven triggers). | |
| events.triggers[].on.cron | string | no | — | Cron expression (required when signal='cron', rejected otherwise). | |
| events.triggers[].on.when | string | no | — | JMESPath boolean expression for conditional matching. | |
| events.triggers[].emits.agent | string | yes | — | Agent to activate when this trigger fires. | |
| events.triggers[].emits.prompt_template | string | yes | — | Prompt template sent to the agent at activation. | |
| events.triggers[].emits.identity | object \| object \| object | yes | — | Identity claim for the triggered run. | |
| events.triggers[].emits.respect_chat_binding | boolean | no | false | Respect chat_binding from the signal envelope (requires non-service-account identity). | |
| events.triggers[].emits.proactive_chat | boolean | no | false | Create a new chat for the resolved user (requires non-service-account identity). | |
| events.triggers[].emits.visibility | "actor" \| "addressed" \| "tenant" \| "admin" | no | — | Visibility override: 'actor', 'addressed', 'tenant', or 'admin'. | |
| events.triggers[].retry.max | int | no | 0 | Maximum retry attempts (0 = no retry). | |
| events.triggers[].retry.backoff | "fixed" \| "linear" \| "exponential" | no | exponential | Retry backoff strategy: 'fixed', 'linear', or 'exponential'. | |
| events.triggers[].retry.jitter | boolean | no | true | Add jitter to backoff timing. | |
| events.triggers[].retry.initial_delay_seconds | float | no | 1 | Initial delay before first retry (seconds). | |
| events.triggers[].retry.max_delay_seconds | float | no | 300 | Maximum delay between retries (seconds). | |
| events.triggers[].parallelism | "per_user" \| "per_tenant" \| "unbounded" | no | per_user | Parallelism mode: 'per_user', 'per_tenant', or 'unbounded'. | |
| config_storage.enabled | boolean | no | false | Enable database-backed agent configuration store (PostgreSQL CRUD for agent definitions). | |
| config_storage.class | string | no | "" | Dotted import path to an OrchidConfigStorage subclass for agent config persistence. | |
| config_storage.dsn | string | no | "" | Database connection string for agent config storage (PostgreSQL URL). |
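
To make the events.triggers[].retry.* keys in the table above more concrete, here is a minimal sketch of how the five settings could combine into a delay. The formulas are assumptions, not Orchid's documented behavior: linear is taken as initial delay times attempt number, exponential as doubling per attempt, and jitter as a uniform scale in [0, 1). The function name retry_delay and its rng parameter are hypothetical.

```python
import random


def retry_delay(attempt, backoff="exponential", initial_delay_seconds=1.0,
                max_delay_seconds=300.0, jitter=True, rng=random.random):
    """Delay in seconds before retry number `attempt` (1-based).

    Assumed formulas, for illustration only; not taken from Orchid's source.
    """
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    if backoff == "fixed":
        delay = initial_delay_seconds
    elif backoff == "linear":
        delay = initial_delay_seconds * attempt
    elif backoff == "exponential":
        delay = initial_delay_seconds * 2 ** (attempt - 1)
    else:
        raise ValueError(f"unknown backoff strategy: {backoff!r}")
    # Cap the delay at retry.max_delay_seconds before applying jitter.
    delay = min(delay, max_delay_seconds)
    if jitter:
        # Assumed "full jitter": scale by a uniform factor in [0, 1).
        delay *= rng()
    return delay


if __name__ == "__main__":
    # Defaults from the table, with jitter off so the values are deterministic.
    for n in range(1, 6):
        print(n, retry_delay(n, jitter=False))
```

Under these assumptions the defaults (exponential, 1 second initial, 300 second cap) reach the cap from the tenth retry onward. Note that retry.max defaults to 0, so no retries happen unless it is raised.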
Load Modes
| Mode | Root File | Agent Configs | Detection |
|---|---|---|---|
| YAML | orchid.yml | agents.yaml | .yml or .yaml extension |
| MD | orchid.md | agents/*.md | .md extension |
| Hybrid | orchid.yml | agents/*.md | AGENTS_CONFIG_PATH points to a directory |
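
The detection rules in the table can be read as a small decision function. The sketch below is an illustration under stated assumptions, not Orchid's loader: the function name detect_load_mode is hypothetical, and it assumes hybrid mode applies only when the root file is YAML and AGENTS_CONFIG_PATH is an existing directory.

```python
from pathlib import Path


def detect_load_mode(orchid_config, agents_config_path=None):
    """Return 'yaml', 'md', or 'hybrid' following the Load Modes table.

    Hypothetical helper: the argument names mirror the ORCHID_CONFIG and
    AGENTS_CONFIG_PATH environment variables.
    """
    suffix = Path(orchid_config).suffix.lower()
    if suffix == ".md":
        return "md"
    if suffix in (".yml", ".yaml"):
        # Assumption: a directory-valued AGENTS_CONFIG_PATH selects hybrid mode.
        if agents_config_path and Path(agents_config_path).is_dir():
            return "hybrid"
        return "yaml"
    raise ValueError(f"unsupported config extension: {suffix!r}")


if __name__ == "__main__":
    print(detect_load_mode("orchid.yml"))
    print(detect_load_mode("orchid.md"))
    print(detect_load_mode("orchid.yml", "."))  # "." is a directory
```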
YAML Mode
ORCHID_CONFIG=orchid.yml uvicorn orchid_api.main:app
MD Mode
ORCHID_CONFIG=orchid.md uvicorn orchid_api.main:app
Hybrid Mode
Keep orchid.yml for infrastructure while using agents/*.md for agents:
ORCHID_CONFIG=orchid.yml AGENTS_CONFIG_PATH=agents/ uvicorn orchid_api.main:app
Hot-Reload (MD Only)
The on-demand config watcher detects file changes via SHA-256 hashing — no background threads, no fs-notify libraries.
- The OrchidConfigWatcher tracks orchid.md + agents/*.md by their current hashes.
- Orchid.reload_config() calls watcher.reload_if_changed() and rebuilds the LangGraph when a change is detected.
- Graph rebuild is serialised via asyncio.Lock; existing requests complete with the old config.
- The API middleware (ConfigReloadMiddleware) polls the watcher at most once every ORCHID_RELOAD_INTERVAL seconds (default 30, set to 0 to disable).
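
The hash-comparison idea behind the watcher can be sketched in a few lines of standard-library Python. This is a simplified stand-in, not the actual OrchidConfigWatcher: the class name HashWatcher is hypothetical, it tracks a fixed list of paths, and it leaves out the asyncio.Lock and the polling interval described above.

```python
import hashlib
from pathlib import Path


class HashWatcher:
    """Hypothetical on-demand watcher: compares SHA-256 digests when asked."""

    def __init__(self, paths):
        self._paths = [Path(p) for p in paths]
        self._hashes = self._snapshot()

    def _snapshot(self):
        # Map each existing file to the SHA-256 digest of its bytes.
        return {
            str(p): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in self._paths
            if p.is_file()
        }

    def reload_if_changed(self):
        """Return True and remember the new digests if any file changed."""
        current = self._snapshot()
        if current == self._hashes:
            return False
        self._hashes = current
        return True
```

Because the check runs only when reload_if_changed() is called, nothing happens between calls, which matches the "no background threads" description. A fuller version would presumably re-expand the agents/*.md glob on each call so that newly added agent files are noticed.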
# Enable hot-reload with 10-second polling:
ORCHID_RELOAD_INTERVAL=10
MD File Format
orchid.md — Root Config
Infrastructure keys (llm, auth, rag, storage) map to environment variables. Agent behavior keys (version, defaults, tools, skills, supervisor, guardrails, events) are validated against the Pydantic schema. The body after the closing --- is free-form documentation.
agents/<name>.md — Per-Agent Config
Each agent file has YAML frontmatter for structured fields and a Markdown body for the system prompt. The filename stem becomes the agent name.
| Agent MD field | YAML equivalent | Notes |
|---|---|---|
| Frontmatter description | agents.<name>.description | Short, for supervisor routing |
| MD body | agents.<name>.prompt | Rich Markdown, stripped of leading/trailing whitespace |
| Frontmatter class | agents.<name>.class | Aliased to class_path |
| Frontmatter rag, tools, etc. | agents.<name>.rag, etc. | 1:1 mapping |
| Filename basketball.md | Dict key basketball | Agent name from filename stem |
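
The mapping above implies three steps for each agent file: take the name from the filename stem, separate the frontmatter from the body at the --- delimiters, and strip the body to form the prompt. The sketch below shows only that splitting step under those assumptions; split_agent_file is a hypothetical name, and the frontmatter is returned as raw text that a real loader would hand to a YAML parser.

```python
from pathlib import Path


def split_agent_file(path, text):
    """Return (agent_name, frontmatter_text, prompt_body) for one agent file.

    Illustrative only: the frontmatter is not parsed here.
    """
    name = Path(path).stem  # basketball.md -> "basketball"
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        # Assumption: a file without frontmatter is treated as all prompt.
        return name, "", text.strip()
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            # The prompt is the body with leading/trailing whitespace removed.
            body = "\n".join(lines[i + 1:]).strip()
            return name, frontmatter, body
    raise ValueError(f"{path}: frontmatter is missing its closing '---'")


if __name__ == "__main__":
    sample = (
        "---\n"
        "description: Answers basketball questions\n"
        "---\n"
        "\n"
        "# Role\n"
        "You are a basketball analyst.\n"
    )
    print(split_agent_file("agents/basketball.md", sample))
```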
Equivalence
MD and YAML configs produce identical OrchidAgentsConfig output. Run the equivalence test:
cd orchid && .venv/bin/python examples/md-config/test_equivalence.py