Agents Configuration

Detailed reference for every agents.yaml property — defaults, supervisor, agents, tools, skills, guardrails, events.

Index

version

defaults

supervisor

agents[]

tools[]

skills[]

guardrails

mcp_gateway

events



defaults

Top-level defaults inherited by every agent unless explicitly overridden per-agent. Think of this as the global agent template.

defaults.llm

ShortDefault LLM settings for all agents.
DetailedEvery agent that does not specify its own llm block inherits these values. This is the most common place to set the primary model and fallback.
defaults.llm.model
ShortDefault LLM model for all agents.
DetailedLiteLLM provider/model-name format. Used for agent reasoning, tool calling, and summarisation when no per-agent override exists.
Defaultgemini/gemini-2.5-flash
Available valuesAny LiteLLM-compatible model string
defaults.llm.temperature
ShortSampling temperature.
DetailedControls randomness. Lower values (0.0–0.3) produce more deterministic, repeatable outputs. Higher values (0.7–1.0) increase creativity and variation. For tool-calling agents, keep this low to ensure consistent JSON formatting.
Default0.2
Available values0.0 to 2.0 (provider-dependent)

Tool-calling temperature

High temperatures can cause malformed tool-call JSON and hallucinated function names. Keep temperature <= 0.3 for agents that rely on structured tool calls.

defaults.llm.fallback_model
ShortFallback model when the primary fails.
DetailedWhen the primary model returns a 503, rate-limit error, or timeout, Orchid automatically retries with this fallback model. The fallback is tried once per request.
Defaultnull
Available valuesAny LiteLLM-compatible model string, or null to disable

Always set in production

Always configure a fallback model in production. Pair a cloud-hosted primary with a local Ollama model so the service degrades gracefully during provider outages rather than returning errors to users.

defaults.llm.retry_attempts
ShortRetry count on transient LLM errors.
DetailedWhen > 0, transient errors (network timeouts, 5xx responses) are retried with exponential backoff. 0 means no automatic retry — the error surfaces to the user immediately.
Default0
Available values0 or any positive integer
defaults:
  llm:
    model: gemini/gemini-2.5-flash
    temperature: 0.2
    fallback_model: ollama/llama3.2
    retry_attempts: 2
---
defaults:
  llm:
    model: openai/gpt-4o
    temperature: 0.1
    fallback_model: groq/llama-3.3-70b-versatile
---

defaults.rag

ShortDefault RAG settings for all agents.
DetailedRetrieval-Augmented Generation configuration that applies globally unless overridden per-agent.
defaults.rag.k
ShortNumber of chunks retrieved per query.
DetailedThe top-k chunks returned by vector similarity search. Higher values surface more documents but increase prompt size and cost. Lower values improve precision but may miss relevant context.
Default5

Tuning retrieval count

Raise k for knowledge-base agents that must surface multiple relevant documents per query (e.g. catalog search, document Q&A). Lower it for precision-focused agents to keep prompts concise and reduce hallucination from noisy context.

defaults.rag.enabled
ShortEnable RAG context retrieval.
DetailedMaster switch. When disabled, no vector retrieval runs and no RAG context is injected into prompts. Useful for agents that rely purely on tools or static prompts.
Defaulttrue
Available valuestrue, false
defaults.rag.rag_ttl
ShortCache TTL for RAG results in seconds.
DetailedWhen > 0, repeated queries within the TTL window reuse the previous retrieval result without hitting the vector database. 0 disables caching — every query triggers a fresh retrieval.
Default0

Cache freshness vs cost

Set a non-zero TTL (e.g. 300–600 seconds) for agents with stable knowledge bases to reduce Qdrant load and latency. Set to 0 for real-time data agents where documents change frequently.

defaults.rag.max_context_chars
ShortMaximum characters of RAG context injected into prompts.
DetailedA hard cap on the RAG context block size. Even if k retrieves 10 chunks, the total injected text is truncated to this limit. Prevents oversized prompts from consuming the LLM's context window.
Default3000
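
As a sketch of how the four top-level RAG settings combine, the fragment below configures an agent fleet with a stable knowledge base: more chunks per query, a five-minute retrieval cache, and a larger context cap. The specific values are illustrative assumptions, not recommendations from this reference.

```yaml
defaults:
  rag:
    enabled: true
    k: 8                     # retrieve more chunks than the default of 5
    rag_ttl: 300             # reuse identical retrievals for 5 minutes
    max_context_chars: 6000  # injected context is truncated at this size
```

With the default chunk_size of 1000, eight chunks can total roughly 8000 characters, so the 6000-character cap would still truncate the injected block in the worst case.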

defaults.rag.ingestion

Document ingestion settings that control how documents are split and processed before embedding.

defaults.rag.ingestion.strategy
ShortChunking strategy name.
DetailedThe algorithm used to split documents into chunks before embedding. Each strategy balances semantic coherence with retrieval granularity differently.
Defaultrecursive (inherited when null)
Available valuesrecursive, semantic, hierarchical, headered
  • recursive — Splits text recursively by separators (paragraphs, sentences, words) until chunks fit chunk_size. Best general-purpose choice.
  • semantic — Uses an embedding model to detect semantic boundaries and split at natural topic transitions. Higher quality but slower and more expensive.
  • hierarchical — Creates parent-child chunk relationships. Parent chunks provide broad context; child chunks enable precise retrieval. Requires parent_chunk_size > 0.
  • headered — Splits at Markdown/HTML headers, preserving document structure. Ideal for well-structured documentation.
defaults.rag.ingestion.chunk_size
ShortText chunk size in characters.
Default1000
defaults.rag.ingestion.chunk_overlap
ShortCharacter overlap between consecutive chunks.
Default200
defaults.rag.ingestion.parent_chunk_size
ShortParent chunk size for hierarchical chunking.
DetailedWhen > 0, enables hierarchical parent-child layout. Parent chunks are embedded separately and stored in chunk metadata. Child chunks are used for precise retrieval; parent chunks provide broader context.
Default0 (disabled)
defaults.rag.ingestion.parent_chunk_overlap
ShortOverlap for parent chunks.
Default200
defaults.rag.ingestion.post_processors
ShortPost-processing pipeline applied after chunking.
DetailedOrdered list of post-processor names that transform chunks after initial splitting.
Default[]
Available valuescontextual_headers, entity_extraction
  • contextual_headers — Prepends a contextually-aware header to each chunk describing what document it came from.
  • entity_extraction — Extracts named entities and relationships for GraphRAG. Requires retrieval.graph.enabled: true.
defaults:
  rag:
    ingestion:
      strategy: hierarchical
      chunk_size: 1000
      chunk_overlap: 200
      parent_chunk_size: 4000
      parent_chunk_overlap: 400
      post_processors:
        - contextual_headers

defaults.rag.retrieval

Query retrieval settings that control how user queries are transformed and matched against the vector store.

defaults.rag.retrieval.strategy
ShortRetrieval strategy name.
DetailedThe algorithm used to match queries against embedded chunks. Different strategies optimise for different query types and content characteristics.
Defaultsimple (inherited when null)
Available valuessimple, multi_query, hyde, hybrid, graph_rag
  • simple — Cosine similarity between query embedding and chunk embeddings. Fast, baseline quality.
  • multi_query — Generates multiple paraphrased versions of the query and retrieves for each, then deduplicates. Improves recall for ambiguous or paraphrased questions.
  • hyde — Generates a hypothetical answer to the query, embeds that answer, and retrieves chunks similar to the hypothetical answer. Excellent for dense technical knowledge where exact keyword matches are unreliable.
  • hybrid — Combines dense vector similarity with sparse keyword matching (BM25 or SPLADE). Best when exact keyword matches matter alongside semantic similarity.
  • graph_rag — Traverses an entity-relationship graph extracted during ingestion. Requires entity_extraction post-processor and graph.enabled: true.

Strategy selection

Start with simple. Switch to multi_query when single-query retrieval misses paraphrased or ambiguous questions. Use hyde for domains where hypothetical answers improve recall (dense technical knowledge). Use hybrid when exact keyword matches matter alongside semantic similarity.

defaults.rag.retrieval.query_transformers
ShortOrdered list of query transformer names.
DetailedPre-strategy and strategy-level transformers that rewrite or expand queries before retrieval. Pre-strategy transformers (e.g. reformulate) run at turn entry. Strategy-level transformers (e.g. multi_query, hyde, decompose) are forwarded to the active strategy.
Defaultnull (inherits from defaults, effectively [])
Available valuesreformulate, multi_query, hyde, decompose
defaults.rag.retrieval.metadata_filters
ShortMetadata filter expressions applied to all retrievals.
DetailedAn operator mini-language for filtering retrieved chunks by metadata fields. Supports equality, range, and boolean operators.
Default{}
defaults.rag.retrieval.exclude_dynamic
ShortExclude dynamically-injected tool output from retrieval.
DetailedWhen true, adds a dynamic: {"not": true} clause to prevent re-retrieving chunks that were dynamically injected by tool calls in previous turns. Prevents circular retrieval of tool-generated content.
Defaultfalse
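
A minimal sketch of the retrieval settings described so far, assuming that a strategy-level transformer is listed alongside the matching strategy (the reference does not state whether that pairing is required):

```yaml
defaults:
  rag:
    retrieval:
      strategy: multi_query
      query_transformers:
        - reformulate   # pre-strategy: runs at turn entry
        - multi_query   # strategy-level: forwarded to the active strategy
      exclude_dynamic: true  # skip chunks injected by earlier tool calls
```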

defaults.rag.retrieval.hyde

HyDE-specific retrieval knobs.

defaults.rag.retrieval.hyde.n_hypothetical
ShortNumber of hypothetical answers generated per query.
DetailedClassic HyDE uses 1 hypothetical answer. Increasing this grows recall at the cost of additional LLM calls (one per hypothetical answer). Each hypothetical answer is embedded separately and retrieval results are merged.
Default1
defaults:
  rag:
    retrieval:
      strategy: hyde
      hyde:
        n_hypothetical: 3

defaults.rag.retrieval.hybrid

Hybrid retrieval knobs (sparse + dense).

defaults.rag.retrieval.hybrid.sparse_encoder
ShortSparse encoder type for keyword matching.
Defaultbm25
Available valuesbm25, splade
defaults.rag.retrieval.hybrid.sparse_weight
ShortWeight of the sparse signal in linear fusion.
DetailedOnly used when fusion is linear. 0.0 = pure dense vectors. 1.0 = pure sparse keywords. 0.4–0.5 is a good starting point for most domains.
Default0.4
defaults.rag.retrieval.hybrid.fusion
ShortFusion method for combining sparse and dense rankings.
Defaultrrf
Available valuesrrf, linear
  • rrf — Reciprocal Rank Fusion. Needs no weight calibration. Each chunk's fused score is the sum of 1 / (k + rank) over the sparse and dense rankings, where k is rrf_k (default 60). Good default choice.
  • linear — Weighted linear combination using sparse_weight. More tunable but requires calibration.
defaults.rag.retrieval.hybrid.rrf_k
ShortRRF constant k.
DetailedThe constant used in the Reciprocal Rank Fusion formula. Default 60 follows Cormack et al. Lower values emphasise top-ranked documents more heavily.
Default60
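
To make the hybrid knobs concrete, here is a minimal sketch that switches from the default rrf fusion to a weighted linear blend. The weight is an illustrative starting point within the 0.4–0.5 range mentioned above.

```yaml
defaults:
  rag:
    retrieval:
      strategy: hybrid
      hybrid:
        sparse_encoder: bm25
        fusion: linear
        sparse_weight: 0.5   # only read when fusion is linear
```

With fusion: rrf, sparse_weight is not used and rrf_k applies instead. Assuming the standard formulation that sums 1 / (k + rank) over both rankings, a chunk ranked 1st by the dense search and 3rd by the sparse search scores 1/61 + 1/63 ≈ 0.0323 at k = 60.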

defaults.rag.retrieval.graph

GraphRAG-specific retrieval knobs.

defaults.rag.retrieval.graph.enabled
ShortEnable graph entity extraction during ingestion.
DetailedWhen enabled, the entity_extraction post-processor extracts entities and relationships from chunks and builds a knowledge graph. This graph is then traversed at retrieval time.
Defaultfalse
defaults.rag.retrieval.graph.max_hops
ShortMaximum BFS depth from seed entities.
DetailedHow many relationship hops to traverse from each seed entity found in the query. Higher values surface more connected context but increase retrieval time and noise.
Default2
defaults.rag.retrieval.graph.fuse_with_vectors
ShortMerge graph context with vector hits.
DetailedWhen true, the retrieved subgraph is serialised and appended alongside standard vector retrieval results. When false, only graph context is returned (no vector hits).
Defaulttrue
defaults.rag.retrieval.graph.relation_filter
ShortRestrict graph traversal to specific edge labels.
DetailedWhen non-empty, only traverse edges with these relationship types. Useful for domain-specific graphs where only certain relation types are relevant.
Default[]
defaults:
  rag:
    retrieval:
      strategy: graph_rag
      graph:
        enabled: true
        max_hops: 3
        relation_filter:
          - works_for
          - manages

defaults.rag.retrieval.transformer_prompts

Override prompts for the built-in query transformers.

defaults.rag.retrieval.transformer_prompts.multi_query
ShortOverride prompt for the multi-query transformer.
DetailedReplaces the default prompt that asks the LLM to generate paraphrased query variants. Use this to tailor paraphrasing style to your domain.
Defaultnull (module-level default)
defaults.rag.retrieval.transformer_prompts.hyde
defaults.rag.retrieval.transformer_prompts.hyde.single
ShortHyDE prompt for a single hypothetical answer.
Defaultnull (module-level default)
defaults.rag.retrieval.transformer_prompts.hyde.multi
ShortHyDE prompt for multiple hypothetical answers.
DetailedUses a {n} placeholder that is replaced with the value of n_hypothetical.
Defaultnull (module-level default)
defaults.rag.retrieval.transformer_prompts.decompose
ShortOverride prompt for the decompose transformer.
DetailedReplaces the default prompt that breaks complex queries into sub-queries.
Defaultnull (module-level default)
defaults.rag.retrieval.transformer_prompts.reformulate
ShortOverride prompt for the reformulate transformer.
DetailedReplaces the default prompt that reformulates the query for better retrieval.
Defaultnull (module-level default)
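
The fragment below sketches how prompt overrides might sit next to the HyDE settings. The prompt wording is hypothetical, and how the user's query is attached to each prompt is not specified in this reference.

```yaml
defaults:
  rag:
    retrieval:
      strategy: hyde
      hyde:
        n_hypothetical: 3
      transformer_prompts:
        hyde:
          # {n} is replaced with the value of n_hypothetical (3 here)
          multi: |
            Write {n} short, distinct passages that could each answer the
            user's question, as if quoted from internal product documentation.
        reformulate: |
          Rewrite the user's question as a standalone search query.
```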

defaults.cache_enabled

ShortEnable global in-memory LLM response cache.
DetailedActivates LangChain's InMemoryCache via set_llm_cache(). Identical prompts (same model, messages, temperature) return cached results without an LLM call. Cache lives for the process lifetime and is lost on restart.
Defaultfalse
Available valuestrue, false

Cache scope and invalidation

The cache key includes the full prompt text, model string, and temperature. Changing any of these invalidates the cache entry. There is no explicit cache invalidation API — restart the process to clear. Do not enable if your agents produce time-sensitive or user-specific outputs that must vary per call.

defaults:
  cache_enabled: true

supervisor

The supervisor is the central orchestrator that routes queries to agents, manages multi-turn conversation state, and synthesises final responses.

supervisor.assistant_name

ShortDisplay name for the AI assistant.
DetailedUsed in supervisor prompts and shown in the UI. Customise to match your product branding.
Default"AI assistant"
supervisor:
  assistant_name: "Orchid Helpdesk"
---
supervisor:
  assistant_name: "Acme Support Bot"
---

supervisor.fallback_model

ShortFallback LLM for the supervisor.
DetailedOverrides defaults.llm.fallback_model specifically for supervisor operations (routing, synthesis, sequential advance). Useful when the supervisor needs a different fallback than agents.
Defaultnull (inherits defaults.llm.fallback_model)

supervisor.streaming_enabled

ShortEnable SSE streaming for responses.
DetailedWhen enabled, the API returns text/event-stream responses with tokens arriving as they are generated. When disabled, responses are buffered and returned as complete JSON.
Defaulttrue
Available valuestrue, false

supervisor.routing_system_prompt

ShortCustom system prompt for the routing phase.
DetailedReplaces the default template that tells the supervisor how to classify queries and select agents. Use this to inject domain-specific routing instructions.
Defaultnull (built-in template)

supervisor.synthesis_system_prompt

ShortCustom system prompt for the synthesis phase.
DetailedReplaces the default template that tells the supervisor how to combine agent outputs into a coherent final response.
Defaultnull (built-in template)

supervisor.sequential_advance_prompt

ShortCustom handoff prompt for sequential multi-agent flows.
DetailedUsed when agents are chained sequentially (one after another). Replaces the default template that tells the supervisor how to pass state between agents.
Defaultnull (built-in template)

supervisor.history_max_turns

ShortMaximum conversation exchange pairs retained.
DetailedThe supervisor keeps the most recent N user/assistant exchange pairs in context (up to 2 × N messages). Older turns are dropped or summarised depending on history_summary_enabled.
Default20

supervisor.history_max_chars

ShortMaximum characters per message before truncation.
DetailedIndividual messages longer than this are truncated with a ... suffix. Prevents a single oversized message from consuming the entire context window.
Default1000

supervisor.routing_model

ShortCheaper/faster LLM for routing and advance phases.
DetailedWhen set, the supervisor uses this model for routing decisions and sequential handoffs instead of the primary model. Saves cost and latency because routing requires less reasoning power than synthesis.
Defaultnull (uses supervisor's main model)
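
As an illustration of the model-related supervisor settings, the sketch below routes with a cheaper model and gives the supervisor its own fallback. The model strings are reused from other examples in this reference and are assumptions about what is available in your deployment.

```yaml
supervisor:
  routing_model: gemini/gemini-2.5-flash   # used for routing and sequential handoffs
  fallback_model: ollama/llama3.2          # overrides defaults.llm.fallback_model
  streaming_enabled: true                  # responses arrive as text/event-stream
```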

supervisor.history_summary_enabled

ShortEnable sliding-window summarization.
DetailedWhen enabled, conversation history beyond history_summary_recent_turns is compressed via a cheap LLM call into a summary. The summary plus the recent verbatim turns are sent to the model. Dramatically reduces token usage for long-running conversations.
Defaulttrue
Available valuestrue, false

When to enable

Enable for long-running chats with token-priced LLMs where context accumulates over many turns. Disable for short-form workflows where keeping the full verbatim history is cheaper than the summarization LLM call.

supervisor.history_summary_model

ShortModel used for history summarization.
DetailedA cheap, fast model is recommended for summarization (e.g. gemini/gemini-2.5-flash or an Ollama model). Falls back to the supervisor's main model when not set.
Defaultnull

supervisor.history_summary_recent_turns

ShortNumber of recent turns kept verbatim.
DetailedThe most recent N exchange pairs are kept in full text. Everything older is summarised. Set this high enough to preserve the immediate conversation context.
Default10

supervisor.skip_synthesis_when_single_agent

ShortSkip synthesis when only one agent ran.
DetailedWhen enabled (default), if exactly one agent produced a substantive text response, that text is returned directly without running the supervisor synthesis LLM call. Saves 5–15 seconds and one LLM call per single-agent turn.
Defaulttrue
Available valuestrue, false

When to disable

Leave enabled to save 5–15 s and one LLM call on every single-agent turn. Disable only if the supervisor must always rewrite or augment the agent's raw output regardless of routing.

supervisor:
  assistant_name: "Helpdesk Bot"
  history_max_turns: 30
  history_summary_enabled: true
  history_summary_model: ollama/llama3.2
  history_summary_recent_turns: 15
  skip_synthesis_when_single_agent: true

supervisor.memory

Conversation memory configuration — controls how past conversation context is summarized, persisted, and retrieved beyond the current LangGraph state. Three strategies are available (see Chat Summarization).

ShortConversation memory strategy and configuration.
DetailedA nested block that controls incremental running summaries, structured JSON entity extraction, and Qdrant-backed semantic retrieval of past turns. The default is strategy: "none" (no memory, backward-compatible).
Default{strategy: "none", structured_output: true, ...}

supervisor.memory.strategy

ShortMemory strategy selection.
Detailednone — no memory (backward-compatible). running_summary — stateful incremental compression (avoids O(n²) re-compute). rag_augmented — adds Qdrant semantic retrieval of past turns on top of running summary.
Default"none"
Available values"none", "running_summary", "rag_augmented"

supervisor.memory.summary_recent_turns

ShortRecent turns kept verbatim when using memory-based summarization.
DetailedWhen memory is active, the most recent N exchange pairs are preserved in full text alongside the incremental summary. Independent of supervisor.history_summary_recent_turns.
Default10

supervisor.memory.summary_model

ShortLLM model for summary extension calls in the memory pipeline.
DetailedA cheap, fast model is recommended (e.g. gemini/gemini-2.5-flash-lite). Falls back to supervisor.history_summary_model, then the supervisor's main model.
Defaultnull

supervisor.memory.summary_prompt

ShortCustom compression prompt.
DetailedWhen set, overrides the default compression/extension prompt used by the memory system. null uses the built-in defaults (structured JSON extraction or narrative compression depending on structured_output).
Defaultnull

supervisor.memory.persist_summary

ShortPersist running summaries to chat storage.
DetailedWhen true, summaries are stored in the conversation_summaries table (SQLite/PostgreSQL) for cross-invocation reuse. When false, summaries are computed fresh each turn (ephemeral, no disk write).
Defaulttrue
Available valuestrue, false

supervisor.memory.structured_output

ShortEnable structured JSON entity extraction in summaries.
DetailedWhen true, the LLM produces JSON with topics, entities, actions, decisions, questions, and preferences. Falls back to narrative-only on JSON parse failure. When false, produces a flat paragraph summary.
Defaulttrue
Available valuestrue, false

Entity deduplication

When structured_output: true, entities mentioned across multiple turns are automatically deduplicated by name. New details are appended to the existing entity record rather than creating duplicates.

supervisor.memory.rag_namespace

ShortQdrant namespace for conversation memory embeddings.
DetailedReserved namespace in Qdrant where conversation turns are stored as embeddings. Uses OrchidRAGScope for hierarchical tenant isolation. Only relevant when strategy: "rag_augmented".
Default"__memory__"

supervisor.memory.rag_k

ShortNumber of semantically relevant past turns to retrieve.
DetailedHow many past conversation turns to retrieve from Qdrant via semantic search on each new user query. Higher values surface more context at the cost of token budget. Only relevant when strategy: "rag_augmented".
Default5

supervisor.memory.rag_similarity_threshold

ShortMinimum similarity score for RAG-retrieved turns.
DetailedResults below this score are discarded. Range 0.0–1.0. Lower values include more turns (potentially noisy). Higher values are stricter. Only relevant when strategy: "rag_augmented".
Default0.5

supervisor.memory.store_turns

ShortAutomatically embed and store each conversation turn in Qdrant.
DetailedWhen true, each user message and assistant response is embedded and stored in the __memory__ Qdrant namespace for future retrieval. Only relevant when strategy: "rag_augmented".
Defaulttrue
Available valuestrue, false

supervisor.memory.truncation_strategy

ShortHow messages exceeding max_chars are truncated.
Detailedhard: content[:max_chars] + "…" (current behavior). middle: keeps the first 40% and last 40%, with a …[truncated]… marker. llm: asks the LLM to summarize; falls back to middle on failure. semantic: reserved for embedding-based selection; falls back to middle.
Default"hard"
Available values"hard", "middle", "llm", "semantic"

supervisor.memory.truncation_max_chars

ShortCharacter limit for message truncation.
DetailedIndividual messages longer than this are truncated using truncation_strategy. Overrides supervisor.history_max_chars when memory is enabled.
Default1000
# Full memory config example
supervisor:
  memory:
    strategy: "rag_augmented"
    summary_recent_turns: 10
    structured_output: true
    persist_summary: true
    rag_k: 5
    rag_similarity_threshold: 0.5
    store_turns: true
    truncation_strategy: "middle"
    truncation_max_chars: 1000

agents[]

Agent definitions. Each key becomes an agent name. The name is used for routing, logging, and namespace addressing.

agents[].name

ShortAgent name (set automatically).
DetailedThe dictionary key in YAML or the filename stem in Markdown mode. Read-only — set by the loader, not by the user.
Default""

agents[].description

ShortHuman-readable purpose for supervisor routing.
DetailedThe supervisor uses this description to decide whether to route a query to this agent. Be concise and specific: describe what the agent does and what types of queries it handles.
DefaultRequired
agents:
  basketball:
    description: "Answers questions about NBA players, teams, and statistics."
---
description: "Answers questions about NBA players, teams, and statistics."
---

agents[].prompt

ShortSystem prompt for the agent.
DetailedThe core instructions injected into the agent's agentic loop. In YAML this is a string (use `
DefaultRequired
agents:
  basketball:
    prompt: |
      You are a basketball expert. Use the provided tools to look up
      player stats, team rosters, and game schedules. Be concise.
---
# frontmatter goes here
---

You are a basketball expert. Use the provided tools to look up
player stats, team rosters, and game schedules. Be concise.

agents[].class

ShortDotted import path to a custom OrchidAgent subclass.
DetailedWhen omitted, the agent uses GenericAgent. Custom subclasses can override run(), summarise(), or add bespoke tool-call logic. The class is resolved at runtime via importlib.
Defaultnull (uses GenericAgent)
Available valuesAny dotted Python path to an OrchidAgent subclass
agents:
  support:
    class: myapp.agents.support.SupportAgent

agents[].parallel_tools

ShortDispatch independent tool calls in parallel.
DetailedWhen enabled, the agent partitions its tool_calls into a parallel batch (dispatched via asyncio.gather) and a sequential tail. Per-tool safety is resolved from parallel_safe on the tool config, MCP readOnlyHint, or the built-in tool registry.
Defaultfalse
Available valuestrue, false

Parallel safety

Enable when an agent consistently makes multiple independent read-only tool calls per turn. Keep disabled for write operations or any tool chain where order guarantees matter — parallel dispatch removes sequencing. Read-only tools with parallel_safe: true (or MCP readOnlyHint: true) run in parallel; all others run sequentially.

agents[].llm

Per-agent LLM override. Same structure as defaults.llm. When any field is set, it overrides the corresponding default.

agents:
  creative:
    llm:
      model: anthropic/claude-sonnet-4-20250514
      temperature: 0.8

agents[].rag

Per-agent RAG override. Same structure as defaults.rag. All fields cascade: unset fields inherit from defaults.rag.

agents:
  knowledge:
    rag:
      namespace: docs
      k: 10
      enabled: true
agents[].rag.namespace
ShortQdrant collection namespace for this agent.
DetailedDocuments indexed for this agent are stored in this namespace. Different agents can share a namespace (common knowledge base) or use separate ones (isolated domains).
Default"" (uses agent name as namespace)
agents[].rag.payload_indexes
ShortExplicit Qdrant payload index declarations.
DetailedMap of field_name -> qdrant_schema_type for metadata fields you want to filter on. Types: keyword, integer, float, bool, datetime, text, geo.
Default{}
agents:
  catalog:
    rag:
      payload_indexes:
        category: keyword
        price: float
        in_stock: bool

agents[].mcp_servers[]

MCP server connections for this agent. Each entry defines a remote tool provider.

agents[].mcp_servers[].name
ShortUnique identifier for this MCP server.
DefaultRequired
agents[].mcp_servers[].type
ShortServer type.
Defaultlocal
Available valueslocal, remote
agents[].mcp_servers[].transport
ShortTransport protocol.
Defaultstreamable_http
Available valuesstreamable_http, sse
agents[].mcp_servers[].url
ShortMCP server URL.
DetailedSupports ${ENV_VAR} interpolation for runtime configuration.
DefaultRequired
agents:
  sales:
    mcp_servers:
      - name: crm
        type: remote
        transport: streamable_http
        url: "${CRM_MCP_URL}"
agents[].mcp_servers[].auth
agents[].mcp_servers[].auth.mode
ShortAuthentication mode for this MCP server.
DetailedDetermines how the agent authenticates to the MCP server.
Defaultnone
Available valuesnone, passthrough, oauth
  • none — No authentication headers are sent. Use for local servers on private networks.
  • passthrough — Forwards the graph's OrchidAuthContext bearer token unchanged. Use when the MCP server trusts the same identity provider.
  • oauth — Per-user OAuth 2.0 via MCP 2025-03-26 spec. On the first 401, Orchid discovers the server's OAuth metadata (RFC 9728), fetches the authorization server metadata (RFC 8414), and performs dynamic client registration (RFC 7591). No client_id or client_secret lives in config — everything is discovered at runtime.

OAuth mode

Use none for local MCP servers that need no credentials. Use passthrough when the MCP server shares the same identity provider. Use oauth when each user must independently authorize the MCP server; Orchid discovers everything from the server's 401 response automatically.
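
As a sketch, the fragment below attaches two remote servers with different auth modes. The internal_wiki server and the WIKI_MCP_URL variable are hypothetical names used for illustration.

```yaml
agents:
  sales:
    mcp_servers:
      - name: crm
        type: remote
        url: "${CRM_MCP_URL}"
        auth:
          mode: oauth        # each user authorizes the CRM server individually
      - name: internal_wiki
        type: remote
        url: "${WIKI_MCP_URL}"
        auth:
          mode: passthrough  # forwards the caller's bearer token unchanged
```

Note that the oauth entry carries no client_id or client_secret, consistent with the runtime discovery described above.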

agents[].mcp_servers[].tools
ShortTool allow-list or wildcard.
DetailedList of OrchidToolConfig entries defining which tools this agent may call. Use "*" or ["*"] to discover all tools at runtime. Individual tools can override parallel_safe, inject_to_rag, requires_approval, and rag settings.
Default[]
agents:
  sales:
    mcp_servers:
      - name: crm
        url: https://crm.example.com/mcp
        tools:
          - name: search_contacts
            inject_to_rag: true
            rag_ttl: 300
            requires_approval: false
            parallel_safe: true
          - name: delete_contact
            requires_approval: true
agents[].mcp_servers[].prompts
ShortMCP prompt names to load.
DetailedPre-configured prompts exposed by the MCP server that the agent can reference. Use "*" to discover all prompts at runtime.
Default[]
agents[].mcp_servers[].resources
ShortMCP resource URIs to load.
DetailedStatic resources (documents, schemas, etc.) exposed by the MCP server. Use "*" to discover all resources at runtime.
Default[]
agents[].mcp_servers[].tool_call_strategy
ShortHow tools from this server are dispatched.
DetailedStrategy name registered in the OrchidToolCallStrategy registry.
Defaultall
Available valuesall, sequential, llm_decides, or any custom registered strategy
  • all — Call every tool concurrently, collect all results.
  • sequential — Call tools in order, chaining previous_results forward.
  • llm_decides — Ask the LLM which tools to call and with what arguments. Falls back to all on failure.
agents[].mcp_servers[].discover_all_tools
ShortAuto-discovered flag for tools.
DetailedSet automatically by the wildcard validator when tools: "*" or tools: ["*"]. Do not set manually.
Defaultfalse
agents[].mcp_servers[].discover_all_prompts
ShortAuto-discovered flag for prompts.
Defaultfalse
agents[].mcp_servers[].discover_all_resources
ShortAuto-discovered flag for resources.
Defaultfalse
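
A minimal sketch of wildcard discovery combined with a dispatch strategy, reusing the crm server from the earlier examples:

```yaml
agents:
  sales:
    mcp_servers:
      - name: crm
        url: "${CRM_MCP_URL}"
        tools: "*"                       # discover every tool at runtime
        prompts: "*"                     # discover every prompt at runtime
        tool_call_strategy: llm_decides  # falls back to all on failure
```

The discover_all_tools and discover_all_prompts flags are left out on purpose, since the wildcard validator sets them.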

agents[].execution_hints

Hints for the supervisor when routing.

agents[].execution_hints.parallel_safe
ShortMark this agent safe to run in parallel.
DetailedHint to the supervisor that this agent has no side effects and can be dispatched concurrently with other agents in multi-agent flows.
Defaulttrue
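
For an agent with side effects, the hint can be turned off. The agent name below is hypothetical.

```yaml
agents:
  ticketing:
    execution_hints:
      parallel_safe: false   # creates tickets, so avoid concurrent dispatch
```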

agents[].tools

ShortBuilt-in tool names available to this agent.
DetailedMust match keys in the top-level tools: section or the built-in tool registry. These are Python functions invoked directly in-process (not via MCP).
Default[]
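
A short sketch, assuming the field takes a plain list of tool names (consistent with its [] default). lookup_order is a hypothetical name and would need a matching entry in the top-level tools: section or the built-in registry.

```yaml
agents:
  support:
    tools:
      - lookup_order    # hypothetical in-process tool
      - create_ticket
```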

agents[].skills

ShortPer-agent skill definitions.
DetailedMulti-step workflows that this agent can execute. Each skill is a named sequence of tool calls or agent invocations. Unlike orchestrator-level skills (top-level skills:), these run within a single agent and do not involve the supervisor.
Default{}
agents:
  support:
    skills:
      escalate:
        description: "Escalate a ticket through the support hierarchy"
        steps:
          - tool: create_ticket
            arguments:
              priority: high
          - agent: manager
            instruction: "A high-priority ticket needs your attention"

agents[].guardrails

ShortPer-agent guardrail chains.
DetailedInput and output guardrails that apply only to this agent. They run in addition to any global guardrails defined at the top level.
Default{} (no guardrails)
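
The fragment below is a sketch of a per-agent guardrail block. It assumes the per-agent block uses the same input/output list shape as the global guardrails section (type, fail_action, config), which this reference does not state explicitly. The agent name billing is hypothetical; the guardrail types are built-ins listed under guardrails.

```yaml
agents:
  billing:                      # hypothetical agent name
    guardrails:
      # Assumed shape: same as the top-level guardrails section.
      input:
        - type: prompt_injection
          fail_action: block
      output:
        - type: pii_detection
          fail_action: redact
```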

agents[].children

ShortSub-agent configurations nested under this agent.
DetailedCreates a hierarchical agent tree with one level of nesting. The parent agent can route to its children. Children inherit defaults from the parent and can define their own overrides. The graph builder, MCP inventory, and auth registry only handle agents[].children[] — deeper nesting is not supported. Mini-agents are forbidden on child agents.
Defaultnull
agents:
  support:
    description: "Top-level support router"
    children:
      billing:
        description: "Handles billing questions"
        prompt: "You are a billing specialist..."
      technical:
        description: "Handles technical issues"
        prompt: "You are a technical support engineer..."

agents[].mini_agent

Opt-in mini-agent (self-clone) configuration. Only allowed on top-level agents.

agents[].mini_agent.enabled
ShortEnable mini-agent decomposition.
DetailedWhen enabled, complex requests are decomposed into independent sub-tasks, each handled by a cloned mini-agent running in parallel. The results are then aggregated into a final response. Adds one extra LLM call (decomposer) per turn.
Defaultfalse
Available valuestrue, false

Cost vs speed trade-off

Enable when a single complex user request can be decomposed into independent sub-tasks that do not share state. The decomposer adds one extra LLM call per turn; only opt in when the parallelism speedup outweighs that cost. Nesting is not supported — only top-level agents can enable mini-agents.

agents[].mini_agent.max_count
ShortMaximum number of parallel mini-agents.
DetailedThe decomposer produces at most this many sub-tasks. Range is enforced by Pydantic validation.
Default3
Available values2 to 8
agents[].mini_agent.decomposer_model
ShortLLM model for the decomposer step.
DetailedA cheap, fast model is recommended. Falls back to the parent agent's llm.model when not set.
Defaultnull
agents[].mini_agent.timeout_seconds
ShortHard timeout per mini-agent.
DetailedIf a mini-agent does not complete within this time, it is cancelled and its result is omitted from aggregation.
Default60
Available values5 to 600
agents[].mini_agent.tool_allowlist_mode
ShortTool exposure mode for mini-agents.
DetailedControls which tools cloned mini-agents may access.
Defaultstrict
Available valuesstrict, parent_full, inferred
  • strict — Every tool name in allowed_tools must exist in the parent's inventory. Fails validation if a name is unknown.
  • parent_full — Ignores allowed_tools entirely. Mini-agents get access to the parent's full tool set.
  • inferred — Same as strict, but an empty allowed_tools falls back to the parent's full inventory with a warning.
agents[].mini_agent.stream_inner_tokens
ShortStream mini-agent tokens to SSE.
DetailedWhen enabled, tokens generated by inner mini-agents propagate to the SSE stream. When disabled, only lifecycle events (start, complete, error) surface.
Defaultfalse
agents[].mini_agent.decomposer_prompt
ShortCustom decomposer prompt.
DetailedReplaces the built-in template that instructs the LLM how to break a query into sub-tasks.
Defaultnull
agents[].mini_agent.aggregator_prompt
ShortCustom aggregator prompt.
DetailedReplaces the built-in template that instructs the LLM how to combine mini-agent results into a final response.
Defaultnull
agents[].mini_agent.system_prompt_template
ShortTemplate for each mini-agent's system prompt.
DetailedSupports placeholders: {parent_prompt} (the parent agent's prompt), {instruction} (the decomposed sub-task), {tool_list} (available tools).
Defaultnull
agents:
  research:
    mini_agent:
      enabled: true
      max_count: 5
      timeout_seconds: 45
      tool_allowlist_mode: parent_full
      stream_inner_tokens: true
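
A second sketch, this time showing a dedicated decomposer model and a custom system prompt built from the three documented placeholders. The model string is the defaults.llm.model default from this reference, used here only for illustration; the prompt wording is hypothetical.

```yaml
agents:
  research:
    mini_agent:
      enabled: true
      # A cheap, fast model for the decomposer step; when unset, the
      # parent agent's llm.model is used.
      decomposer_model: gemini/gemini-2.5-flash
      # {parent_prompt}, {instruction} and {tool_list} are the documented
      # placeholders; the surrounding text is illustrative.
      system_prompt_template: |
        {parent_prompt}

        Your sub-task: {instruction}
        Tools you may call: {tool_list}
```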

agents[].prompt_sections

Customisable templates for the agentic-loop system prompt assembly.

agents[].prompt_sections.prior_results_header
ShortHeader for prior-turn tool results.
Default`"\n--- Previous Tool Results (from prior turns) ---"`
agents[].prompt_sections.mcp_prompt_template
ShortTemplate for rendered MCP prompts.
DetailedPlaceholders: {name}, {text}.
Default`"\n--- MCP Prompt: {name} ---\n{text}"`
agents[].prompt_sections.skipped_prompt_template
ShortTemplate for MCP prompts requiring arguments.
DetailedShown when a prompt requires arguments that were not provided. Placeholders: {name}, {description}, {required_args}.
Default`"\n[Available prompt: {name}] {description} (requires: {required_args})"`
agents[].prompt_sections.resources_header
ShortHeader for MCP resources block.
Default`"\n--- Available Resources ---"`
agents[].prompt_sections.resource_template
ShortTemplate for each MCP resource.
DetailedPlaceholders: {name}, {content}.
Default`"\n[{name}]\n{content}"`
agents[].prompt_sections.rag_header
ShortHeader for RAG context block.
Default`"\n--- Background Knowledge (RAG) ---"`
agents[].prompt_sections.prior_results_max_chars
ShortCharacter cap on prior tool-results JSON.
Default4000
agents[].prompt_sections.resource_max_chars
ShortCharacter cap per MCP resource body.
Default2000
agents[].prompt_sections.summarise_history_reminder
ShortReminder block for summarise prompt when history is present.
DefaultBuilt-in reminder text
agents[].prompt_sections.summarise_prior_results_header
ShortHeader for prior results in summarise prompt.
Default`"\n\n--- Previous Tool Results (from prior turns) --- "`

agents[].prompt_sections.summarise_rag_section_header
ShortHeader for RAG block in summarise user message.
Default`"Background knowledge (from RAG):\n"`
agents[].prompt_sections.summarise_user_template
ShortUser-content template for summarise call.
DetailedPlaceholders: {query}, {rag_section}, {mcp_data}.
Default`"User query: {query}\n\n{rag_section}Live data (from API): {mcp_data}"`

agents[].prompt_sections.summarise_prior_results_max_chars
ShortMax characters of prior results in summarise prompt.
Default4000
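
As a sketch of overriding a few of these templates, the fragment below changes two headers, one resource template, and both character caps. The replacement strings and the agent name are illustrative; the keys and placeholders are the ones documented above. Double-quoted YAML strings are used so that \n is read as a newline.

```yaml
agents:
  research:
    prompt_sections:
      # Custom headers (illustrative wording).
      rag_header: "\n--- Reference Material ---"
      resources_header: "\n--- Attached Resources ---"
      # {name} and {content} are the documented placeholders.
      resource_template: "\n## {name}\n{content}"
      # Tighter caps than the defaults (4000 and 2000).
      prior_results_max_chars: 2000
      resource_max_chars: 1000
```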

tools[]

Global built-in tool declarations. Each key is a tool name; the value defines its handler and metadata.

tools[].handler

ShortDotted import path to the Python function.
DetailedThe function implementing this tool. The parameter schema is auto-extracted from the function signature unless parameters is explicitly defined. Sync handlers are automatically wrapped with asyncio.to_thread.
DefaultRequired

tools[].description

ShortTool description for LLM invocation.
DetailedShown to the LLM in the tool schema. Be specific about what the tool does, what inputs it expects, and what it returns.
Default""

tools[].parameters

ShortParameter declarations.
DetailedWhen omitted, parameters are auto-extracted from the Python function signature. Framework-injected params (query, context, auth_context, **kwargs) are filtered out. YAML declarations take precedence over auto-extraction.
Default{}
tools[].parameters[].type
ShortParameter type.
Defaultstring
Available valuesstring, int, float, bool
tools[].parameters[].description
ShortParameter description.
Default""
tools[].parameters[].required
ShortWhether the parameter is required.
Defaulttrue
tools[].parameters[].default
ShortDefault value when not provided.
Defaultnull

tools[].inject_to_rag

ShortStore this tool's results in RAG.
DetailedWhen enabled, the tool's output is embedded and stored in the agent's RAG namespace. Subsequent queries can retrieve this output as context.
Defaultfalse

tools[].rag_ttl

ShortCache TTL for RAG-stored tool results.
DetailedHow long (in seconds) the tool's RAG-injected output remains retrievable. null uses the agent's default rag_ttl.
Defaultnull

tools[].requires_approval

ShortRequire human approval before execution.
DetailedWhen enabled, the tool call is paused and a HITL (human-in-the-loop) request is sent to the frontend. The user must approve before the tool executes.
Defaultfalse

tools[].parallel_safe

ShortDeclare the tool safe for parallel dispatch.
DetailedFor built-in tools, null resolves to false (sequential). Set true for pure read-only, side-effect-free handlers. Only consulted when the agent has parallel_tools: true.
Defaultnull
tools:
  format_date:
    handler: myapp.tools.dates.format_date
    description: "Format a date string into a human-readable form"
    parameters:
      date_str:
        type: string
        description: "ISO 8601 date string"
        required: true
      format:
        type: string
        description: "Output format (e.g. 'long', 'short')"
        required: false
        default: long
    inject_to_rag: false
    requires_approval: false
    parallel_safe: true
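
The following sketch contrasts a side-effecting tool with a read-only one to show how requires_approval, inject_to_rag, rag_ttl, and parallel_safe might be combined. Both tool names and handler paths are hypothetical. The parameters block is omitted on purpose so that parameters are auto-extracted from the function signature.

```yaml
tools:
  issue_refund:                               # hypothetical tool
    handler: myapp.tools.billing.issue_refund # hypothetical import path
    description: "Refund an order; returns the refund confirmation ID"
    # Pause and ask the user for approval before the call executes.
    requires_approval: true
    # Side effects, so keep dispatch sequential.
    parallel_safe: false

  fetch_exchange_rates:                       # hypothetical tool
    handler: myapp.tools.fx.fetch_exchange_rates
    description: "Fetch current exchange rates for a base currency"
    # Store results in the agent's RAG namespace for one hour.
    inject_to_rag: true
    rag_ttl: 3600
    # Read-only, so safe to dispatch in parallel when the agent
    # has parallel_tools: true.
    parallel_safe: true
```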

skills[]

Orchestrator-level (cross-agent) skill definitions. These are multi-step workflows that can invoke multiple agents in sequence.

skills[].description

ShortHuman-readable skill purpose.
Default""

skills[].steps

ShortOrdered list of agent invocations.
DetailedEach step names an agent and provides an instruction. The supervisor routes each step through the named agent, passing the instruction as the query.
DefaultRequired
skills[].steps[].agent
ShortAgent name to invoke.
DefaultRequired
skills[].steps[].instruction
ShortHint passed to the agent.
DetailedThe instruction is sent to the agent as the user query for that step. Referencing prior step results via template variables is planned for a future version.
Default""
skills:
  onboarding:
    description: "Walk a new user through account setup"
    steps:
      - agent: greeter
        instruction: "Welcome the user and explain what we do"
      - agent: account_setup
        instruction: "Guide the user through creating their profile"
      - agent: preferences
        instruction: "Ask about notification and privacy preferences"

guardrails

Global guardrail chains applied to every request. Per-agent guardrails can augment or override these.

guardrails.input

ShortInput guardrail rules.
DetailedApplied to the user's raw query before any agent processes it. Chains are evaluated in order; the first failing rule triggers its fail_action.
Default[]

guardrails.output

ShortOutput guardrail rules.
DetailedApplied to agent responses before delivery to the user.
Default[]
guardrails.input[].type / guardrails.output[].type
ShortGuardrail type name.
DetailedMust match a registered guardrail implementation. Built-ins include content_safety, pii_detection, prompt_injection, max_length, topic_restriction.
DefaultRequired
guardrails.input[].fail_action / guardrails.output[].fail_action
ShortAction on guardrail failure.
Defaultblock
Available valuesblock, warn, redact, log
  • block — Reject the message entirely.
  • warn — Allow but append a warning.
  • redact — Mask sensitive content.
  • log — Record the violation but take no action.
guardrails.input[].config / guardrails.output[].config
ShortGuardrail constructor kwargs.
DetailedPassed as keyword arguments to the guardrail class constructor. Schema depends on the guardrail type.
Default{}
guardrails:
  input:
    - type: content_safety
      fail_action: block
      config:
        threshold: 0.8
    - type: prompt_injection
      fail_action: block
  output:
    - type: pii_detection
      fail_action: redact

mcp_gateway

MCP gateway exposure configuration. Controls how Orchid exposes its own capabilities to upstream MCP hosts.

mcp_gateway.tools

ShortTool title/description overrides.
DetailedMap of canonical tool name -> override config. Used to customise how tools appear to upstream MCP hosts without changing the underlying tool implementation.
Default{}
mcp_gateway.tools[].title
ShortOverride title for the tool.
Defaultnull (keeps gateway default)
mcp_gateway.tools[].description
ShortOverride description for the tool.
Defaultnull (keeps gateway default)

mcp_gateway.prompts

ShortMCP prompt templates exposed by the gateway.
DetailedPre-canned prompts that upstream hosts can request. Each prompt has a handle, optional arguments, and a template body.
Default[]
mcp_gateway.prompts[].name
ShortUnique prompt handle.
DetailedMust match ^[a-zA-Z_][a-zA-Z0-9_-]*$. Used by upstream hosts to reference the prompt.
DefaultRequired
mcp_gateway.prompts[].title
ShortDisplay title.
Defaultnull
mcp_gateway.prompts[].description
ShortPrompt description.
Defaultnull
mcp_gateway.prompts[].arguments
ShortArguments accepted by the prompt.
Default[]
mcp_gateway.prompts[].arguments[].name
ShortArgument name.
DetailedMust match ^[a-zA-Z_][a-zA-Z0-9_-]*$.
DefaultRequired
mcp_gateway.prompts[].arguments[].description
ShortArgument description.
Defaultnull
mcp_gateway.prompts[].arguments[].required
ShortWhether the argument is required.
Defaultfalse
mcp_gateway.prompts[].template
ShortPrompt body template.
DetailedUses {{arg_name}} syntax for argument substitution. Rendered at request time with the provided arguments.
DefaultRequired
mcp_gateway:
  tools:
    orchid_ask:
      title: "Ask Orchid"
      description: "Send a question to the Orchid multi-agent system"
  prompts:
    - name: summarise_thread
      title: "Summarise Conversation"
      description: "Produces a bullet-point summary of the current chat thread"
      arguments:
        - name: max_points
          description: "Maximum bullet points"
          required: false
      template: |
        Summarise the following conversation in at most {{max_points}} bullet points.
        Focus on decisions made and action items.

events

Pollen + Bloom event-driven activation layer. When the events key is null or absent, the layer is disabled and adds zero overhead.

events.enabled

ShortMaster switch for the event layer.
DetailedWhen false, no producers, processors, queues, or schedulers are started. The event system has zero runtime cost when disabled.
Defaultfalse
Available valuestrue, false

Zero-cost when disabled

When events.enabled is false (or the events key is absent), no background tasks, threads, or connections are created, so the event system adds no runtime overhead.

events.store

ShortEvent storage backend.
DetailedRequired when enabled: true. Stores event state, trigger history, and schedule metadata.
Defaultnull
events.store.class
ShortDotted import path for the event store.
DefaultRequired when enabled: true
events.store.extra_args
ShortAdditional constructor kwargs.
Default{}

events.queue

ShortSignal queue backend configuration.
DetailedRequired when enabled: true. Buffers signals between producers and processors.
Defaultnull
events.queue.class
ShortDotted import path for the queue backend.
DefaultRequired when enabled: true
events.queue.notify_enabled
ShortEnable queue notifications.
Defaulttrue
events.queue.poll_interval_ms
ShortPoll interval in milliseconds.
Default200
Minimum10
events.queue.lease_seconds
ShortMessage lease duration.
DetailedHow long a processor has exclusive access to a message before it becomes available for re-processing.
Default30
Minimum1
events.queue.max_attempts
ShortMaximum processing attempts.
Default5
Minimum1
events.queue.dead_letter_table
ShortDead letter table name.
DetailedMessages that exceed max_attempts are moved to this table for later inspection.
Defaultsignal_queue_dead_letter
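
A sketch of a tuned queue block, shown as a fragment rather than a complete events configuration (store and processors are also required when enabled: true). The class path is the one used in the full events example in this reference; the numeric values are illustrative.

```yaml
events:
  queue:
    class: orchid_ai.events.queues.sqlite.SQLiteSignalQueue
    notify_enabled: true
    # Poll less often than the 200 ms default.
    poll_interval_ms: 500
    # A processor holds a message exclusively for 60 s before it
    # becomes available for re-processing.
    lease_seconds: 60
    # After 3 failed attempts the message moves to the dead letter table.
    max_attempts: 3
    dead_letter_table: signal_queue_dead_letter
```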

events.scheduler

ShortScheduler backend for cron-based triggers.
DetailedOptional. Required only if you use cron schedules. Typically an APScheduler wrapper.
Defaultnull
events.scheduler.class
ShortDotted import path for the scheduler.
DefaultRequired when schedules are used
events.scheduler.extra_args
ShortAdditional constructor kwargs.
Default{}

events.producers

ShortSignal producer configurations.
DetailedProducers emit signals into the queue. Each producer runs independently and may poll external systems (webhooks, message buses, file watchers).
Default[]
events.producers[].class
ShortDotted import path for the producer.
DefaultRequired
events.producers[].extra_args
ShortAdditional constructor kwargs.
Default{}

events.processors

ShortSignal processor configurations.
DetailedRequired when enabled: true. Workers that consume signals from the queue and execute triggers.
Default[]
events.processors[].class
ShortDotted import path for the processor.
DefaultRequired
events.processors[].concurrency
ShortWorker concurrency.
Default4
Minimum1
events.processors[].poll_interval_ms
ShortProcessor poll interval.
Default200
Minimum10
events.processors[].lease_seconds
ShortMessage lease duration.
Default30
Minimum1
events.processors[].max_attempts
ShortMaximum processing attempts.
Default5
Minimum1
events.processors[].drain_timeout_seconds
ShortDrain timeout on shutdown.
DetailedHow long to wait for in-flight messages to complete before force-stopping.
Default10.0
Minimum> 0
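
The fragment below sketches a processor entry with every documented setting written out. The class path comes from the full events example in this reference; the values are illustrative and this is not a complete events block.

```yaml
events:
  processors:
    - class: orchid_ai.events.processors.default.DefaultProcessor
      # Eight concurrent workers instead of the default four.
      concurrency: 8
      poll_interval_ms: 100
      lease_seconds: 60
      max_attempts: 3
      # On shutdown, wait up to 30 s for in-flight messages
      # before force-stopping.
      drain_timeout_seconds: 30.0
```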

events.middleware

ShortProcessing middleware.
DetailedApplied to signals before they reach processors. Can transform, filter, or enrich signals.
Default[]
events.middleware[].class
ShortDotted import path for the middleware.
DefaultRequired
events.middleware[].extra_args
ShortAdditional constructor kwargs.
Default{}

events.ingestion

ShortWebhook source registry.
DetailedDefines valid inbound webhook sources with validation rules.
Default{}
events.ingestion.sources
ShortRegistered webhook sources.
Default[]
events.ingestion.sources[].id
ShortUnique source identifier.
DefaultRequired
events.ingestion.sources[].validator
ShortValidator configuration.
DetailedValidates incoming webhook signatures (HMAC, bearer token, mTLS).
DefaultRequired
events.ingestion.sources[].validator.class
ShortDotted import path for the validator.
DefaultRequired
events.ingestion.sources[].validator.secret_ref
ShortSecret reference.
DetailedFor example, an HMAC key name or certificate thumbprint.
Defaultnull
events.ingestion.sources[].validator.extra_args
ShortAdditional constructor kwargs.
Default{}
events.ingestion.sources[].allowed_types
ShortSignal types accepted from this source.
DetailedEmpty list means all types are accepted.
Default[]
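
As a sketch of registering one webhook source, the fragment below uses only the keys documented above. The source ID, validator class path, secret name, and signal type names are all hypothetical; substitute a validator class that actually exists in your deployment.

```yaml
events:
  ingestion:
    sources:
      - id: github                                   # hypothetical source ID
        validator:
          # Hypothetical import path for an HMAC signature validator.
          class: myapp.events.validators.HmacValidator
          # Name of the secret holding the HMAC key (hypothetical).
          secret_ref: GITHUB_WEBHOOK_SECRET
          extra_args: {}
        # Only these signal types are accepted from this source;
        # an empty list would accept all types.
        allowed_types:
          - pull_request.opened
          - pull_request.closed
```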

events.schedules

ShortCron/interval schedule definitions.
Default[]
events.schedules[].id
ShortUnique schedule identifier.
DefaultRequired
events.schedules[].cron
ShortCron expression.
DetailedStandard 5-field cron: min hour day month dow. Mutually exclusive with interval_seconds.
Defaultnull
events.schedules[].interval_seconds
ShortInterval between runs in seconds.
DetailedMutually exclusive with cron. Must be > 0.
Defaultnull
Minimum> 0
events.schedules[].trigger_id
ShortTarget trigger ID.
DetailedMust reference a trigger defined in events.triggers with signal: cron.
DefaultRequired
events.schedules[].identity
ShortIdentity claim for scheduled runs.
DetailedDiscriminated union on mode: service_account, addressed_to_user, or act_as_user.
DefaultRequired
events.schedules[].enabled
ShortWhether this schedule is active.
Defaulttrue
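
A sketch of an interval-based schedule, as an alternative to a cron expression. The schedule ID and the sweep_inbox trigger are hypothetical; the trigger would need to be defined separately under events.triggers with signal: cron. The identity fields mirror the service_account form used in the full events example in this reference.

```yaml
events:
  schedules:
    - id: inbox_sweep             # hypothetical schedule ID
      # Run every 15 minutes. Mutually exclusive with cron.
      interval_seconds: 900
      # Must reference a trigger in events.triggers with signal: cron.
      trigger_id: sweep_inbox     # hypothetical trigger ID
      identity:
        mode: service_account
        name: scheduler
      enabled: true
```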

events.triggers

ShortTrigger definitions.
DetailedMap signals to agent activations. Each trigger has match conditions, emission configuration, and a retry policy.
Default[]
events.triggers[].id
ShortUnique trigger identifier.
DefaultRequired
events.triggers[].on
ShortMatch conditions.
DefaultRequired
events.triggers[].on.signal
ShortSignal name to match.
Detailed"cron" is reserved for time-driven triggers fired by schedules.
DefaultRequired
events.triggers[].on.cron
ShortCron expression for time-driven triggers.
DetailedRequired when signal == "cron". Rejected for non-cron signals.
Defaultnull
events.triggers[].on.when
ShortJMESPath boolean expression.
DetailedEvaluated against the signal envelope. Only matches when the expression returns true.
Defaultnull
events.triggers[].emits
ShortEmission configuration.
DefaultRequired
events.triggers[].emits.agent
ShortAgent to activate.
DefaultRequired
events.triggers[].emits.prompt_template
ShortPrompt template for the agent.
DetailedSent to the agent as the user query when the trigger fires. Can use template variables from the signal envelope.
DefaultRequired
events.triggers[].emits.identity
ShortIdentity claim.
DetailedDetermines who the trigger runs as. service_account runs as a system identity. addressed_to_user and act_as_user resolve a real user from the signal envelope.
DefaultRequired
events.triggers[].emits.respect_chat_binding
ShortRespect chat binding from signal.
DetailedWhen true, the trigger's output is appended to the chat specified in the signal envelope. Requires a non-service-account identity.
Defaultfalse
events.triggers[].emits.proactive_chat
ShortCreate a new chat for the user.
DetailedWhen true, a new chat session is created for the resolved user. Requires a non-service-account identity.
Defaultfalse
events.triggers[].emits.visibility
ShortVisibility override.
Detailedactor = visible to the triggering user only. addressed = visible to addressed users. tenant = visible to all users in the tenant. admin = admin-only. null = computed from identity mode.
Defaultnull
events.triggers[].retry
ShortRetry policy.
Default{}
events.triggers[].retry.max
ShortMaximum retry attempts.
Default0
Minimum0
events.triggers[].retry.backoff
ShortBackoff strategy.
Defaultexponential
Available valuesfixed, linear, exponential
events.triggers[].retry.jitter
ShortAdd jitter to backoff.
Defaulttrue
events.triggers[].retry.initial_delay_seconds
ShortInitial delay before first retry.
Default1.0
Minimum> 0
events.triggers[].retry.max_delay_seconds
ShortMaximum delay between retries.
Default300.0
Minimum> 0, must be >= initial_delay_seconds
events.triggers[].parallelism
ShortConcurrency scope.
DetailedControls how many concurrent executions of this trigger are allowed.
Defaultper_user
Available valuesper_user, per_tenant, unbounded
  • per_user — One concurrent execution per user.
  • per_tenant — One concurrent execution per tenant.
  • unbounded — No concurrency limit.
events:
  enabled: true
  store:
    class: orchid_ai.events.stores.sqlite.SQLiteEventStore
  queue:
    class: orchid_ai.events.queues.sqlite.SQLiteSignalQueue
  processors:
    - class: orchid_ai.events.processors.default.DefaultProcessor
      concurrency: 4
  schedules:
    - id: daily_digest
      cron: "0 7 * * 1-5"
      trigger_id: morning_briefing
      identity:
        mode: service_account
        name: scheduler
  triggers:
    - id: morning_briefing
      on:
        signal: cron
      emits:
        agent: digest
        prompt_template: "Generate the morning briefing for {{user.name}}"
        identity:
          mode: addressed_to_user
          service_account: scheduler
          user_id_from: signal.user_id
      retry:
        max: 3
        backoff: exponential
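
The fragment below sketches a signal-driven (non-cron) trigger that combines a when filter, a full retry policy, and a parallelism scope. The signal name, the envelope fields payload.total and payload.order_id, and the agent name are hypothetical, since the envelope schema is not described in this reference. The when expression uses standard JMESPath syntax, where numeric literals are written in backticks.

```yaml
events:
  triggers:
    - id: high_value_order                  # hypothetical trigger ID
      on:
        signal: order.created               # hypothetical signal name
        # JMESPath boolean expression evaluated against the signal
        # envelope; the field name is an assumption.
        when: "payload.total > `1000`"
      emits:
        agent: support
        prompt_template: "Review high-value order {{payload.order_id}}"
        identity:
          mode: service_account
          name: scheduler
        visibility: admin
      retry:
        max: 5
        backoff: exponential
        jitter: true
        initial_delay_seconds: 2.0
        # Must be >= initial_delay_seconds.
        max_delay_seconds: 120.0
      # At most one concurrent execution per tenant.
      parallelism: per_tenant
```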

Load Modes Summary

| Mode | Root File | Agent Configs | Detection |
|---|---|---|---|
| YAML | orchid.yml | agents.yaml | .yml or .yaml extension |
| MD | orchid.md | agents/*.md | .md extension |
| Hybrid | orchid.yml | agents/*.md | AGENTS_CONFIG_PATH points to a directory |

Hot-Reload (MD Only)

The on-demand config watcher detects file changes via SHA-256 hashing — no background threads, no fs-notify libraries.

  • OrchidConfigWatcher tracks orchid.md + agents/*.md by hash.
  • Orchid.reload_config() calls watcher.reload_if_changed() and rebuilds the graph.
  • Graph rebuild is serialised via asyncio.Lock — existing requests complete with the old config.
  • The API middleware polls at most every ORCHID_RELOAD_INTERVAL seconds (default 30, set to 0 to disable).
# Enable hot-reload with 10-second polling:
ORCHID_RELOAD_INTERVAL=10