orchid

The core Python library: ABCs, agents, graph builder, RAG, and persistence.

orchid-ai is the distributable Python library at the heart of the Orchid framework. It provides the abstract base classes, the LangGraph graph builder, the hierarchical RAG pipeline, and pluggable chat persistence — everything your agents and consumers need, with no API server, no CLI, and no vendor-specific code. Deploy it anywhere Python runs.

Installation

pip install orchid-ai

# Backends are separate plugin packages — install what you need:
pip install orchid-storage-postgres   # PostgreSQL chat storage + checkpointer
pip install orchid-rag-qdrant         # Qdrant vector + doc store
pip install orchid-rag-chroma         # ChromaDB on-disk vector store
pip install orchid-rag-neo4j          # Neo4j graph store

Public API: two tiers

The library exposes two distinct surfaces — and Orchid is the front door.

Run — the single facade Orchid bootstraps config, wires the graph, and runs turns. orchid-api, orchid-cli, and in-process integrators all route through it:

from orchid_ai import Orchid

async with Orchid.from_config_path("orchid.yml") as orchid:
    result = await orchid.invoke("Hello!", user_id="alice", tenant_id="acme")
    print(result.response)

Orchid owns construction (from_config_path / from_md_config), execution (invoke / stream / resume), lifecycle (close / reload_config / async with), and read-only accessors (graph, runtime, config, chat_repo, …).

Extend — subclass the ABCs and call the register_* hooks exported at the top level (OrchidAgent, OrchidIdentityResolver, OrchidChatStorage, OrchidVectorReader, the guardrail / ingestion / retrieval contracts, …). These are import-time extension points: you subclass or register them before an Orchid instance exists, so they sit alongside the facade rather than behind it.

Lower-level plumbing (the build_* factories, concrete built-in backends, LangChain adapters, observability handlers) is intentionally not re-exported at the top level — import it from its submodule when you need it:

from orchid_ai.checkpointing import build_checkpointer
from orchid_ai.rag.factory import build_reader
from orchid_ai.persistence.sqlite import OrchidSQLiteChatStorage

Auth is execution context, not graph state. It travels in the LangGraph RunnableConfig (config["configurable"]["auth_context"]), never in OrchidAgentState, so it is never written to a checkpoint. Inside a custom agent, read it via self._current_auth; entry points attach it by calling with_auth(...).
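The separation between per-run config and checkpointed state can be illustrated with plain dictionaries. This is a minimal sketch, not the library's code: AuthContext, attach_auth, and read_auth are hypothetical stand-ins, and the checkpoint is modelled as a simple copy of the state.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class AuthContext:
    """Hypothetical stand-in for the library's auth context object."""

    user_id: str
    tenant_id: str
    bearer_token: str


def attach_auth(config: dict[str, Any], auth: AuthContext) -> dict[str, Any]:
    """Return a copy of a RunnableConfig-style dict that carries the auth context."""
    configurable = {**config.get("configurable", {}), "auth_context": auth}
    return {**config, "configurable": configurable}


def read_auth(config: dict[str, Any]) -> AuthContext | None:
    """Read the auth context back out of the per-run config, if present."""
    return config.get("configurable", {}).get("auth_context")


# Graph state holds only what is safe to persist.
state = {"messages": ["Hello!"]}

# Auth rides alongside the state in the per-run config.
config = attach_auth(
    {"configurable": {"thread_id": "t-1"}},
    AuthContext(user_id="alice", tenant_id="acme", bearer_token="secret-token"),
)

# A checkpoint is modelled here as a snapshot of the state only.
checkpoint = dict(state)
```

Because the token lives only in the config, nothing sensitive ends up in the snapshot, which is the property the paragraph above describes.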

Module map

`core/`

Pure Python abstractions with no external dependencies beyond the standard library and langchain-core. Every other module in the package depends on core/, never the reverse. This is where the ABCs live: OrchidAgent, OrchidIdentityResolver, OrchidMCPClient, and the vector-store interfaces. See Agents and Persistence.

`agents/`

GenericAgent — the concrete, YAML-driven agent — and its three collaborators: SkillDetector (skill matching), MCPDispatcher (MCP tool routing), and SkillExecutor (multi-step skill pipelines). Most deployments never need to subclass GenericAgent; the YAML surface covers the common case. See Agents and Tool Strategies.

`graph/`

LangGraph wiring: the build_graph() factory, the supervisor node, and state definitions. The graph is assembled once at startup from OrchidAgentsConfig and reused for every request. See Supervisor.

`config/`

YAML schema (OrchidAgentsConfig, OrchidAgentConfig, OrchidMCPServerConfig) validated by Pydantic, plus the tool-parameter registry. Use load_config(path) to load and validate agents.yaml. Also contains OrchidConfigStorage (the ABC for database-backed agent configs), the build_config_storage() factory, and the OrchidConfigStorageConfig schema.

`rag/`

Five-level hierarchical RAG: scoping (OrchidRAGScope), indexer, embedding factory, and pluggable backends. The library ships with null + in-memory backends; Qdrant, ChromaDB, and Neo4j backends are separate plugin packages that auto-register via entry points. All vector access goes through OrchidVectorReader / OrchidVectorWriter / OrchidVectorStoreAdmin — never through backend-specific imports outside rag/backends/. See RAG.
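Entry-point registration is a standard packaging mechanism, so plugin discovery can be sketched with the standard library alone. The group name "orchid.rag.backends" below is an assumption made for illustration; the real group is whatever the plugin packages declare.

```python
from importlib.metadata import entry_points

# Hypothetical group name; the real one is defined by the plugin packages.
BACKEND_GROUP = "orchid.rag.backends"


def discover_backends(group: str = BACKEND_GROUP) -> dict:
    """Map each advertised entry-point name to its EntryPoint (not yet imported)."""
    eps = entry_points()
    # Python 3.10+ exposes .select(); 3.8/3.9 return a plain dict of lists.
    found = eps.select(group=group) if hasattr(eps, "select") else eps.get(group, [])
    return {ep.name: ep for ep in found}


def load_backend(name: str, group: str = BACKEND_GROUP):
    """Import and return the object behind a single entry point."""
    backends = discover_backends(group)
    if name not in backends:
        raise LookupError(f"no backend named {name!r} in group {group!r}")
    return backends[name].load()
```

Discovery only reads installed package metadata; the backend module is imported when load() is called, so an installed but unused plugin costs nothing at startup.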

`persistence/`

OrchidChatStorage ABC for chat and message CRUD, with built-in SQLite (OrchidSQLiteChatStorage) — the default, single-file persistent store that requires no external services. PostgreSQL storage (OrchidPostgresChatStorage), the checkpointer, and the visibility fragment are provided by the separate orchid-storage-postgres plugin. Also contains the outbound MCP OAuth token stores and the inbound gateway-state stores used by orchid-mcp. See Persistence.

`documents/`

File parsing pipeline: PDF (PyMuPDF), DOCX, XLSX, CSV, and image (via vision LLM). All parsers follow the parse-once pattern — call extract_text() once and pass the result to both the prompt builder and ingest_document(pre_extracted_text=...). See Document Parsing.
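The parse-once pattern can be shown with stand-in functions. The names extract_text and ingest_document(pre_extracted_text=...) come from the text above, but the bodies and the remaining signatures here are assumptions for illustration, not the library's implementation.

```python
from __future__ import annotations

# Counts how often the (pretend) expensive extraction runs.
calls = {"extract": 0}


def extract_text(raw: bytes) -> str:
    """Stand-in parser: imagine an expensive PDF or DOCX extraction here."""
    calls["extract"] += 1
    return raw.decode("utf-8").strip()


def build_prompt(question: str, document_text: str) -> str:
    """Hypothetical prompt builder that embeds the document text."""
    return f"Document:\n{document_text}\n\nQuestion: {question}"


def ingest_document(raw: bytes, *, pre_extracted_text: str | None = None) -> list[str]:
    """Chunk text for indexing, reusing already-extracted text when it is given."""
    text = pre_extracted_text if pre_extracted_text is not None else extract_text(raw)
    return [text[i : i + 20] for i in range(0, len(text), 20)]


raw = b"Orchid parses each uploaded file exactly once.\n"

text = extract_text(raw)  # the only extraction
prompt = build_prompt("What does Orchid do?", text)
chunks = ingest_document(raw, pre_extracted_text=text)  # no second parse
```

Passing pre_extracted_text means the file is parsed a single time even though both the prompt and the index consume its contents.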

`mcp/`

StreamableHttpMCPClient — the concrete MCP client that implements the three auth modes (none, passthrough, oauth) and drives capability caching via OrchidSessionWarmer. See MCP.
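One way to picture the three auth modes is as a switch over how the Authorization header is built. The semantics below are assumptions, since the text only names the modes: none sends no header, passthrough forwards the caller's token, and oauth uses a token held in a store. AuthMode and build_headers are hypothetical names.

```python
from __future__ import annotations

from enum import Enum


class AuthMode(str, Enum):
    """The three mode names listed above; behaviour here is illustrative."""

    NONE = "none"
    PASSTHROUGH = "passthrough"
    OAUTH = "oauth"


def build_headers(
    mode: AuthMode,
    *,
    user_token: str | None = None,
    stored_token: str | None = None,
) -> dict[str, str]:
    """Build request headers for an MCP call under the given auth mode."""
    if mode is AuthMode.NONE:
        return {}
    # Assumed: passthrough forwards the caller's token, oauth uses a stored one.
    token = user_token if mode is AuthMode.PASSTHROUGH else stored_token
    if not token:
        raise ValueError(f"auth mode {mode.value!r} requires a token")
    return {"Authorization": f"Bearer {token}"}
```

Because AuthMode subclasses str, a value read from YAML such as "oauth" converts directly with AuthMode("oauth").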

Public ABCs

ABC	Module	Purpose
`OrchidAgent`	`core.agent`	Agent identity + `run()`, `summarise()`, `fetch_rag_context()`, `extract_conversation_history()`
`OrchidIdentityResolver`	`core.identity`	Bearer token → `OrchidAuthContext` (pluggable per deployment)
`OrchidMCPToolCaller`	`core.mcp`	Call MCP tools (narrow interface — tool handlers depend on this)
`OrchidMCPDiscoverable`	`core.mcp`	Discover MCP server capabilities (separate from tool-calling)
`OrchidVectorReader`	`core.repository`	Vector store retrieval (agents depend on this only)
`OrchidVectorWriter`	`core.repository`	Vector store indexing (indexers depend on this only)
`OrchidVectorStoreAdmin`	`core.repository`	Collection management (admin operations only)
`OrchidChatStorage`	`persistence.base`	Chat session + message CRUD
`OrchidConfigStorage`	`config.storage`	Database-backed agent config CRUD
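The split between reader and writer interfaces in the table follows the interface-segregation idea: each consumer depends only on the operations it uses. The sketch below uses hypothetical class and method names (VectorReader.search, VectorWriter.add) and synchronous calls for brevity; the library's actual signatures may differ.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class VectorReader(ABC):
    """Read-only view: the only interface an agent would need."""

    @abstractmethod
    def search(self, query: str, limit: int = 3) -> list[str]: ...


class VectorWriter(ABC):
    """Write-only view: the only interface an indexer would need."""

    @abstractmethod
    def add(self, text: str) -> None: ...


class InMemoryStore(VectorReader, VectorWriter):
    """One backend can implement several narrow interfaces at once."""

    def __init__(self) -> None:
        self._docs: list[str] = []

    def add(self, text: str) -> None:
        self._docs.append(text)

    def search(self, query: str, limit: int = 3) -> list[str]:
        # Naive keyword match standing in for real vector similarity.
        words = query.lower().split()
        return [d for d in self._docs if any(w in d.lower() for w in words)][:limit]


def answer(reader: VectorReader, query: str) -> list[str]:
    """Agent-side code is typed against the reader only, so it cannot write."""
    return reader.search(query)


store = InMemoryStore()
store.add("Blue widgets are in stock.")
store.add("Red widgets ship next week.")
hits = answer(store, "blue")
```

Typing answer() against VectorReader keeps write and admin operations out of reach of agent code, even though the same object implements them.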

Writing a custom agent

When YAML alone isn't enough, subclass OrchidAgent in your consumer project and reference it by dotted import path in agents.yaml:

from __future__ import annotations

from orchid_ai.core.agent import OrchidAgent
from orchid_ai.core.state import OrchidAgentState


class CatalogAgent(OrchidAgent):
    @property
    def name(self) -> str:
        return "catalog"

    @property
    def description(self) -> str:
        return "Searches the product catalog and answers inventory questions."

    async def run(self, state: OrchidAgentState) -> dict:
        query = self.extract_user_query(state)
        history = self.extract_conversation_history(state)

        # Retrieve relevant catalog chunks
        rag_context = await self.fetch_rag_context(query, scope=self._rag_scope(state))

        # Synthesise a response using the injected LLM
        response = await self.summarise(
            query,
            rag_data=rag_context,
            conversation_history=history,
        )
        return {"messages": [response]}

Wire it up in config:

agents:
  catalog:
    class: myproject.agents.catalog.CatalogAgent
    description: "Searches the product catalog and answers inventory questions."
    prompt: "You are a catalog specialist..."
    rag_namespace: catalog

The framework resolves the class at startup via importlib. The agent inherits summarise(), fetch_rag_context(), extract_user_query(), and extract_conversation_history() from the base class — override only what you need.
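Resolving a class from a dotted import path is a small importlib idiom. The helper below is a sketch of that general technique, not the framework's own resolver; resolve_class is a hypothetical name, and a standard-library class stands in for a real agent.

```python
import importlib
from collections import OrderedDict


def resolve_class(dotted_path: str) -> type:
    """Import 'package.module.ClassName' and return the class object."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_path)  # ImportError if missing
    obj = getattr(module, class_name)  # AttributeError if the name is absent
    if not isinstance(obj, type):
        raise TypeError(f"{dotted_path!r} does not refer to a class")
    return obj


# A standard-library class stands in for myproject.agents.catalog.CatalogAgent.
resolved = resolve_class("collections.OrderedDict")
```

Failing at startup with a clear error, rather than on the first request, is the main benefit of resolving every configured class eagerly.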

core/ has zero external dependencies

orchid_ai/core/ imports only the Python standard library and langchain-core (for Document and message types). Concrete implementations that depend on Qdrant, ChromaDB, asyncpg, or litellm live outside core/, in rag/backends/, persistence/, agents/, and orchid-cli/orchid_cli/rag/. This boundary is enforced: adding any other external import to core/ is an architectural bug.

LLM configuration

Orchid uses LangChain's BaseChatModel as its LLM abstraction. The build_chat_model(model_string) factory creates one from a LiteLLM-style model string, supporting OpenAI, Anthropic, Google Gemini, Groq, and Ollama out of the box. The model is configured per-agent:

agents:
  search:
    llm:
      model: "openai/gpt-4o-mini"
      temperature: 0.1

See Multi-LLM support for the full provider matrix.
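A LiteLLM-style model string has the shape "provider/model". The helper below sketches how such a string could be split before choosing a chat-model class; split_model_string is a hypothetical name, and requiring an explicit provider prefix is a simplification made for this sketch.

```python
from __future__ import annotations


def split_model_string(model: str) -> tuple[str, str]:
    """Split 'provider/model-name' into its two parts."""
    # Split on the first slash only, so model names may contain further slashes.
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name


provider, name = split_model_string("openai/gpt-4o-mini")
```

With the value from the YAML above, provider is "openai" and name is "gpt-4o-mini".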

Running tests

cd orchid
source .venv/bin/activate
pytest tests/ -x
ruff check orchid_ai/
ruff format orchid_ai/