- ---
- title: "Deployment to Production"
- description: "Step-by-step guide for deploying your agency in a production environment."
- icon: "rocket-launch"
- sidebarTitle: "Deploy"
- ---
- **Recommended:** Use the [Starter Template](/welcome/getting-started/starter-template) for production. It ships with FastAPI endpoints, auth, and a clean project layout.
- ## Required Environment Variables
- Before deploying, ensure these are set in your production environment:
- | Variable | Required | Description |
- |----------|----------|-------------|
- | `OPENAI_API_KEY` | Yes | Your OpenAI API key |
- | `APP_TOKEN` | Recommended | Authentication token for FastAPI endpoints |
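A startup check can catch a missing variable before the service begins accepting traffic. The sketch below is one possible approach; the `check_env` helper and its return shape are illustrative assumptions and are not part of Agency Swarm.

```python
from collections.abc import Mapping

REQUIRED = ("OPENAI_API_KEY",)  # marked "Yes" in the table above
RECOMMENDED = ("APP_TOKEN",)    # marked "Recommended" in the table above


def check_env(environ: Mapping[str, str]) -> tuple[list[str], list[str]]:
    """Return (missing_required, missing_recommended) variable names.

    Hypothetical helper: a variable set to an empty string counts as missing.
    """
    missing_required = [name for name in REQUIRED if not environ.get(name)]
    missing_recommended = [name for name in RECOMMENDED if not environ.get(name)]
    return missing_required, missing_recommended
```

At startup you could call `check_env(os.environ)`, log a warning for each name in the second list, and exit if the first list is not empty.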
- <Note>
- Thread persistence uses callbacks you define to store threads in any database you choose.
- </Note>
- <Note>
- This guide assumes you have already created an agency. If you haven't, check out the [Getting Started](/welcome/installation) guide.
- </Note>
- <Warning>
- Before deploying, ensure you have thoroughly tested all tools and agents. Run the test cases in each tool file and verify the agency works end-to-end using demo methods.
- </Warning>
- ## Deployment Process
- <Steps>
- <Step title="Step 1: Persist Conversation Threads" icon="message-dots">
- By default, every time you create a new `Agency()`, it starts a fresh conversation thread. In production, you usually need to resume prior conversations or handle multiple users.
- <Info>
- Persist the full conversation history for each chat, including user-facing turns and agent-to-agent handoffs.
- </Info>
- Chat persistence is handled through callback functions passed to the Agency constructor:
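```python
from agents import TResponseInputItem
from agency_swarm import Agency

# Placeholders: `agent1`, `agent2`, `chat_id`, `save_threads_to_db`, and
# `load_threads_from_db` come from your own application code.


def save_threads(messages: list[TResponseInputItem], chat_id: str) -> None:
    # Write the full message list for this chat to your database.
    save_threads_to_db(chat_id, messages)


def load_threads(chat_id: str) -> list[TResponseInputItem]:
    # Read the stored message list for this chat from your database.
    return load_threads_from_db(chat_id)


agency = Agency(
    agent1,
    agent2,
    communication_flows=[(agent1, agent2)],
    # The lambdas bind the current chat_id into each callback.
    load_threads_callback=lambda: load_threads(chat_id),
    save_threads_callback=lambda messages: save_threads(messages, chat_id),
)
```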
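The `save_threads_to_db` and `load_threads_from_db` helpers are left to you. As a minimal sketch, they could be backed by SQLite with one JSON-encoded row per chat. This assumes the message items are plain, JSON-serializable dictionaries; the table name, schema, and in-memory connection are illustrative choices, not something Agency Swarm prescribes.

```python
import json
import sqlite3
from typing import Any

# Assumption: an in-memory database for demonstration only.
# Use a file path or a managed database in production.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS threads ("
    "chat_id TEXT PRIMARY KEY, messages TEXT NOT NULL)"
)


def save_threads_to_db(chat_id: str, messages: list[dict[str, Any]]) -> None:
    """Overwrite the stored history for chat_id with the latest message list."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO threads (chat_id, messages) VALUES (?, ?)",
            (chat_id, json.dumps(messages)),
        )


def load_threads_from_db(chat_id: str) -> list[dict[str, Any]]:
    """Return the stored history for chat_id, or an empty list for a new chat."""
    row = conn.execute(
        "SELECT messages FROM threads WHERE chat_id = ?", (chat_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Storing the whole list in one row keeps each save a single write; a per-message table may suit you better if histories grow large.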
- <Warning>
- If you switch model providers for an existing saved chat, old tool/event items may no longer replay correctly. Start a new chat, or keep only `{role, content}` messages.
- </Warning>
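If you choose to keep only `role` and `content`, a filter along these lines could be applied to a saved history before replaying it. This is a rough sketch that assumes the saved items are plain dictionaries; `keep_role_content` is a hypothetical helper, and `content` values that are provider-specific structures may still need conversion.

```python
from typing import Any


def keep_role_content(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep items that carry both `role` and `content`, dropping all other keys.

    Items without both keys (for example tool calls or events) are skipped.
    """
    return [
        {"role": item["role"], "content": item["content"]}
        for item in items
        if "role" in item and "content" in item
    ]
```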
- </Step>
- <Step title="Step 2: Configure FastAPI Endpoints" icon="diagram-project">
- Use FastAPI in one of two ways:
- - Single agency: call `agency.run_fastapi(...)` from an `Agency` instance.
- - Multiple agencies and/or standalone tools: use top-level `run_fastapi(agencies=..., tools=[...])`.
- <Info>
- There can be multiple agencies in one server, and each agency key becomes its own endpoint prefix.
- </Info>
```python
from agency_swarm import Agency, Agent, function_tool, run_fastapi


@function_tool
def health_check() -> str:
    return "ok"


# Each factory receives the per-request thread loader and returns an Agency.
def create_support_agency(load_threads_callback=None):
    support = Agent(name="Support", instructions="You are a support agent.")
    return Agency(
        support,
        name="support",
        load_threads_callback=load_threads_callback,
    )


def create_sales_agency(load_threads_callback=None):
    sales = Agent(name="Sales", instructions="You are a sales agent.")
    return Agency(
        sales,
        name="sales",
        load_threads_callback=load_threads_callback,
    )


run_fastapi(
    # Each key becomes the endpoint prefix for that agency.
    agencies={
        "support": create_support_agency,
        "sales": create_sales_agency,
    },
    tools=[health_check],  # served under /tool/<tool name>
    app_token_env="APP_TOKEN",  # name of the env var that holds the auth token
    cors_origins=["https://your-app.example"],
)
```
- <Note>
- `run_fastapi(agencies=...)` injects `load_threads_callback` per request (for `chat_history`) and does not inject `save_threads_callback`.
- If you need server-side persistence writes, wire that explicitly in your application flow.
- </Note>
- This creates separate agency endpoints plus tool endpoints, for example:
- - `/support/get_response` and `/support/get_response_stream`
- - `/sales/get_response` and `/sales/get_response_stream`
- - `/tool/health_check`
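To illustrate how a client might target one of these prefixes, here is a hedged sketch that builds a request for `/support/get_response` using only the standard library. The `message` body field, the bearer-token header, and the `http://localhost:8000` base URL are assumptions made for illustration; confirm the actual request schema in the API Usage Example linked below. `build_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Any, Optional


def build_request(
    base_url: str,
    agency: str,
    message: str,
    token: str,
    chat_history: Optional[list[dict[str, Any]]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for <base_url>/<agency>/get_response."""
    payload: dict[str, Any] = {"message": message}  # assumed field name
    if chat_history is not None:
        payload["chat_history"] = chat_history
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{agency}/get_response",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_request("http://localhost:8000", "support", "Hello", "my-app-token")
```

Passing `req` to `urllib.request.urlopen` would send it; swapping `"support"` for `"sales"` targets the other agency.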
- FastAPI details:
- - [Setting Up FastAPI Endpoints](/additional-features/fastapi-integration#setting-up-fastapi-endpoints)
- - [Authentication](/additional-features/fastapi-integration#authentication)
- - [Implementation reference (multiple agencies and tools)](/additional-features/fastapi-integration#implementation-reference)
- - [API Usage Example](/additional-features/fastapi-integration#api-usage-example)
- If you need tools hosted separately from your agency service, expose tools as APIs and connect them with [OpenAPI schemas](/core-framework/tools/openapi-schemas), or use [MCP Integration](/core-framework/tools/mcp-integration).
- </Step>
- <Step title="Step 3: Deploy the Service" icon="rocket-launch">
- Use the [Starter Template](/welcome/getting-started/starter-template) as your production base. It already includes FastAPI wiring and deployment defaults.
- - Create a repo from the template
- - Set `OPENAI_API_KEY` and `APP_TOKEN`
- - Follow the template README to deploy
- If you are wiring your own server, see [FastAPI Integration](/additional-features/fastapi-integration) for endpoint and parameter details (`host`, `port`, `app_token_env`, `cors_origins`, `enable_agui`).
- </Step>
- </Steps>