---
title: "Deployment to Production"
description: "Step-by-step guide for deploying your agency in a production environment."
icon: "rocket-launch"
sidebarTitle: "Deploy"
---
**Recommended:** Use the [Starter Template](/welcome/getting-started/starter-template) for production. It ships with FastAPI endpoints, auth, and a clean project layout.
## Required Environment Variables
Before deploying, ensure these are set in your production environment:
| Variable | Required | Description |
|----------|----------|-------------|
| `OPENAI_API_KEY` | Yes | Your OpenAI API key |
| `APP_TOKEN` | Recommended | Authentication token for FastAPI endpoints |
Thread persistence uses callbacks you define to store threads in any database you choose.
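A startup check can catch a missing variable before the server accepts traffic. The following is a minimal sketch, not part of the framework: `check_environment` is a hypothetical helper, and the warning text assumes that endpoints run without authentication when `APP_TOKEN` is unset.

```python
import os


def check_environment(env=None):
    """Raise if a required variable is missing; return warnings for recommended ones.

    Hypothetical helper for illustration. Pass a dict to test without touching os.environ.
    """
    env = os.environ if env is None else env

    # OPENAI_API_KEY is marked "Required" in the table above.
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")

    warnings = []
    # APP_TOKEN is only "Recommended", so report it instead of failing.
    if not env.get("APP_TOKEN"):
        warnings.append("APP_TOKEN is not set; endpoints may be unauthenticated")
    return warnings
```

Calling `check_environment()` at the top of your entry point and logging the returned warnings keeps the failure close to its cause.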
This guide assumes you have already created an agency. If you haven't, check out the [Getting Started](/welcome/installation) guide.
Before deploying, test all tools and agents thoroughly. Run the test cases in each tool file, then verify that the agency works end-to-end with the demo methods.
## Deployment Process
By default, every time you create a new `Agency()`, it starts a fresh conversation thread. In production, you usually need to resume prior conversations or handle multiple users.
Persist the full conversation history for each chat, including user-facing turns and agent-to-agent handoffs.
Chat persistence is handled through callback functions passed to the Agency constructor:
```python
from agents import TResponseInputItem
from agency_swarm import Agency

# save_threads_to_db and load_threads_from_db are your own storage helpers.
# chat_id identifies the current conversation and must be in scope here.

def save_threads(messages: list[TResponseInputItem], chat_id: str) -> None:
    save_threads_to_db(chat_id, messages)

def load_threads(chat_id: str) -> list[TResponseInputItem]:
    return load_threads_from_db(chat_id)

agency = Agency(
    agent1,
    agent2,
    communication_flows=[(agent1, agent2)],
    load_threads_callback=lambda: load_threads(chat_id),
    save_threads_callback=lambda messages: save_threads(messages, chat_id),
)
```
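The `save_threads_to_db` and `load_threads_from_db` helpers above are left for you to implement. One possible sketch uses SQLite and stores each chat as a single JSON blob. The table name, schema, and in-memory connection are assumptions for illustration, and the code assumes the message items are JSON-serializable dictionaries.

```python
import json
import sqlite3

# Illustrative storage only: use a file path or a managed database in production.
# A single shared connection is fine for a demo but not for a multi-threaded server.
_conn = sqlite3.connect(":memory:")
_conn.execute(
    "CREATE TABLE IF NOT EXISTS threads (chat_id TEXT PRIMARY KEY, messages TEXT NOT NULL)"
)


def save_threads_to_db(chat_id, messages):
    """Overwrite the stored history for chat_id with the full message list."""
    _conn.execute(
        "INSERT OR REPLACE INTO threads (chat_id, messages) VALUES (?, ?)",
        (chat_id, json.dumps(messages)),
    )
    _conn.commit()


def load_threads_from_db(chat_id):
    """Return the stored message list, or an empty list for an unknown chat."""
    row = _conn.execute(
        "SELECT messages FROM threads WHERE chat_id = ?", (chat_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Storing the whole history per save keeps the callback simple, at the cost of rewriting the row on every turn.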
If you switch model providers for an existing saved chat, old tool/event items may no longer replay correctly. Start a new chat, or keep only `{role, content}` messages.
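If you choose to keep only the plain messages, a small filter over the saved history is enough. This sketch assumes that plain messages are dictionaries carrying both `role` and `content` keys and that tool/event items lack at least one of them. `keep_plain_messages` is a hypothetical name, so check the rule against your own stored data.

```python
def keep_plain_messages(messages):
    """Return only the items that look like {role, content} messages.

    Assumption: tool calls and other event items do not carry both keys.
    Items that pass the check are returned unchanged.
    """
    return [
        item
        for item in messages
        if isinstance(item, dict) and "role" in item and "content" in item
    ]
```

Run it once over a loaded history before handing that history to an agency backed by the new provider.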
Use FastAPI in one of two ways:
- Single agency: call `agency.run_fastapi(...)` from an `Agency` instance.
- Multiple agencies and/or standalone tools: use top-level `run_fastapi(agencies=..., tools=[...])`.
One server can host multiple agencies. Each key in the `agencies` dictionary becomes that agency's endpoint prefix.
```python
from agency_swarm import Agency, Agent, function_tool, run_fastapi

@function_tool
def health_check() -> str:
    return "ok"

def create_support_agency(load_threads_callback=None):
    support = Agent(name="Support", instructions="You are a support agent.")
    return Agency(
        support,
        name="support",
        load_threads_callback=load_threads_callback,
    )

def create_sales_agency(load_threads_callback=None):
    sales = Agent(name="Sales", instructions="You are a sales agent.")
    return Agency(
        sales,
        name="sales",
        load_threads_callback=load_threads_callback,
    )

run_fastapi(
    agencies={
        "support": create_support_agency,
        "sales": create_sales_agency,
    },
    tools=[health_check],
    app_token_env="APP_TOKEN",
    cors_origins=["https://your-app.example"],
)
```
`run_fastapi(agencies=...)` injects `load_threads_callback` on each request (to supply `chat_history`) but does not inject `save_threads_callback`.
If you need the server to write conversation history to storage, wire that explicitly in your application flow.
This creates separate agency endpoints plus tool endpoints, for example:
- `/support/get_response` and `/support/get_response_stream`
- `/sales/get_response` and `/sales/get_response_stream`
- `/tool/health_check`
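To illustrate how a client might call one of these endpoints, here is a sketch that builds the HTTP request with the standard library. The `message` field name and the `Authorization: Bearer` scheme are assumptions, not confirmed by this page, so verify them against the API Usage Example linked below. `build_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_request(base_url, agency, message, token=None, chat_history=None):
    """Build a POST request for /<agency>/get_response without sending it."""
    payload = {"message": message}  # assumed field name
    if chat_history is not None:
        payload["chat_history"] = chat_history

    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # assumed auth scheme

    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{agency}/get_response",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(...)` and decode the JSON body of the response.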
FastAPI details:
- [Setting Up FastAPI Endpoints](/additional-features/fastapi-integration#setting-up-fastapi-endpoints)
- [Authentication](/additional-features/fastapi-integration#authentication)
- [Implementation reference (multiple agencies and tools)](/additional-features/fastapi-integration#implementation-reference)
- [API Usage Example](/additional-features/fastapi-integration#api-usage-example)
If you need tools hosted separately from your agency service, expose tools as APIs and connect them with [OpenAPI schemas](/core-framework/tools/openapi-schemas), or use [MCP Integration](/core-framework/tools/mcp-integration).
Use the [Starter Template](/welcome/getting-started/starter-template) as your production base. It already includes FastAPI wiring and deployment defaults.
- Create a repo from the template
- Set `OPENAI_API_KEY` and `APP_TOKEN`
- Follow the template README to deploy
If you are wiring your own server, see [FastAPI Integration](/additional-features/fastapi-integration) for endpoint and parameter details (`host`, `port`, `app_token_env`, `cors_origins`, `enable_agui`).