---
title: "Deployment to Production"
description: "Step-by-step guide for deploying your agency in a production environment."
icon: "rocket-launch"
sidebarTitle: "Deploy"
---

**Recommended:** Use the [Starter Template](/welcome/getting-started/starter-template) for production. It ships with FastAPI endpoints, auth, and a clean project layout.

## Required Environment Variables

Before deploying, ensure these are set in your production environment:

| Variable | Required | Description |
|----------|----------|-------------|
| `OPENAI_API_KEY` | Yes | Your OpenAI API key |
| `APP_TOKEN` | Recommended | Authentication token for FastAPI endpoints |

Thread persistence uses callbacks you define to store threads in any database you choose.

This guide assumes you have already created an agency. If you haven't, check out the [Getting Started](/welcome/installation) guide.

Before deploying, ensure you have thoroughly tested all tools and agents. Run the test cases in each tool file and verify the agency works end-to-end using demo methods.

## Deployment Process

By default, every time you create a new `Agency()`, it starts a fresh conversation thread. In production, you usually need to resume prior conversations or handle multiple users. Persist the full conversation history for each chat, including user-facing turns and agent-to-agent handoffs.
Chat persistence is handled through callback functions passed to the `Agency` constructor:

```python
from agents import TResponseInputItem
from agency_swarm import Agency

# `save_threads_to_db` and `load_threads_from_db` are placeholders
# for your own storage layer.
def save_threads(messages: list[TResponseInputItem], chat_id: str) -> None:
    save_threads_to_db(chat_id, messages)

def load_threads(chat_id: str) -> list[TResponseInputItem]:
    return load_threads_from_db(chat_id)

# `agent1`, `agent2`, and `chat_id` are assumed to be defined elsewhere,
# for example `chat_id` taken from the incoming request.
agency = Agency(
    agent1,
    agent2,
    communication_flows=[(agent1, agent2)],
    load_threads_callback=lambda: load_threads(chat_id),
    save_threads_callback=lambda messages: save_threads(messages, chat_id),
)
```

If you switch model providers for an existing saved chat, old tool/event items may no longer replay correctly. Start a new chat, or keep only `{role, content}` messages.

Use FastAPI in one of two ways:

- Single agency: call `agency.run_fastapi(...)` from an `Agency` instance.
- Multiple agencies and/or standalone tools: use top-level `run_fastapi(agencies=..., tools=[...])`.

One server can host multiple agencies, and each agency key becomes its own endpoint prefix.

```python
from agency_swarm import Agency, Agent, function_tool, run_fastapi

# Standalone tool, exposed as its own endpoint.
@function_tool
def health_check() -> str:
    return "ok"

# Each factory accepts `load_threads_callback` so the server can pass it in.
def create_support_agency(load_threads_callback=None):
    support = Agent(name="Support", instructions="You are a support agent.")
    return Agency(
        support,
        name="support",
        load_threads_callback=load_threads_callback,
    )

def create_sales_agency(load_threads_callback=None):
    sales = Agent(name="Sales", instructions="You are a sales agent.")
    return Agency(
        sales,
        name="sales",
        load_threads_callback=load_threads_callback,
    )

run_fastapi(
    agencies={
        "support": create_support_agency,
        "sales": create_sales_agency,
    },
    tools=[health_check],
    app_token_env="APP_TOKEN",
    cors_origins=["https://your-app.example"],
)
```

`run_fastapi(agencies=...)` injects `load_threads_callback` per request (for `chat_history`) and does not inject `save_threads_callback`.
If you need server-side persistence writes, wire that explicitly in your application flow.

The `run_fastapi(...)` call above creates separate agency endpoints plus tool endpoints, for example:

- `/support/get_response` and `/support/get_response_stream`
- `/sales/get_response` and `/sales/get_response_stream`
- `/tool/health_check`

FastAPI details:

- [Setting Up FastAPI Endpoints](/additional-features/fastapi-integration#setting-up-fastapi-endpoints)
- [Authentication](/additional-features/fastapi-integration#authentication)
- [Implementation reference (multiple agencies and tools)](/additional-features/fastapi-integration#implementation-reference)
- [API Usage Example](/additional-features/fastapi-integration#api-usage-example)

If you need tools hosted separately from your agency service, expose tools as APIs and connect them with [OpenAPI schemas](/core-framework/tools/openapi-schemas), or use [MCP Integration](/core-framework/tools/mcp-integration).

Use the [Starter Template](/welcome/getting-started/starter-template) as your production base. It already includes FastAPI wiring and deployment defaults.

- Create a repo from the template
- Set `OPENAI_API_KEY` and `APP_TOKEN`
- Follow the template README to deploy

If you are wiring your own server, see [FastAPI Integration](/additional-features/fastapi-integration) for endpoint and parameter details (`host`, `port`, `app_token_env`, `cors_origins`, `enable_agui`).
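
The persistence callbacks described earlier on this page leave the storage layer up to you. As one possible starting point, here is a minimal sketch of a thread store built on Python's standard `sqlite3` module. The `ThreadStore` class, the `threads` table, and the `threads.db` path are hypothetical names chosen for illustration and are not part of Agency Swarm. The sketch also assumes that message items are plain JSON-serializable dicts and that the store is used from a single thread.

```python
import json
import sqlite3
from typing import Any


class ThreadStore:
    """Hypothetical helper: one row per chat, messages stored as a JSON string."""

    def __init__(self, path: str = "threads.db") -> None:
        # A single connection is kept for the store's lifetime (single-threaded use assumed).
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS threads ("
            "chat_id TEXT PRIMARY KEY, messages TEXT NOT NULL)"
        )
        self._conn.commit()

    def save(self, chat_id: str, messages: list[dict[str, Any]]) -> None:
        # Overwrite the stored history for this chat with the full message list.
        self._conn.execute(
            "INSERT OR REPLACE INTO threads (chat_id, messages) VALUES (?, ?)",
            (chat_id, json.dumps(messages)),
        )
        self._conn.commit()

    def load(self, chat_id: str) -> list[dict[str, Any]]:
        # Return the saved history, or an empty list for an unknown chat.
        row = self._conn.execute(
            "SELECT messages FROM threads WHERE chat_id = ?", (chat_id,)
        ).fetchone()
        return json.loads(row[0]) if row else []
```

With a store like this, a save callback could delegate to `store.save(chat_id, messages)` and a load callback to `store.load(chat_id)`. For deployments with several worker processes, a shared database server is likely a better fit than a local SQLite file.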