# 大模型全生命周期管理平台

# PilotDeck
OpenBMB开源一站式大模型开发部署平台，全流程工具链助力大模型快速落地

## （一）项目简介
### 核心定位
本项目是OpenBMB推出的生产级大模型全生命周期管理平台，为开发者和企业提供从模型训练、微调、推理部署到性能监控、资源调度的一站式解决方案，解决大模型开发门槛高、部署复杂、运维困难等核心痛点，帮助企业零门槛构建和部署自己的大模型应用。

### 解决的痛点
- 大模型开发需要深厚的AI技术积累，普通开发者和中小企业难以快速上手
- 多模型管理混乱，缺乏统一的调度、监控和版本管理机制
- 部署流程繁琐，资源利用率低，推理性能优化难度大，运维成本高
- 不同框架和模型之间兼容性差，模型迁移和复用困难

### 核心优势
- **全流程一体化**：覆盖模型训练、参数高效微调、推理部署、性能监控、资源调度全环节，一个平台搞定大模型应用开发全流程
- **低代码可视化操作**：提供直观的Web管理界面，无需复杂编码，通过拖拽和配置即可完成模型部署和服务发布
- **全主流模型兼容**：原生支持Llama 2/3、Qwen 1.5/2、ChatGLM、Baichuan等数十种主流开源大模型，自动适配不同模型格式
- **高性能推理优化**：内置模型量化、剪枝、分布式推理、批处理等优化技术，推理速度提升3-10倍，大幅降低部署成本
- **企业级生产能力**：支持多租户隔离、细粒度权限管理、弹性扩缩容、日志审计和故障自动恢复，满足企业级生产环境需求

## （二）环境前置要求
- **操作系统**：Ubuntu 20.04+/CentOS 8+/Debian 11+（推荐Linux系统）
- **Python版本**：Python 3.9 - 3.11
- **软件依赖**：Git、Docker 20.10+、Docker Compose 2.0+、NVIDIA Container Toolkit（GPU环境必需）
- **硬件要求**：
  - 推荐配置：8核CPU + 32GB内存 + NVIDIA GPU（显存≥16GB，支持CUDA 11.8+）
  - 最低配置：4核CPU + 16GB内存（仅用于CPU推理和轻量级模型部署）

## （三）快速开始 / 安装部署
### 方式一：Docker Compose一键部署（推荐生产环境）
```bash
# 克隆仓库
git clone https://github.com/OpenBMB/PilotDeck.git
cd PilotDeck

# 复制并修改环境配置文件
cp .env.example .env
# 编辑.env文件，配置数据库、GPU资源和模型存储路径

# 启动所有服务
docker compose up -d
```
服务启动后，访问 `http://你的服务器IP:8000` 即可进入管理后台，默认账号密码为 `admin/admin123`。

### 方式二：源码本地部署（开发测试）
```bash
# 克隆仓库
git clone https://github.com/OpenBMB/PilotDeck.git
cd PilotDeck

# 创建并激活虚拟环境
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 安装依赖
pip install -r requirements.txt

# 初始化数据库
python init_db.py

# 启动前端和后端服务
npm install && npm run build
python main.py
```
本地访问 `http://localhost:8000` 即可使用。

## （四）基础使用示例
### 1. 导入模型
1. 登录管理后台，进入"模型管理"页面
2. 点击"导入模型"，选择模型来源（本地文件、Hugging Face、ModelScope）
3. 输入模型名称和版本，选择模型类型，点击"开始导入"
4. 等待模型导入完成后，即可在模型列表中查看

### 2. 创建微调任务
1. 进入"微调管理"页面，点击"新建微调任务"
2. 选择基础模型，上传训练数据集（支持JSON、CSV格式）
3. 配置微调参数（学习率、批次大小、训练轮数等）
4. 点击"开始训练"，系统自动执行微调任务，可实时查看训练进度和损失曲线

### 3. 部署推理服务
1. 进入"服务部署"页面，点击"新建服务"
2. 选择要部署的模型和版本，配置资源配额和并发数
3. 选择部署方式（单实例、分布式），点击"部署"
4. 等待服务启动完成后，系统会自动生成API调用地址和密钥

### 4. 调用推理API
```bash
curl -X POST http://你的服务器IP:8000/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer 你的API密钥" \
  -d '{
    "model": "qwen-7b-chat",
    "messages": [{"role": "user", "content": "你好，请介绍一下自己"}],
    "temperature": 0.7,
    "max_tokens": 512
  }'
```

## （五）开源许可证
本项目采用 **Apache License 2.0** 开源许可证，详细条款请参考项目根目录下的 LICENSE 文件。

<p align="center">
  <img src="assets/banner.png" alt="PilotDeck" width="680"/>
</p>

<p align="center">
  Task-oriented AI Agent productivity platform — redefining operational boundaries and memory evolution, one WorkSpace at a time.
</p>

<p align="center">
  <a href="https://pilotdeck.openbmb.cn"><img src="https://img.shields.io/badge/Website-pilotdeck.openbmb.cn-FF6B35?style=flat-square&logo=googlechrome&logoColor=white" alt="Official Website"/></a>
  <a href="https://pilotdeck.openbmb.cn/pilotdeck.github.io/demo/p/pilotdeck-demo"><img src="https://img.shields.io/badge/Demo-Live-brightgreen?style=flat-square" alt="Live Demo"/></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-AGPL_3.0-blue.svg?style=flat-square" alt="License"/></a>
  <a href="https://modelcontextprotocol.io/"><img src="https://img.shields.io/badge/MCP-Native-6366F1?style=flat-square" alt="MCP Native"/></a>
  <a href="https://github.com/OpenBMB/PilotDeck/stargazers"><img src="https://img.shields.io/github/stars/OpenBMB/PilotDeck?style=flat-square" alt="Stars"/></a>
  <br/>
  <a href="#-community"><img src="https://img.shields.io/badge/Discord-Join_Community-5865F2?style=for-the-badge&logo=discord&logoColor=white" alt="Discord"/></a>
  &nbsp;
  <a href="#-community"><img src="https://img.shields.io/badge/Feishu-Community-00D6B9?style=for-the-badge&logo=bytedance&logoColor=white" alt="Feishu"/></a>
  &nbsp;
  <a href="#-community"><img src="https://img.shields.io/badge/WeChat-Community-07C160?style=for-the-badge&logo=wechat&logoColor=white" alt="WeChat"/></a>
  <br/>
</p>

<p align="center">
  <b>English</b> | <a href="./README.zh.md">简体中文</a>
  <br/>
  <a href="https://pilotdeck.openbmb.cn">Website</a> · <a href="https://pilotdeck.openbmb.cn/pilotdeck.github.io/demo/p/pilotdeck-demo">Live Demo</a> · <a href="https://pilotdeck.openbmb.cn/pilotdeck.github.io/docs/en/introduction">Tutorial</a> · <a href="#-installation--quick-start">Quick Start</a> · <a href="#-key-highlights">Highlights</a> · <a href="#use-cases">Use Cases</a> · <a href="#-community">Community</a>
</p>

---

**News** 🔥

- **[2026.05.28]** PilotDeck is now open source! Visit our official website at [pilotdeck.openbmb.cn](https://pilotdeck.openbmb.cn). We welcome contributions, feedback, and stars from the community.

---

## 💡 About PilotDeck

**PilotDeck** is an open-source agent operating system designed around the concept of "WorkSpace". It is jointly developed and open-sourced by Tsinghua University [THUNLP](https://nlp.csai.tsinghua.edu.cn/), [ModelBest](https://modelbest.cn/), [OpenBMB](https://www.openbmb.cn/), and [AI9Stars](https://github.com/AI9Stars). Targeting general-purpose, multi-task scenarios, PilotDeck is built to be a true *productivity tool* for the Agent era.

A wave of excellent AI Agent harnesses has emerged in recent years, each with its own focus: **Claude Code / Cursor / Trae Solo** brought model reasoning deep into the programming IDE; **Claude Cowork** introduced the notion of project-level isolation to desktop-side knowledge work; **WorkBuddy** connected agents to IM ecosystems such as WeCom and Feishu so AI is one message away.

When we shift the lens from "one-shot programming" or "immediate Q&A" to **long-running, multi-project productivity work**, however, several questions remain open:

- When many projects run in parallel, can memory be **white-box and traceable**? When the AI gets something wrong, can you pinpoint which memory entry caused it and edit it directly — without starting a new chat from scratch?
- Can token cost be **tracked per task**, so that running agents in the background actually becomes economically viable?
- Can tasks of different difficulty **automatically be matched to different models**, instead of burning the flagship model on trivial calls?
- When you step away from the keyboard, can the work keep moving? Can the agent **proactively discover what's worth doing, report progress, and land results as files on disk**?

PilotDeck is an incremental exploration around exactly these questions. It uses the WorkSpace as the fundamental unit — completely isolating files, memory and skills per project — and pairs it with three pillar capabilities: **White-box Memory**, **Smart Routing** and **Always-on**. The entire system natively supports the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) and behaves consistently across front-ends (Web / CLI / IM).

### ✨ Key Highlights

<table width="100%">
<tr>
<td width="50%" valign="top">

**WorkSpace-Level Isolation & Accretion**

Every project gets its own file system, memory store and skill set. Parallel work no longer interferes with itself, retrieval has a bounded scope, and skills accrete naturally as each task grows — no more global context pollution.

<p align="center">
  <img src="assets/workspace_en.gif" width="100%" alt="WorkSpace isolation demo"/>
</p>

</td>
<td width="50%" valign="top">

**Traceable White-box Memory**

Memory generation, extraction, storage and retrieval are visible end-to-end. When the AI mis-remembers, you can pinpoint and fix the offending entry. Built-in **Dream Mode** consolidates memory in idle windows, and supports one-click rollback.

<p align="center">
  <img src="assets/memory_en.gif" width="100%" alt="White-box memory demo"/>
</p>

</td>
</tr>
<tr>
<td width="50%" valign="top">

**Smart Routing & Cost Optimization**

Task difficulty is auto-detected; complex calls go to flagship models (e.g. Claude 3.5 Sonnet / GPT-4o), simple ones drop to lighter models. Through on-device / cloud co-orchestration and precise matching, token spend shrinks dramatically without sacrificing quality.

<p align="center">
  <img src="assets/router.gif" width="100%" alt="Smart routing demo"/>
</p>

</td>
<td width="50%" valign="top">

**Always-on Background Execution**

PilotDeck breaks the "you ask, it answers" loop: after you sign off, the agent keeps discovering candidate tasks, running long-horizon monitors, and finally lands deliverables as local files with a summary report waiting for you.

<p align="center">
  <img src="assets/awo_en.gif" width="100%" alt="Always-on execution demo"/>
</p>

</td>
</tr>
</table>

### 📊 Real-world Numbers

The three pillar capabilities have shown clear advantages in production-grade workflows:

#### 1. Smart Routing — ~70% cost savings on social-media workloads

In Xiaohongshu-style social-media operations, enabling Smart Routing automatically demotes simple polishing / layout tasks to a sub-agent (e.g. Sonnet 4.5) and only invokes Opus 4.5 at planning checkpoints:

<table width="100%">
<tr>
<th width="22%" align="left">Setup</th>
<th width="48%" align="left">Model configuration</th>
<th width="15%" align="left">Cost</th>
<th width="15%" align="left">Multiplier</th>
</tr>
<tr>
<td><b>Smart Routing ON</b></td>
<td>Opus 4.5 (main) + Sonnet 4.5 (sub)</td>
<td><b>$2.83</b></td>
<td><b>1.1×</b></td>
</tr>
<tr>
<td>Smart Routing OFF</td>
<td>All Opus 4.5 (main + sub)</td>
<td>$12.58</td>
<td>5.0×</td>
</tr>
<tr>
<td>Monolithic</td>
<td>Single Opus 4.5 long-react (estimated)</td>
<td>$12.20</td>
<td>4.8×</td>
</tr>
<tr>
<td colspan="4"><img width="840" height="1" alt=""/></td>
</tr>
</table>

#### 2. Smart Routing — 1/6 the cost while beating frontier models on hard tasks

The research team benchmarked 7 complex tasks (multilingual podcast push, multi-source data reports, domain-specific literature review, codebase architecture docs, etc.). The "strong main + light sub" routing setup matches or beats the frontier single-model setup at a fraction of the cost:

<table width="100%">
<tr>
<th width="70%" align="left">Setting</th>
<th width="15%" align="left">Score</th>
<th width="15%" align="left">Cost</th>
</tr>
<tr>
<td>MiniMax-M2.7 single-agent</td>
<td>37.1</td>
<td>$1.90</td>
</tr>
<tr>
<td>Claude Sonnet 4.6 single-agent</td>
<td>69.1</td>
<td>$18.36</td>
</tr>
<tr>
<td><b>Sonnet 4.6 (main) + MiniMax-M2.7 (sub)</b></td>
<td><b>70.6</b></td>
<td><b>$3.15</b></td>
</tr>
<tr>
<td colspan="3"><img width="840" height="1" alt=""/></td>
</tr>
</table>

#### 3. White-box Memory — layout & tone never bleed across projects

In black-box agents, mixing tasks in a shared context pool inevitably pollutes memory. PilotDeck's WorkSpace-scoped white-box memory addresses this end-to-end:

<table width="100%">
<thead>
<tr>
  <th width="14%" align="left">Dimension</th>
  <th width="41%" align="left">Current AI Agents (black-box)</th>
  <th width="45%" align="left">PilotDeck (white-box)</th>
</tr>
</thead>
<tbody>
<tr>
  <td><b>Visibility</b></td>
  <td>You can't see what the AI remembers, only what it outputs</td>
  <td>View every memory entry: what was stored, when, and which WorkSpace</td>
</tr>
<tr>
  <td><b>Control</b></td>
  <td>Once written, memory can't be edited or removed</td>
  <td>Edit / delete entries, pin critical decisions so they don't drift</td>
</tr>
<tr>
  <td><b>Traceability</b></td>
  <td>When it goes wrong, you can't find the root cause</td>
  <td>Generation → extraction → storage → retrieval, all auditable</td>
</tr>
<tr>
  <td><b>Isolation</b></td>
  <td>One shared pool — projects bleed into each other</td>
  <td>Scoped per WorkSpace; A's memory never reaches B</td>
</tr>
<tr>
  <td><b>Reversible</b></td>
  <td>After compression, the original is gone</td>
  <td>Dream-mode supports <b>one-click rollback</b> to the prior state</td>
</tr>
</tbody>
</table>

---

## 🖥️ UI & Demo

PilotDeck ships an out-of-the-box Web UI with full WorkSpace management, white-box memory editing, and visualization of multi-agent collaboration.

### Use Cases

> All demos below are generated entirely by edge-side models via PilotDeck's Smart Routing — no cloud-side frontier model required.

#### Work Document Generation

> *"Survey the Chinese LLM application market and turn it into a formal HTML white paper."*

<table width="100%">
<tr>
<td width="50%" align="center"><b>Process</b></td>
<td width="50%" align="center"><b>Result</b></td>
</tr>
<tr>
<td><img src="assets/en/ppt_en.gif" width="100%"/></td>
<td><img src="assets/result/ppt_result_en.gif" width="100%"/></td>
</tr>
</table>

#### Mini-Game Development

> *"Walk me through building an iOS AR mini-game Ball Finder in Vibe Coding mode."*

<table width="100%">
<tr>
<td width="50%" align="center"><b>Process</b></td>
<td width="50%" align="center"><b>Result</b></td>
</tr>
<tr>
<td><img src="assets/en/iosgame_en.gif" width="100%"/></td>
<td align="center"><img src="assets/result/ios_game_result.gif" width="60%"/></td>
</tr>
</table>

#### AI Engineering Platform Development

> *"Build a low-code embedding fine-tuning platform from scratch."*

<table width="100%">
<tr>
<td width="50%" align="center"><b>Process</b></td>
<td width="50%" align="center"><b>Result</b></td>
</tr>
<tr>
<td><img src="assets/en/modeltraining_en.gif" width="100%"/></td>
<td><img src="assets/result/modeltrainingresult_en.gif" width="100%"/></td>
</tr>
</table>

#### Audio-Video Editing & Social Media Operations

> *"Push this English podcast to a global audience in Chinese / Japanese / French / Korean / Spanish / Arabic."*

<table width="100%">
<tr>
<td width="50%" align="center"><b>Process</b></td>
<td width="50%" align="center"><b>Result (with audio)</b></td>
</tr>
<tr>
<td><img src="assets/en/podcast_en.gif" width="100%"/></td>
<td>

https://github.com/user-attachments/assets/a7245467-ee3c-4939-a055-c56576ac56d1

</td>
</tr>
</table>

---

## 📦 Installation & Quick Start

We provide a one-line installer for macOS / Linux, plus a source-based workflow for developers.

### Option A: One-line install (recommended, macOS / Linux)

```bash
curl -fsSL https://raw.githubusercontent.com/OpenBMB/PilotDeck/main/install.sh | bash
```

The script auto-installs Node.js 22, clones the repo, installs dependencies, and builds the frontend. Once it finishes:

```bash
pilotdeck            # starts the server at http://localhost:3001
pilotdeck status     # check runtime status
```

### Option B: From source (for developers)

**1. Clone and install dependencies**

> This repo uses [Git LFS](https://git-lfs.com/) for large media assets. Make sure `git lfs` is installed before cloning.
> If you don't need the demo videos/GIFs, add `GIT_LFS_SKIP_SMUDGE=1` before `git clone` to skip downloading them.

```bash
git clone https://github.com/OpenBMB/PilotDeck.git
cd PilotDeck

npm install              # root deps (Gateway runtime)
cd ui && npm install     # UI deps
cd ..
```

**2. Configure a model provider**

PilotDeck reads `~/.pilotdeck/pilotdeck.yaml`. You can create it manually, let the bootstrap script generate one, **or just open the Web UI and configure providers visually in the settings panel.**
Supported protocols include OpenAI, Anthropic, DeepSeek, Qwen, Kimi, MiniMax and other OpenAI-compatible endpoints.

```yaml
schemaVersion: 1
agent:
  model: deepseek/deepseek-v4-pro
model:
  providers:
    deepseek:
      protocol: openai
      url: https://api.deepseek.com/v1
      apiKey: sk-your-api-key
```

**3. Start the services**

```bash
cd ui && npm run dev     # dev mode (HMR), visit http://localhost:5173
# or
cd ui && npm run start   # production mode, visit http://localhost:3001
```

### Option C: Docker Compose

If Docker is installed, you can start PilotDeck with:

```bash
docker compose up -d
```

---

## 🛠️ Extension Protocol

PilotDeck has an open plugin architecture with a strict boundary between the open-source core and plugin customization. Extending the system is a `plugin.json` away:

- **MCP Servers** — first-class integration with any Model Context Protocol server.
- **Tools & Skills** — register custom tools, or pull community skills via [ClawHub](https://www.npmjs.com/package/clawhub).
- **Lifecycle Hooks** — intercept `PreToolUse`, `UserPromptSubmit`, and other critical lifecycle events.
- **Custom Memory** — plug in your own memory store provider.

---

## 🤝 Contributing

Thanks to everyone who has contributed code, feedback, and ideas. New contributors are warmly welcome — let's build the next-gen agent OS together.

Workflow: **Fork → feature branch → PR**.

---

## 💬 Community

- For bugs and feature requests, please open a [GitHub Issue](https://github.com/OpenBMB/PilotDeck/issues).
- Join our community channels:

<table width="100%">
<tr>
<td width="33%" align="center"><b>WeChat Community</b></td>
<td width="33%" align="center"><b>Feishu Community</b></td>
<td width="33%" align="center"><b>Discord Community</b></td>
</tr>
<tr>
<td align="center"><img src="assets/community/qr-wechat.png" width="200" alt="WeChat QR"/></td>
<td align="center"><img src="assets/community/qr-feishu.png" width="200" alt="Feishu QR"/></td>
<td align="center"><img src="assets/community/qr-discord.png" width="200" alt="Discord QR"/></td>
</tr>
</table>

---

## 🙏 Acknowledgements

We thank Agent OS pioneers such as OpenClaw, Claude Code, Codex, Cursor, and Hermes for their explorations that helped shape this field.

PilotDeck builds upon the following outstanding open-source projects:

- [ClawXRouter](https://github.com/OpenBMB/ClawXRouter) — Intelligent model routing
- [ClawXMemory](https://github.com/OpenBMB/ClawXMemory) — Agent memory system
- [Claude Code UI](https://github.com/siteboon/claudecodeui) — Web UI reference
- [Claude Code Router](https://github.com/musistudio/claude-code-router) — Model routing reference
- [UltraRAG](https://github.com/OpenBMB/UltraRAG) — RAG framework
- [Anthropic Skills](https://github.com/anthropics/skills) — Agent skill framework and built-in skills (skill-creator)
- [Vercel Labs Skills](https://github.com/vercel-labs/skills) — find-skills skill
- [MiniMax-AI Skills](https://github.com/MiniMax-AI/skills) — minimax-pdf skill
- [frontend-slides](https://github.com/zarazhangrui/frontend-slides) — Create beautiful slides on the web using a coding agent's frontend skills
- [Karpathy Guidelines](https://x.com/karpathy/status/2015883857489522876) — LLM coding behavioral guidelines
- [Vite](https://github.com/vitejs/vite) — Frontend build tool
- [React](https://github.com/facebook/react) — UI framework
- [Tailwind CSS](https://github.com/tailwindlabs/tailwindcss) — Utility-first CSS framework
- [shadcn/ui](https://github.com/shadcn-ui/ui) — Accessible component primitives for React

---

## 🏢 Joint Development

PilotDeck is jointly developed by Tsinghua University [THUNLP](https://nlp.csai.tsinghua.edu.cn/), [ModelBest](https://modelbest.cn/), [OpenBMB](https://www.openbmb.cn/) and [AI9Stars](https://github.com/AI9Stars).

---

## ⭐ Support Us

If PilotDeck has been helpful in your work or research, please consider giving us a Star on GitHub!

---

## 📝 Citation

```bibtex
@misc{pilotdeck2026,
  author       = {PilotDeck Team},
  title        = {PilotDeck: A WorkSpace-Centric Open-Source Agent Operating System},
  howpublished = {\url{https://github.com/OpenBMB/PilotDeck}},
  year         = {2026},
  note         = {Accessed: 2026-05-29}
}
```

## 📄 License

This project is licensed under the [GNU Affero General Public License v3.0](LICENSE).