# SCAN: FastChat
date: 2026-02-12 | program: FastChat | repo: https://github.com/lm-sys/FastChat | bounty: $500-$1500

## summary
raw_findings: 14 (manual review) | real: 3 | high_conf: 1 | med_conf: 2 | low_conf: 3 | false_pos: 8

NOTE: semgrep/trufflehog could not be run (Bash permissions denied). All findings below come from thorough manual code review of security-critical files.

## findings

### F1: SSRF via Worker Registration - Controller accepts arbitrary worker URLs
severity: high | confidence: high | type: SSRF | cwe: CWE-918
file: /Users/sebas/Code/bug-bounty/data/repos/FastChat/fastchat/serve/controller.py:288 | tool: manual

```python
# controller.py:288-296
@app.post("/register_worker")
async def register_worker(request: Request):
    data = await request.json()
    controller.register_worker(
        data["worker_name"],      # <-- attacker-controlled URL
        data["check_heart_beat"],
        data.get("worker_status", None),
        data.get("multimodal", False),
    )

# controller.py:104-115 - Controller makes requests to the attacker-controlled URL
def get_worker_status(self, worker_name: str):
    try:
        r = requests.post(worker_name + "/worker_get_status", timeout=5)
        ...
        return r.json()

# controller.py:266-282 - Proxies user requests to the attacker-controlled URL
def worker_api_generate_stream(self, params):
    worker_addr = self.get_worker_address(params["model"])
    ...
    response = requests.post(
        worker_addr + "/worker_generate_stream",
        json=params,
        stream=True,
        timeout=WORKER_API_TIMEOUT,
    )
```

analysis: The controller's `/register_worker` endpoint accepts any URL as `worker_name` without authentication, URL validation, or allowlisting. An attacker can register `http://169.254.169.254/latest/meta-data/` or any internal service URL as a "worker". The controller then makes HTTP POST requests to that base URL during health checks (`get_worker_status`), when proxying generation requests (`worker_api_generate_stream`), and via the heartbeat system. Because each request is a POST with a fixed path suffix appended (for example `/worker_get_status`), the attacker controls the host, port, and path prefix but not the method or the full path, which limits which internal endpoints return useful data. Within that limit this is a textbook SSRF: it allows probing the internal network and cloud metadata services, and exfiltrating internal data via the JSON response. The OpenAI API server (`openai_api_server.py:376`) also forwards requests to whatever worker address the controller returns, creating a second SSRF vector where the response content is returned to the caller.

attack_vector: `POST /register_worker` with body `{"worker_name": "http://169.254.169.254/latest/meta-data/iam/security-credentials/", "check_heart_beat": false, "worker_status": {"model_names": ["evil-model"], "speed": 1, "queue_length": 0}}`. Then `POST /get_worker_address` with `{"model": "evil-model"}` returns the malicious URL. Any subsequent request to the OpenAI API with `model=evil-model` triggers an HTTP request to the internal URL, and the response is returned to the attacker.

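impact: SSRF with attacker-controlled host, port, and path prefix (requests are POSTs with a worker path appended). Potential to read cloud metadata (AWS/GCP credentials), scan the internal network, and access internal services. On cloud deployments, this can lead to full account compromise if IAM credentials can be retrieved through such a request.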

recommendation: REPORT
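
mitigation_sketch: A minimal sketch of the allowlisting that the analysis notes is missing, assuming operators know their worker hosts in advance. `ALLOWED_WORKERS` and `is_allowed_worker_url` are hypothetical names, not part of FastChat, and the host/port pairs are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of (host, port) pairs where workers are expected.
ALLOWED_WORKERS = {("127.0.0.1", 21002), ("worker-1.internal", 21002)}


def is_allowed_worker_url(worker_name: str) -> bool:
    """Return True only for bare http(s) base URLs on an allowlisted host:port."""
    try:
        parts = urlsplit(worker_name)
        port = parts.port  # raises ValueError for malformed or out-of-range ports
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False
    # A worker base URL needs no credentials, path, query, or fragment;
    # rejecting them removes the attacker-controlled path prefix.
    if parts.username or parts.password:
        return False
    if parts.path not in ("", "/") or parts.query or parts.fragment:
        return False
    return (parts.hostname, port) in ALLOWED_WORKERS
```

A check like this would run before the registration is stored, so a rejected `worker_name` never reaches any `requests.post` call.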

---

### F2: SSRF via Image URL Fetching in load_image utility
severity: medium | confidence: medium | type: SSRF | cwe: CWE-918
file: /Users/sebas/Code/bug-bounty/data/repos/FastChat/fastchat/utils.py:394 | tool: manual

```python
# utils.py:394-412
def load_image(image_file):
    from PIL import Image
    import requests

    image = None

    if image_file.startswith("http://") or image_file.startswith("https://"):
        timeout = int(os.getenv("REQUEST_TIMEOUT", "3"))
        response = requests.get(image_file, timeout=timeout)  # <-- no URL validation
        image = Image.open(BytesIO(response.content))
    elif image_file.lower().endswith(("png", "jpg", "jpeg", "webp", "gif")):
        image = Image.open(image_file)  # <-- potential path traversal
    ...
```

analysis: `load_image` is imported in `gradio_web_server.py:45` and used by vision-related endpoints. The SGLang worker (`sglang_worker.py:125`) calls `load_image(images[i])`, where `images` comes from user input via the OpenAI API (`openai_api_server.py:309`). A user-supplied `image_url` in a chat completion request flows through the OpenAI API server -> controller -> worker -> `load_image`. No URL validation is performed, so internal URLs like `http://169.254.169.254` are fetched. A `file://` URL does not match the `http://`/`https://` branch, so it is not fetched by `requests`; the local-file risk is instead the second branch, where any path ending in an image extension is passed straight to `Image.open`. The exact reachability depends on the worker type (sglang_worker specifically calls it).

attack_vector: Send a chat completion request to `/v1/chat/completions` with messages containing `{"type": "image_url", "image_url": {"url": "http://169.254.169.254/latest/meta-data/"}}`. On sglang_worker deployments, this triggers an HTTP GET to the internal URL.

impact: SSRF via image fetching. The response content is processed as an image (PIL), so data exfiltration is limited. But the request itself reveals network reachability and can be used for port scanning.

recommendation: INVESTIGATE -- depends on sglang worker deployment reachability. The image_url flow from OpenAI API to worker needs confirmation.
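
mitigation_sketch: One possible pre-fetch check for image URLs, assuming the goal is to refuse anything that resolves to a non-public address. `is_public_ip` and `is_safe_image_url` are hypothetical helpers, not FastChat functions; the `resolve` parameter exists only so the check can be exercised without DNS.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_ip(addr: str) -> bool:
    """True only for globally routable addresses (not loopback/private/link-local)."""
    try:
        return ipaddress.ip_address(addr).is_global
    except ValueError:
        return False


def is_safe_image_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Check the scheme and every resolved address before any fetch is attempted."""
    try:
        parts = urlsplit(url)
        port = parts.port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    port = port or (443 if parts.scheme == "https" else 80)
    try:
        infos = resolve(parts.hostname, port, type=socket.SOCK_STREAM)
    except OSError:  # includes socket.gaierror for unresolvable hosts
        return False
    addrs = [info[4][0] for info in infos]
    return bool(addrs) and all(is_public_ip(a) for a in addrs)
```

This is a check-then-fetch design, so on its own it does not cover DNS rebinding or redirects; a fuller fix would also pass `allow_redirects=False` to `requests.get` and connect to the address that was validated.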

---

### F3: Unauthenticated Controller API Endpoints Enable Worker Hijacking
severity: high | confidence: medium | type: Missing Authentication | cwe: CWE-306
file: /Users/sebas/Code/bug-bounty/data/repos/FastChat/fastchat/serve/controller.py:288-348 | tool: manual

```python
# ALL controller endpoints have NO authentication:
@app.post("/register_worker")      # Register arbitrary workers
@app.post("/refresh_all_workers")  # Force refresh all workers
@app.post("/list_models")          # Enumerate models
@app.post("/get_worker_address")   # Get worker addresses
@app.post("/receive_heart_beat")   # Spoof heartbeats
@app.post("/worker_generate_stream")  # Proxy generation requests
@app.post("/worker_get_status")    # Get worker status
@app.get("/test_connection")       # Test connection
```

analysis: None of the controller endpoints require authentication. While the OpenAI API server has optional API key checking (`check_api_key`), the controller itself is a separate service (default port 21001) that is often exposed. An attacker who can reach the controller can: (1) register malicious workers to intercept/poison model responses, (2) enumerate all model names and worker addresses, (3) send fake heartbeats to keep malicious workers alive, and (4) potentially disrupt routing to legitimate workers, for example by re-registering an existing `worker_name` with altered status (unverified; an attacker cannot stop a legitimate worker from sending its own heartbeats). Combined with F1, this creates a full model supply chain attack: register a malicious worker that returns poisoned responses.

attack_vector: `POST http://target:21001/register_worker` with a malicious worker URL that returns crafted model responses. All subsequent user requests for that model get poisoned responses.

impact: Complete model response manipulation (prompt injection at infrastructure level), internal network information disclosure (worker addresses), and possible denial of service through tampering with worker registrations (unverified).

recommendation: REPORT -- note this is architecturally expected for internal-only deployments. The default host is "localhost", but users frequently override it to "0.0.0.0"; when the controller is exposed this way, the impact is severe.
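
mitigation_sketch: A minimal sketch of a shared-secret check that controller endpoints could apply, assuming workers and the API server are configured with the same token. The function name, the `authorization` header, and the bearer format are illustrative assumptions; FastChat's controller does not define them.

```python
import hmac
from typing import Mapping, Optional


def is_authorized(headers: Mapping[str, str], expected_token: Optional[str]) -> bool:
    """Validate a bearer token from request headers against a shared secret."""
    if not expected_token:
        return False  # fail closed when no secret has been configured
    # Assumes lowercase header keys; web frameworks usually offer
    # case-insensitive header lookups instead of a plain dict.
    auth = headers.get("authorization", "")
    scheme, _, presented = auth.partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return False
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

Failing closed when the secret is unset is a deliberate choice here: an unconfigured deployment rejects registrations rather than silently accepting them.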

---

### F4: Unsafe deserialization in summarize_cluster.py
severity: medium | confidence: low | type: Deserialization | cwe: CWE-502
file: /Users/sebas/Code/bug-bounty/data/repos/FastChat/fastchat/serve/monitor/summarize_cluster.py:33 | tool: manual

```python
cluster_infos = pickle.load(open(args.input_file, "rb"))
```

analysis: Calls `pickle.load` on a file provided as a CLI argument. Unpickling can execute arbitrary code, so if an attacker can control the input file (e.g., via a shared filesystem or supply chain attack), they achieve RCE. However, this is a CLI-only script for offline data analysis, not a web-exposed endpoint.

attack_vector: Supply a malicious pickle file as the `--input-file` argument.

impact: Arbitrary code execution if file is attacker-controlled.

recommendation: SKIP -- CLI-only script, not reachable from network. Low duplicate risk.
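
mitigation_sketch: If hardening were wanted anyway, one option is the restricted-unpickler pattern from the Python `pickle` documentation. The sketch below assumes the input holds only plain containers and scalars, which has not been checked against the real cluster files; `ALLOWED_GLOBALS` and `restricted_loads` are hypothetical names.

```python
import io
import pickle

# Hypothetical allowlist of (module, name) globals the payload may reference.
# Plain dicts, lists, strings, and numbers need no globals, so it starts empty.
ALLOWED_GLOBALS = set()


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import any global not explicitly allowed."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize data, rejecting payloads that reference disallowed globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Payloads that rely on `__reduce__` to call a function must name that function as a global, so they fail in `find_class` before anything runs.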

---

### F5: Unsafe deserialization in monitor.py (Gradio leaderboard)
severity: medium | confidence: low | type: Deserialization | cwe: CWE-502
file: /Users/sebas/Code/bug-bounty/data/repos/FastChat/fastchat/serve/monitor/monitor.py:929 | tool: manual

```python
with open(elo_results_file, "rb") as fin:
    elo_results = pickle.load(fin)
```

analysis: The monitor Gradio app loads elo results from a file at startup. The file path is from CLI args. If an attacker can write to this file on the server's filesystem, they get RCE. This runs in the Gradio web context, making it slightly more interesting if the file is on a shared/writable filesystem.

attack_vector: Replace the elo_results file on disk with a malicious file.

impact: RCE on the Gradio server process.

recommendation: SKIP -- requires filesystem write access, which implies already-compromised system.
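
mitigation_sketch: Where the results file must remain a pickle, an integrity tag checked before loading is one way to narrow the risk. This is a sketch under the assumption that a secret key is stored somewhere the file writer cannot reach; `sign_blob` and `load_verified_pickle` are hypothetical helpers.

```python
import hashlib
import hmac
import pickle


def sign_blob(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the serialized payload."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def load_verified_pickle(data: bytes, tag: str, key: bytes):
    """Unpickle data only if its HMAC tag matches; raise ValueError otherwise."""
    expected = sign_blob(data, key)
    if not hmac.compare_digest(expected.encode(), tag.encode()):
        raise ValueError("pickle payload failed integrity check")
    return pickle.loads(data)
```

The tag would be produced when the results file is generated and stored alongside it; an attacker who can replace the file but does not hold the key cannot produce a matching tag.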

---

### F6: subprocess with shell=True in launch_all_serve.py
severity: medium | confidence: low | type: Command Injection | cwe: CWE-78
file: /Users/sebas/Code/bug-bounty/data/repos/FastChat/fastchat/serve/launch_all_serve.py:248-253 | tool: manual

```python
# CLI arg --model-path-address formatted as model-path@host@port
args.model_path, args.worker_host, args.worker_port = item.split("@")
worker_str_args = string_args(args, worker_args)  # builds shell string
worker_sh = base_launch_sh.format(
    "model_worker", worker_str_args, LOGDIR, f"worker_{log_name}"
)
subprocess.run(worker_sh, shell=True, check=True)
```

analysis: CLI arguments are interpolated into shell commands via `shell=True`. A malicious `--model-path-address` value containing shell metacharacters could achieve command injection. However, this is a CLI-only launcher script invoked by an admin, not network-exposed.

attack_vector: `python launch_all_serve.py --model-path-address "evil;id@localhost@2022"`

impact: Arbitrary command execution in CLI context.

recommendation: SKIP -- CLI-only, not network reachable.
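
mitigation_sketch: The usual fix for this pattern is to pass an argument list instead of a shell string. The helpers below are hypothetical and do not reproduce the launcher's real command line; they only show that a value such as `evil;id` stays a single literal argument when no shell is involved.

```python
import shlex
import subprocess
import sys


def parse_model_path_address(item: str):
    """Split 'model-path@host@port' and reject malformed values early."""
    parts = item.split("@")
    if len(parts) != 3:
        raise ValueError(f"expected model-path@host@port, got {item!r}")
    model_path, host, port = parts
    return model_path, host, int(port)  # int() rejects non-numeric ports


def echo_first_argument(value: str) -> str:
    """Pass value to a child process as one argv entry, with no shell involved."""
    child = [sys.executable, "-c", "import sys; print(sys.argv[1])", value]
    result = subprocess.run(child, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def quoted_for_shell(value: str) -> str:
    """Fallback when a shell string is unavoidable: quote each value."""
    return shlex.quote(value)
```

With the list form, the semicolon in `evil;id` is never interpreted as a command separator; `shlex.quote` is the second-best option when the command must remain a single string.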

---

## skipped (false_positive / low-value)

| file:line | pattern | reason |
|---|---|---|
| fastchat/model/apply_delta.py:37 | torch.load | CLI utility for offline model delta application. Not network-reachable. |
| fastchat/model/compression.py:189 | torch.load | CLI utility for model compression. Not network-reachable. |
| fastchat/utils.py:221 | torch.load | CLI utility clean_flant5_ckpt. Not network-reachable. |
| fastchat/llm_judge/common.py:181 | ast.literal_eval | Parses LLM judge output ratings. Not user-controlled, and ast.literal_eval only evaluates literals. |
| fastchat/serve/monitor/monitor.py:158 | ast.literal_eval | Parses leaderboard data from CSV. Server-controlled input. |
| fastchat/serve/dashinfer_worker.py:307 | subprocess.run shell=True | Hardcoded pip command, no user input flows into it. |
| fastchat/serve/shutdown_serve.py:23 | subprocess.run shell=True | CLI tool with argparse choices constraint. Input limited to fixed set of values. |
| fastchat/serve/monitor/classify/label.py:32 | yaml.load | Uses yaml.SafeLoader -- safe. |

## analysis_notes

### Architecture Overview
FastChat uses a 3-tier architecture: OpenAI API Server -> Controller -> Workers. The controller is the critical trust boundary. Workers register with the controller providing their URL, and the controller proxies requests to workers and returns their responses.

### Key Vulnerability Pattern: Trust Chain
The core vulnerability is that the Controller trusts any entity that can reach its API. Since the controller has no authentication, anyone who can send HTTP requests to it can register workers, hijack model responses, and trigger SSRF. This trust-chain issue (F1 combined with F3) is the most impactful finding.
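
The trust chain can be illustrated with a toy model. This is a deliberately simplified sketch, not the FastChat controller: `ToyController` is hypothetical, stores only model names per worker, and returns every candidate instead of applying a dispatch policy.

```python
class ToyController:
    """Toy registry that, like the pattern described above, trusts every caller."""

    def __init__(self):
        self.worker_info = {}  # worker URL -> list of model names it claims

    def register_worker(self, worker_name, model_names):
        # No authentication and no URL validation: the caller's claim is stored as-is.
        self.worker_info[worker_name] = list(model_names)

    def get_worker_addresses(self, model):
        # Every worker claiming the model becomes a routing candidate.
        return [w for w, models in self.worker_info.items() if model in models]


controller = ToyController()
controller.register_worker("http://10.0.0.5:21002", ["vicuna-7b"])
# An attacker who can reach the registry claims the same model.
controller.register_worker("http://attacker.example", ["vicuna-7b"])
candidates = controller.get_worker_addresses("vicuna-7b")
```

After the second registration, `candidates` contains the attacker's URL next to the legitimate worker, so some share of requests for that model would be routed to it.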

### Duplicate Risk Assessment
- F1 (SSRF via register_worker): MODERATE duplicate risk. This is an obvious architectural flaw that a security reviewer would find. However, the FastChat project is primarily an ML tool and may not have had formal security review.
- F3 (Unauthenticated controller): HIGH duplicate risk. Very obvious, may be considered "by design" for local development.
- F2 (Image SSRF): LOW duplicate risk. Requires understanding of the sglang worker path.

### Scope Notes
FastChat on huntr typically covers the full repository. The controller and API server are the main production-facing attack surface. Monitor/leaderboard scripts are lower priority.