# SCAN: BentoML
date: 2026-02-12 | program: BentoML | repo: https://github.com/bentoml/BentoML | bounty: $500-$1500

## summary
raw_findings: 14 | real: 5 | high_conf: 0 | med_conf: 3 | low_conf: 2 | false_pos: 9

Note: semgrep/trufflehog were unavailable due to sandbox restrictions. A manual code audit was performed instead, using targeted pattern searches across all focus areas (RCE, SSRF, path traversal, auth bypass, deserialization, container escape).

## findings

### F1: Unsafe deserialization on runner HTTP endpoint (runner_app.py)
severity: critical | confidence: medium | type: Deserialization / RCE | cwe: CWE-502
file: /Users/sebas/Code/bug-bounty/data/repos/BentoML/src/bentoml/_internal/server/runner_app.py:301 | tool: manual

```python
# runner_app.py lines 291-304
async def _request_handler(request: Request) -> Response:
    assert self._is_ready

    arg_num = int(request.headers["args-number"])
    r_: bytes = await request.body()

    if arg_num == 1:
        params: Params[t.Any] = _deserialize_single_param(request, r_)

    else:
        params: Params[t.Any] = pickle.loads(r_)  # UNSAFE DESERIALIZATION

    try:
        payload = await infer(params)
```

analysis: When the `args-number` header is any integer other than 1, the runner server passes the raw HTTP request body straight to `pickle.loads` with no validation. The runner server listens on localhost:PORT (Windows/WSL) or a Unix domain socket (UDS). While not directly internet-exposed, it could be reached through an SSRF from the main service (see F2) or from any adjacent service. In containerized environments (k8s), runner pods may be reachable from other pods in the same namespace. This is the legacy runner architecture -- the new SDK uses a serde system with an `is_main` guard, but the old runner_app.py has no such guard.

attack_vector: Attacker needs network access to the runner service (localhost on same host, or pod-to-pod in k8s). Send POST to `/<method_name>` with header `args-number: 2` and a crafted payload as body. The payload executes arbitrary code during deserialization.

impact: Remote Code Execution on the runner worker process. Full compromise of the ML model serving environment.

recommendation: INVESTIGATE -- requires SSRF or network adjacency to exploit. Critical severity but medium confidence due to the network access requirement.
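
One possible hardening direction, sketched here under assumptions, is an allow-listing unpickler built on `pickle.Unpickler.find_class` from the standard library. The names `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical and do not exist in BentoML, and the allow-list contents are purely illustrative. An allow-list narrows the attack surface but is not a complete defense; it only shows how the unrestricted `pickle.loads` call differs from a constrained one.

```python
import io
import pickle

# Hypothetical allow-list: only these (module, name) globals may be resolved.
ALLOWED_GLOBALS = {("builtins", "range")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads that enforces the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers need no global lookup, so they load normally.
plain = restricted_loads(pickle.dumps({"a": [1, 2]}))

# A payload referencing a global outside the allow-list is rejected.
try:
    restricted_loads(pickle.dumps(len))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

Switching the internal wire format to one that cannot encode callables at all would remove the problem class entirely; the allow-list is the smaller change.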

---

### F2: SSRF protection bypass via race condition in make_safe_connect
severity: high | confidence: medium | type: SSRF | cwe: CWE-918
file: /Users/sebas/Code/bug-bounty/data/repos/BentoML/src/bentoml/_internal/utils/uri.py:63-107 | tool: manual

```python
# uri.py lines 59-107
original_create_connection = None  # GLOBAL MUTABLE STATE

@contextlib.contextmanager
def make_safe_connect():
    global original_create_connection
    if original_create_connection is None:
        original_create_connection = Loop.create_connection

    @no_type_check
    async def safe_create_connection(
        self, protocol_factory, host=None, port=None, **kwargs
    ):
        if host is not None and (host, port) not in proxies:
            try:
                ip = ipaddress.ip_address(host)
            except ValueError:
                raise socket.gaierror(f"Blocked invalid IP address {host}")
            else:
                if ip.is_private or ip.is_loopback or ip.is_link_local:
                    raise socket.gaierror(f"Blocked private IP address {host}")
        return await original_create_connection(...)

    Loop.create_connection = safe_create_connection
    try:
        yield
    except httpx.ConnectError as e:
        if "All connection attempts failed" in str(e):
            raise BadInput("Connection blocked due to insecure input URL") from e
    finally:
        Loop.create_connection = original_create_connection  # RESTORES ORIGINAL
```

analysis: The SSRF protection uses a context manager that monkey-patches `uvloop.Loop.create_connection` globally, then restores it in the `finally` block. The patch is process-wide rather than scoped to a request, so concurrent requests interfere with each other: Request A enters the context manager and patches the function. Request B (malicious) enters while A is still inside. Request A exits and restores the original unpatched `create_connection`. If Request B opens its connection after that point, it uses the unpatched version, bypassing SSRF protection entirely. This is a race on shared global state. Additionally, the `except httpx.ConnectError` handler re-raises only when the message contains `"All connection attempts failed"`; any other `ConnectError` is silently swallowed. The protection is also uvloop-specific -- it imports directly from `uvloop` and would fail on systems without it.

attack_vector: Send concurrent requests to an endpoint that uses `JSONSerde.parse_request` or `MultipartSerde.ensure_file` (any endpoint accepting file URLs or multipart file fields). One legitimate request, one with a private IP URL (e.g., `http://169.254.169.254/latest/meta-data/`). Time the malicious request to land in the window when the first request's context manager has restored the original `create_connection`.

impact: SSRF to internal services, cloud metadata endpoints (AWS/GCP/Azure IMDS), or the internal runner services (which accept unsafe deserialization -- see F1, creating a chain to RCE).

recommendation: REPORT -- race condition in SSRF protection is a real, reportable vulnerability.
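
The interleaving described in the analysis can be reproduced with a toy model that uses only the standard library. Everything here is a stand-in and an assumption: `Loop` is a dummy class rather than uvloop, `create_connection` is synchronous and returns a string, and `request_a`/`request_b` are hypothetical names for two concurrent handlers. The sketch only shows that a global patch-and-restore context manager stops protecting a second caller once the first caller exits.

```python
import asyncio
import contextlib


class Loop:
    """Dummy stand-in for an event loop class (not uvloop)."""

    @staticmethod
    def create_connection(host):
        return f"connected:{host}"


_original = None  # global mutable state, as in the excerpt above


@contextlib.contextmanager
def make_safe_connect():
    global _original
    if _original is None:
        _original = Loop.create_connection

    def safe_create_connection(host):
        if host.startswith("127."):
            raise OSError(f"Blocked private IP address {host}")
        return _original(host)

    Loop.create_connection = safe_create_connection
    try:
        yield
    finally:
        Loop.create_connection = _original  # restores for everyone


async def request_a(a_done):
    with make_safe_connect():
        await asyncio.sleep(0)  # yield so request_b can enter its context
    a_done.set()  # A has exited and restored the unpatched function


async def request_b(a_done):
    with make_safe_connect():
        await a_done.wait()  # still inside its own "protected" block
        return Loop.create_connection("127.0.0.1")


async def main():
    a_done = asyncio.Event()
    _, outcome = await asyncio.gather(request_a(a_done), request_b(a_done))
    return outcome


result = asyncio.run(main())
print(result)  # connected:127.0.0.1 (the loopback check was skipped)
```

Performing the address check per call, for example in a transport or resolver object owned by the individual request, would avoid sharing patched state between requests.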

---

### F3: Unsafe deserialization accessible on non-main (dependency) service endpoints
severity: high | confidence: medium | type: Deserialization / RCE | cwe: CWE-502
file: /Users/sebas/Code/bug-bounty/data/repos/BentoML/src/_bentoml_impl/server/app.py:774-778 | tool: manual

```python
# app.py lines 774-779
# NOTE: The following check is for security concern, DO NOT REMOVE
if self.is_main and media_type == "application/vnd.bentoml+pickle":
    raise BentoMLException(
        "application/vnd.bentoml+pickle is not allowed in main server",
        error_code=HTTPStatus.UNSUPPORTED_MEDIA_TYPE,
    )
```

```python
# dependency.py line 75
media_type = "application/vnd.bentoml+pickle"
```

```python
# serde.py lines 272-284 (PickleSerde.deserialize_value)
def deserialize_value(self, payload: Payload) -> t.Any:
    if "buffer-lengths" not in payload.metadata:
        return pickle.loads(b"".join(payload.data))
```

analysis: The guard that rejects `application/vnd.bentoml+pickle` only applies when `self.is_main` is True. Dependency services (non-entry services) have `is_main=False` and accept the pickle content type, which `PickleSerde.deserialize_value` passes to `pickle.loads`. These services listen on `127.0.0.1:PORT` (Windows/WSL) or UDS. An attacker who can reach the dependency service (via SSRF from the main service per F2, or network adjacency in k8s) can send a POST with that content type and a crafted payload to achieve RCE.

attack_vector: Chain with F2 (SSRF) to reach dependency service on localhost, then POST with appropriate content type header and crafted RCE payload.

impact: Remote Code Execution on dependency service worker.

recommendation: INVESTIGATE -- standalone impact requires SSRF or network adjacency, but combined with F2 creates full chain.
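
If the pickle content type has to stay available for service-to-service calls, one mitigation sometimes used is to authenticate the payload before deserializing it. The sketch below is an assumption, not something BentoML does: `sign_payload` and `verify_and_load` are hypothetical helpers, and how the shared key would be distributed between the main and dependency services is left open. It uses only `hmac`, `hashlib`, and `pickle` from the standard library.

```python
import hashlib
import hmac
import pickle

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign_payload(payload: bytes, key: bytes) -> bytes:
    """Prefix the payload with an HMAC-SHA256 tag computed over it."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify_and_load(blob: bytes, key: bytes):
    """Unpickle only if the tag matches; otherwise refuse to deserialize."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("payload signature mismatch; refusing to unpickle")
    return pickle.loads(payload)


# Usage: the caller signs, the dependency service verifies before loading.
shared_key = b"\x00" * 32  # placeholder key for illustration only
blob = sign_payload(pickle.dumps([1, 2, 3]), shared_key)
loaded = verify_and_load(blob, shared_key)
```

With this in place, a request that reaches the dependency service through SSRF would still need the shared key to get its body deserialized.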

---

### F4: Path traversal in yatai.py tar extraction (no safe_extract_tarfile)
severity: medium | confidence: low | type: Path Traversal | cwe: CWE-22
file: /Users/sebas/Code/bug-bounty/data/repos/BentoML/src/bentoml/_internal/cloud/yatai.py:532-539 | tool: manual

```python
# yatai.py lines 529-539
tar = tarfile.open(fileobj=tar_file, mode="r")
with self.spinner.spin(text=f'Extracting bento "{_tag}" tar file'):
    with fs.open_fs("temp://") as temp_fs:
        for member in tar.getmembers():
            f = tar.extractfile(member)
            if f is None:
                continue
            p = Path(member.name)
            if p.parent != Path("."):
                temp_fs.makedirs(str(p.parent), recreate=True)
            temp_fs.writebytes(member.name, f.read())  # member.name not validated
```

analysis: The yatai.py code (legacy BentoCloud/Yatai client) extracts tar archives downloaded from a remote server. Unlike `cloud/bento.py` and `cloud/model.py`, which use `safe_extract_tarfile()`, it writes each member directly with `temp_fs.writebytes(member.name, ...)` and performs no path traversal check on `member.name`. A malicious tar from a compromised Yatai server could contain entries like `../../etc/cron.d/malicious`. Two factors limit this: pyfilesystem2's TempFS may provide some sandboxing, and the extraction target is a temp directory. The same pattern appears on line 974 for model extraction.

attack_vector: Compromise the Yatai server or perform MITM on the download. Serve a tar with path traversal entries. User runs `bentoml pull` against the malicious Yatai server.

impact: Arbitrary file write outside temp directory (if pyfilesystem2 doesn't block traversal).

recommendation: SKIP -- requires compromised Yatai server (trusted source), pyfilesystem2 may block traversal, and newer cloud/bento.py already uses safe_extract_tarfile. Low exploitability in practice. Note: `cloud/bento.py:542` and `cloud/model.py:504` correctly use `safe_extract_tarfile`.
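
The missing check on `member.name` could look roughly like the sketch below. It is not the implementation of `safe_extract_tarfile` (which is not shown in this scan); `validate_member_name` is a hypothetical helper, and it assumes tar member names use forward slashes. It covers only the name-based traversal discussed above and does not handle symlink or hardlink members.

```python
import posixpath


def validate_member_name(name: str) -> str:
    """Return a normalized relative path, or raise if it escapes the root."""
    normalized = posixpath.normpath(name)
    escapes_root = (
        normalized.startswith("/")
        or normalized == ".."
        or normalized.startswith("../")
    )
    if escapes_root:
        raise ValueError(f"unsafe tar member path: {name!r}")
    return normalized


# Usage: validate each member name before writing it to the target filesystem.
safe_name = validate_member_name("models/a/../b.bin")
```

In the extraction loop this would run once per member, with the returned value used in place of the raw `member.name`.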

---

### F5: Multiple unsafe deserialization calls in runner container system
severity: high | confidence: low | type: Deserialization / RCE | cwe: CWE-502
file: /Users/sebas/Code/bug-bounty/data/repos/BentoML/src/bentoml/_internal/runner/container.py:312,416 | tool: manual

```python
# container.py line 312 (NumpyNdarrayContainer.from_payload)
if format == "default":
    return pickle.loads(payload.data)

# container.py line 416 (PandasDataFrameContainer.from_payload)
if payload.meta["format"] == "default":
    return pickle.loads(payload.data)
```

```python
# runner_handle/remote.py line 263
if content_type == "application/vnd.bentoml.multiple_outputs":
    payloads = pickle.loads(body)
```

analysis: Multiple unsafe deserialization calls in the runner data serialization pipeline. These handle data flowing between the main API server and runner processes. The data originates from the runner server responses or is sent to runner endpoints. In normal operation, this is internal-only traffic. However, if an attacker can perform SSRF to the runner (F2 chain) or if they control a runner's response (unlikely), these become exploitable. The `remote.py:263` case deserializes a response body from the runner server.

attack_vector: Requires control of runner server responses, or the ability to send crafted payloads to runner endpoints (for example via the F2 SSRF chain).

impact: RCE via unsafe deserialization.

recommendation: SKIP -- internal-only traffic between trusted components. Would require deep compromise to exploit standalone. Part of the broader attack surface but not independently reportable.

---

## skipped

| file:line | rule/pattern | reason |
|-----------|------|--------|
| src/bentoml/_internal/frameworks/picklable_model.py:73 | cloudpickle.load | Model loading from local model store. Not user-controlled at runtime. |
| src/bentoml/_internal/models/model.py:114 | cloudpickle.load | Custom objects loaded from local model store path. Server operator controlled. |
| src/bentoml/_internal/frameworks/transformers.py:430,474,481 | cloudpickle.load | Loads pipeline/protocol from model store. Not user-controlled at runtime. |
| src/bentoml/_internal/frameworks/easyocr.py:99 | cloudpickle.load | Model loading from local store. |
| src/bentoml/_internal/frameworks/pytorch.py:81 | torch.load | Model loading from local store. |
| src/bentoml/_internal/utils/pickle.py:70-72 | torch.load | Utility for torch tensor deserialization from internal pipeline data. |
| src/bentoml/_internal/utils/pickle.py:91 | pickle.loads | Fallback torch unpickler, internal pipeline data only. |
| src/_bentoml_impl/frameworks/sklearn.py:75 | joblib.load | Model loading from local store. |
| src/bentoml/_internal/container/generate.py | Jinja2 template rendering | Templates are loaded from server-controlled filesystem paths. `resolve_user_filepath` has path traversal protection (recently hardened in commit 84d08cf). User template paths validated. |