# SCAN: DB-GPT
date: 2026-02-13 | program: DB-GPT | repo: https://github.com/eosphoros-ai/DB-GPT | bounty: $0-$1500

## summary
raw_findings: 12 | real: 4 | high_conf: 2 | med_conf: 2 | low_conf: 0 | false_pos: 8
scan_method: manual_review (semgrep failed, trufflehog 3 FP)

## findings

### F1: Arbitrary SQL Execution via Chart Editor Submit (No Sanitization, No Auth)
severity: high | confidence: high | type: SQL Injection | cwe: CWE-89
file: packages/dbgpt-app/src/dbgpt_app/openapi/api_v1/editor/api_editor_v1.py:349 | tool: manual

```python
# POST /api/v1/chart/editor/submit - NO AUTH, NO sanitize_sql()
@router.post("/v1/chart/editor/submit", response_model=Result[bool])
async def chart_editor_submit(chart_edit_context: ChatChartEditContext = Body()):
    ...
    db_conn = CFG.local_db_manager.get_connector(chart_edit_context.db_name)
    ...
    (field_names, chart_values) = dashboard_data_loader.get_chart_values_by_conn(
        db_conn, chart_edit_context.new_sql  # user input, NO sanitization
    )
```

```python
# data_loader.py:16-18 - executes SQL directly
def get_chart_values_by_conn(self, db_conn, chart_sql: str):
    field_names, datas = db_conn.query_ex(chart_sql)  # -> session.execute(text(sql))
```

analysis: The /api/v1/chart/editor/submit endpoint accepts a ChatChartEditContext body containing new_sql and db_name. Unlike the /v1/editor/sql/run and /v1/editor/chart/run endpoints, which pass SQL through sanitize_sql(), this endpoint passes user-supplied SQL directly to db_conn.query_ex(), which calls session.execute(text(query)). No auth middleware is applied to any of the editor routes. An attacker can execute arbitrary SQL, including DROP TABLE, INSERT, UPDATE, and database-specific commands, against any configured datasource.
attack_vector: POST /api/v1/chart/editor/submit with body {"conv_uid":"<existing conv_uid>","db_name":"target_db","chart_title":"x","new_sql":"DROP TABLE users; --","new_chart_type":"bar","new_comment":"test"}. The conv_uid must reference a conversation with existing chart history; an attacker can create one via the chat interface first.
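impact: Full database compromise - read, modify, delete data in any connected datasource. On MySQL, potential file read via LOAD DATA INFILE or LOAD_FILE() and file write via SELECT ... INTO OUTFILE, subject to the FILE privilege and secure_file_priv. On PostgreSQL, potential OS command execution via COPY ... FROM PROGRAM if the connection role is a superuser or a member of pg_execute_server_program.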
recommendation: REPORT
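
remediation sketch: the missing gate described in the analysis could look roughly like the function below. This is not DB-GPT's sanitize_sql() (its behavior was not reviewed here); `require_single_select` is a hypothetical helper, and the assumption that chart queries are single SELECT statements is mine. It is a coarse filter only: a SELECT can still have side effects (for example SELECT ... INTO OUTFILE on MySQL), so a least-privilege database account is still needed.

```python
ALLOWED_FIRST_KEYWORDS = {"select"}  # assumption: chart queries are read-only SELECTs


def require_single_select(sql: str) -> str:
    """Return the statement if it looks like one SELECT, else raise ValueError.

    Hypothetical helper for illustration; deliberately conservative, so it also
    rejects legitimate queries that contain ';' or comment markers in literals.
    """
    # Permit a trailing semicolon, then refuse anything that could stack or hide statements.
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        raise ValueError("empty statement")
    if ";" in statement or "--" in statement or "/*" in statement:
        raise ValueError("stacked statements and comments are not allowed")
    first_keyword = statement.split(None, 1)[0].lower()
    if first_keyword not in ALLOWED_FIRST_KEYWORDS:
        raise ValueError(f"statement type not allowed: {first_keyword}")
    return statement
```

The caller would execute only the returned string, for example by passing `require_single_select(chart_edit_context.new_sql)` to the data loader instead of the raw field.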

---

### F2: Remote Code Execution via Python eval() on PDF Table Content
severity: critical | confidence: high | type: Code Injection | cwe: CWE-95
file: packages/dbgpt-ext/src/dbgpt_ext/rag/knowledge/pdf.py:185 | tool: manual

```python
# pdf.py:442-448 - PDF table rows stored as str(list)
for row in end_table:
    self.all_text[self.allrow] = {
        "page": page.page_number,
        "allrow": self.allrow,
        "type": "excel",
        "inside": str(row),  # e.g. "['cell1', 'cell2']"
    }

# pdf.py:185-188 - later, eval() is called on the stored string to rebuild the list
header = eval(temp_table[0])     # RCE
for entry in temp_table[1:]:
    row = eval(entry)            # RCE
```

Note: The same `eval()` pattern appears at lines 185, 188, 215, and 218.

analysis: PDFProcessor extracts table data from uploaded PDFs using pdfplumber. Each row is converted to the string representation of a list via str(row) (line 447). Later, in _load(), these strings are passed to eval() to reconstruct the list objects (lines 185, 188, 215, 218), so content from a malicious PDF reaches eval(). For example, a cell containing `__import__('os').system('id')` is stored as part of the row string and evaluated during processing. One caveat to confirm with a proof of concept before reporting: str(row) applies repr() to each cell, so an unmodified cell value reaches eval() as a quoted string literal; the payload executes only if the stored string is altered between extraction and evaluation or the payload otherwise lands outside the quotes. The PDF is user-uploaded via the knowledge base feature.
attack_vector: Upload a crafted PDF to the knowledge base where a table cell contains a Python expression that imports os and runs a shell command. When the PDF is processed for RAG ingestion, the injected code executes on the server.
impact: Full Remote Code Execution on the server. Attacker gains shell access with the privileges of the DB-GPT process.
recommendation: REPORT
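
remediation sketch: since the stored value is the str() of a list of cell values, the standard library's `ast.literal_eval` can rebuild it without evaluating arbitrary expressions. `parse_stored_row` is a hypothetical helper name, and the sample rows assume cells are strings or None; the sketch only shows the difference in behavior, not a drop-in patch for pdf.py.

```python
import ast


def parse_stored_row(stored: str) -> list:
    """Rebuild a table row from its str(list) form without executing code.

    Hypothetical helper for illustration. ast.literal_eval accepts Python
    literals only and raises ValueError on calls, names, or attribute access.
    """
    value = ast.literal_eval(stored)
    if not isinstance(value, list):
        raise ValueError("expected a list literal")
    return value


# A benign row round-trips unchanged, including quotes and None cells.
benign_row = ["cell1", None, "O'Brien"]
restored_row = parse_stored_row(str(benign_row))

# A string that would execute code under eval() is rejected instead.
hostile_string = "[__import__('os').getcwd()]"
try:
    parse_stored_row(hostile_string)
    hostile_rejected = False
except ValueError:
    hostile_rejected = True
```

Swapping the evaluation call removes the code-execution path regardless of how the stored string was produced, which is why it is preferable to trying to filter cell content.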

---

### F3: Stored SQL Execution via get_editor_chart_info (No Sanitization)
severity: high | confidence: medium | type: SQL Injection (Stored) | cwe: CWE-89
file: packages/dbgpt-app/src/dbgpt_app/openapi/api_v1/editor/service.py:172 | tool: manual

```python
# service.py:172 - runs stored chart_sql without sanitization
detail: ChartDetail = ChartDetail(
    ...
    table_value=conn.run(find_chart["chart_sql"]),  # stored SQL from conv history
)
```

analysis: The get_editor_chart_info method retrieves chart_sql from conversation history and passes it directly to conn.run(). Since chart_editor_submit (F1) allows writing arbitrary SQL into conversation history without sanitization, an attacker can first store malicious SQL via the submit endpoint, then trigger execution via the chart info endpoint. This is a stored SQL injection - the malicious payload persists in conversation history.
attack_vector: Step 1: POST to /api/v1/chart/editor/submit with new_sql containing malicious SQL. Step 2: POST to /api/v1/editor/chart/info with the same conv_uid to trigger execution of stored SQL.
impact: Same as F1 - full database compromise via stored malicious SQL.
recommendation: REPORT (combined with F1)
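
remediation sketch: because the payload persists in conversation history, validating only at write time leaves already-stored SQL dangerous. One defense-in-depth option is to run chart queries on a connection that cannot write. The example uses the standard library's sqlite3 purely as a stand-in datasource; `open_read_only` and `demo` are hypothetical names, and other databases would need their own equivalent (typically a read-only role).

```python
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_path: Path) -> sqlite3.Connection:
    """Open an SQLite file in read-only mode via a URI filename (mode=ro)."""
    return sqlite3.connect(f"{db_path.as_uri()}?mode=ro", uri=True)


def demo():
    """Create a throwaway database, then replay 'stored' SQL on a read-only connection."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "charts.db"
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE users (name TEXT)")
        setup.execute("INSERT INTO users VALUES ('alice')")
        setup.commit()
        setup.close()

        conn = open_read_only(db_path)
        try:
            # A legitimate stored chart query still works.
            rows = conn.execute("SELECT name FROM users").fetchall()
            # A destructive stored query is refused by the database itself.
            try:
                conn.execute("DROP TABLE users")
                outcome = "write succeeded"
            except sqlite3.OperationalError:
                outcome = "write rejected"
        finally:
            conn.close()
        return rows, outcome
```

This does not replace sanitizing at both the submit and info endpoints; it limits the damage if a malicious statement is stored anyway.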

---

### F4: Sandbox Code Execution Security Bypass (Blocklist Evasion)
severity: medium | confidence: medium | type: Security Bypass | cwe: CWE-184
file: packages/dbgpt-sandbox/src/dbgpt_sandbox/sandbox/execution_layer/utils.py:179 | tool: manual

```python
# utils.py:179-197 - string-matching blocklist, trivially bypassable
dangerous_patterns = [
    "import os",        # bypass: from os import popen ("from os import system" still trips "import sys")
    "import subprocess", # bypass: from subprocess import call
    "import sys",       # bypass: from sys import modules
    "__import__",       # bypass: getattr(__builtins__, ...)
    ...
]
code_lower = code.lower()
for pattern in dangerous_patterns:
    if pattern in code_lower:
        warnings.append(...)
```

```python
# local_runtime.py:80-95 - execute() writes code to file and runs via subprocess
async def execute(self, code: str) -> ExecutionResult:
    warnings = self.security_utils.validate_code(code, self.config.language)
    if warnings and any("..." in w for w in warnings):
        return ExecutionResult(status=ExecutionStatus.ERROR, ...)
    # writes code to temp file and runs via subprocess with no OS-level sandboxing
    code_file = self._create_code_file(code)
    command = self._get_exec_command(code_file)  # ["python", code_file]
    result = await self._run_with_limits(command)
```

analysis: The sandbox security check uses a simple substring-matching blocklist. Multiple trivial bypasses exist: "from os import popen" (only "import os" is blocked; "from os import system" is caught incidentally, because "import system" contains the blocked substring "import sys"), "from subprocess import Popen", string concatenation to evade the __import__ check, or getattr on builtins. The code is written to a file and executed via subprocess with no OS-level sandboxing (no seccomp, no containers by default). The /execute API endpoint has no authentication.
attack_vector: POST /execute with {"session_id":"x","code_type":"python","code_content":"from os import popen\nprint(popen('id').read())"} - bypasses the listed patterns, since "from os import popen" contains none of them.
impact: Remote Code Execution on the host machine. The local runtime has no container isolation.
recommendation: REPORT
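
illustration: the gap between substring matching and what the code actually imports can be shown without running any payload. The pattern list is copied from the four entries visible in the snippet above (the elided entries are not reproduced); `substring_check` and `ast_import_check` are hypothetical names. The AST check is shown only as a contrast and is itself bypassable (importlib, getattr on builtins), so it does not replace OS-level isolation.

```python
import ast

# The four patterns visible in the finding; the elided ones are unknown.
LISTED_PATTERNS = ["import os", "import subprocess", "import sys", "__import__"]
BLOCKED_MODULES = {"os", "subprocess", "sys"}


def substring_check(code: str) -> list:
    """Mimic the substring test: return every listed pattern found in the code."""
    lowered = code.lower()
    return [pattern for pattern in LISTED_PATTERNS if pattern in lowered]


def ast_import_check(code: str) -> list:
    """Return blocked top-level modules that the code imports, by parsing it."""
    imported = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imported.add(node.module.split(".")[0])
    return sorted(imported & BLOCKED_MODULES)


# The code strings are only inspected here, never executed.
bypass = "from os import popen\nprint(popen('id').read())"
near_miss = "from os import system\nsystem('id')"

bypass_substring_hits = substring_check(bypass)
near_miss_substring_hits = substring_check(near_miss)
bypass_ast_hits = ast_import_check(bypass)
```

The near-miss case is flagged only by accident ("import system" contains "import sys"), which shows how fragile the substring approach is in both directions.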

---

## skipped
| file:line | rule/pattern | reason |
|-----------|-------------|--------|
| packages/dbgpt-core/.../loader.py:277 | code evaluation on torch_dtype | Input validated against allowlist before dangerous call |
| packages/dbgpt-core/.../parameter.py:386 | code evaluation on bnb_4bit_compute_dtype | Input validated against allowlist at lines 375-382 |
| packages/dbgpt-core/.../cache_util.py:144 | cloudpickle.load(f) | Internal cache files, not user-controlled input |
| packages/dbgpt-core/.../code_utils.py:267 | subprocess.run(cmd,...) | cmd built from controlled internal values |
| packages/dbgpt-core/.../repo.py:106-380 | subprocess.Popen with git/pip | Hardcoded commands, not user-controlled args |
| packages/dbgpt-serve/.../service.py:59 | subprocess.Popen(command,...) | Server-controlled config, not user input |
| trufflehog: docs/docs/api/knowledge.md:213 | Dockerhub token | UUID from API example docs, not a real token |
| trufflehog: test_manager.py:741,814 | MongoDB URI | Test file with dummy credentials (user:pass@localhost) |