fix(http-toolset): reject path-traversal segments in endpoint whitelist (CWE-918/CWE-22) by sebastiondev · Pull Request #2007 · HolmesGPT/holmesgpt

sebastiondev · 2026-05-06T14:43:12Z

Summary

The HTTP toolset's endpoint whitelist (HttpToolset.match_endpoint in holmes/plugins/toolsets/http/http_toolset.py) matches the URL path with fnmatch against the configured paths patterns before the URL is normalised. However, urllib3/requests resolve .. and . segments while preparing the request, so a URL that passes the whitelist can hit a completely different path on the wire.

This means an attacker who can influence the LLM (e.g. via prompt-injection content in alerts, issue descriptions, or logs that HolmesGPT investigates) can cause the toolset to issue requests against paths on the configured host that the operator never intended to expose — for example reaching /admin/... on a host that was only whitelisted for /wiki/rest/api/*.

CWE: CWE-918 (SSRF) / CWE-22 (Path Traversal in validator)
File: holmes/plugins/toolsets/http/http_toolset.py — HttpToolset.match_endpoint
Severity: Medium — requires the HTTP toolset to be configured and a prompt-injection vector, but bypasses an explicit security control (the path whitelist).

Data flow

Operator configures an HTTP toolset endpoint, e.g. host confluence.internal, paths ["/wiki/rest/api/*"].
LLM (influenced by attacker-controlled investigated content) calls the HTTP tool with https://confluence.internal/wiki/rest/api/../../admin/secret.
match_endpoint runs fnmatch("/wiki/rest/api/../../admin/secret", "/wiki/rest/api/*") → True, so the request is approved.
requests.Request(...).prepare().url normalises the path → the actual request goes to https://confluence.internal/admin/secret.
The whitelist is bypassed and an unintended path is fetched (and its response returned to the LLM/user).

We verified both behaviours locally:

>>> import fnmatch
>>> fnmatch.fnmatch("/wiki/rest/api/../../admin/secret", "/wiki/rest/api/*")
True
>>> import requests
>>> requests.Request("GET", "http://example.com/api/foo/../../admin").prepare().url
'http://example.com/admin'

Fix

Minimal, defensive changes in match_endpoint:

Restrict accepted schemes to http / https (rejects file://, gopher://, etc.).
Reject any path containing . or .. segments — checked on both the raw path and its percent-decoded form, so %2e%2e cannot bypass the check.
Normalise the path with posixpath.normpath before passing it to _match_path, so duplicate slashes and other benign noise behave consistently.

The fix is intentionally narrow: it only tightens the validator and does not change endpoint configuration semantics or the request pipeline. Requests to legitimate paths continue to work unchanged.

Tests

Added four focused tests in tests/plugins/toolsets/http/test_http_toolset.py:

test_path_traversal_dotdot_rejected — literal .. segment is rejected.
test_path_traversal_encoded_rejected — %2e%2e is rejected.
test_path_single_dot_segment_rejected — /./ segment is rejected.
test_non_http_scheme_rejected — file:///etc/passwd is rejected.

Full test file passes:

tests/plugins/toolsets/http/test_http_toolset.py ... 86 passed

Security analysis

The exploit preconditions are realistic for HolmesGPT's deployment model:

The HTTP toolset is configured to talk to an internal service whose host also serves more sensitive paths than the whitelisted prefix (Confluence/Jira/Grafana style apps with admin endpoints under the same host are common).
The investigated data (alert annotations, issue text, log lines) can influence what the LLM passes as url to the HTTP tool. Prompt-injection through investigated content is part of HolmesGPT's documented threat surface.

After the fix, an attacker controlling the LLM-supplied URL can no longer escape the configured prefix on a whitelisted host, because any ../. segment (raw or percent-encoded) is rejected up front and the path is normalised before fnmatch.

Adversarial review

Before submitting we tried to disprove this finding. We considered whether requests/urllib3 might preserve .. on the wire (it does not — it normalises), whether the existing _match_path already handled traversal (it uses plain fnmatch, which treats .. as ordinary characters), and whether the preconditions already grant equivalent access (they don't — the attacker's prompt-injection capability does not, on its own, let them reach off-prefix paths on whitelisted internal hosts; that ability is exactly what this bug provides). We also checked that the fix does not over-block — paths without traversal segments continue to match as before, and the existing host/path tests still pass.

cc @lewiswigmore

Summary by CodeRabbit

Bug Fixes
- Strengthened URL validation and security for HTTP endpoints: restricts to HTTP/HTTPS, ensures host presence, prevents path traversal (including encoded variants), and normalizes paths while preserving trailing slashes for accurate matching.
Tests
- Added tests covering path traversal cases, encoded traversal, single-dot segments, and rejection of non-HTTP schemes to validate endpoint matching.

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

linux-foundation-easycla · 2026-05-06T14:43:22Z

❌ - login: @sebastiondev / name: Sebastion . The commit (8dc3138) is not authorized under a signed CLA. Please click here to be authorized. For further assistance with EasyCLA, please visit our EasyCLA portal and chat with our support bot.

netlify · 2026-05-06T14:43:33Z

❌ Deploy Preview for holmes-docs failed. Why did it fail? →

Name	Link
🔨 Latest commit	`8dc3138`
🔍 Latest deploy log	https://app.netlify.com/projects/holmes-docs/deploys/69fb66eadb155d0007df55ca

coderabbitai · 2026-05-06T14:45:25Z

Walkthrough

Stricter URL validation added to HttpToolset.match_endpoint: only HTTP/HTTPS with netloc allowed; path traversal rejected for '.' and '..' segments (raw and percent-decoded); paths normalized with trailing-slash preservation. New tests cover traversal and non-HTTP scheme rejection.

Changes

Enhanced URL Validation and Path Traversal Protection

Layer / File(s)	Summary
Imports `holmes/plugins/toolsets/http/http_toolset.py`	Added `posixpath` and `unquote` to support path normalization and percent-decoding.
URL Scheme & Netloc Validation `holmes/plugins/toolsets/http/http_toolset.py`	`match_endpoint` enforces `http`/`https` schemes and requires a non-empty netloc.
Path Traversal Checks `holmes/plugins/toolsets/http/http_toolset.py`	Rejects paths containing `.` or `..` segments, checking both raw and percent-decoded path components and returning a rejection message when detected.
Path Normalization `holmes/plugins/toolsets/http/http_toolset.py`	Normalizes request paths with `posixpath.normpath`, preserves trailing slashes when appropriate, and uses the normalized path for endpoint matching.
Tests `tests/plugins/toolsets/http/test_http_toolset.py`	Added tests for dot-dot traversal, encoded traversal, single-dot traversal, and non-HTTP scheme rejection in host/endpoint matching.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main security fix: rejecting path-traversal segments in the HTTP endpoint whitelist with direct reference to the relevant CWE vulnerabilities (CWE-918/CWE-22).
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…2/CWE-918) The HTTP toolset whitelist matched the raw URL path with fnmatch, but requests/urllib3 normalise '..' segments before issuing the request. An LLM-supplied URL such as https://confluence.example.com/wiki/rest/api/../../admin/secret would match the configured '/wiki/rest/api/*' allow-list yet actually hit '/admin/secret' on the wire. Combined with prompt-injection via untrusted issue/alert data, this lets an attacker pivot the agent to arbitrary endpoints on a whitelisted host. This change: - Rejects URL paths whose raw or percent-decoded form contains '..' or '.' segments. - Normalises the path before glob matching so duplicate slashes and trailing slashes still match as before. - Restricts accepted URL schemes to http/https (was previously any scheme with a netloc). Added regression tests covering literal '..', percent-encoded '%2e%2e', single-dot segments, and non-http schemes. Signed-off-by: Sebastion <sebastion@sebastion.dev>

sebastiondev · 2026-05-06T16:06:11Z

Thanks for the automated checks. Quick status update:

DCO: Force-pushed an amended commit with Signed-off-by trailer to fix the DCO check.
EasyCLA: I'll complete the EasyCLA flow for @sebastiondev via the linked sign-up link so the CLA bot can re-evaluate. Will follow up here once that's done.
Netlify docs preview: The failing checks (holmes-docs deploy preview / Header / Redirect / Pages rules) are for the docs site build. This PR only touches holmes/plugins/toolsets/http/http_toolset.py and its test; no markdown/docs files. The deploy preview failure does not appear to be caused by this change. Happy to look closer if a maintainer can point me at the deploy log error — the Netlify log is gated behind the team's account.
License Compliance: 18 issues are also reported, but these aren't visibly tied to the diff in this PR (no dependency changes). Same offer — if a maintainer can share the report, I'll address anything attributable to this change.

@coderabbitai marked no actionable comments. Ready for human review whenever convenient.

coderabbitai · 2026-05-06T16:06:35Z

Tip

For best results, initiate chat on the files or code changes.

@sebastiondev Thanks for the detailed status update — that all sounds reasonable:

DCO: The amended commit with the Signed-off-by trailer should satisfy the check once the bot re-evaluates.
EasyCLA: Completing the CLA flow for your account should unblock that check.
Netlify docs preview: Your assessment looks correct — no documentation or markdown files were modified in this PR, so the deploy preview failure is almost certainly a pre-existing or unrelated issue. Maintainers can verify by checking the Netlify build log against the main branch baseline.
License Compliance: With no dependency changes in the diff, it's very likely those 18 issues are pre-existing and not introduced by this PR. A maintainer can confirm by running the compliance scan against main to compare.

The security fix itself (path-traversal/SSRF hardening in match_endpoint) looks well-scoped and the accompanying tests cover the key attack vectors (raw .., percent-encoded %2e%2e, single ., and non-HTTP schemes). The PR is in good shape for a maintainer pass. 🐰

coderabbitai

🧹 Nitpick comments (1)

tests/plugins/toolsets/http/test_http_toolset.py (1)

184-189: ⚡ Quick win

Make the non-HTTP scheme assertion specific to the expected failure path.

Right now this can pass on any error; asserting the scheme-rejection message makes the test more robust.

Suggested test tightening

     endpoint, error = toolset.match_endpoint("file:///etc/passwd")
     assert endpoint is None
     assert error is not None
+    assert "Unsupported URL scheme" in error

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/plugins/toolsets/http/test_http_toolset.py` around lines 184 - 189, The
test test_non_http_scheme_rejected currently only checks that an error occurred;
tighten it to assert the specific scheme-rejection path by inspecting the
returned error from toolset.match_endpoint("file:///etc/passwd") and verifying
it contains the expected scheme-related message (e.g. includes "file" or an
"unsupported scheme"/"non-http" substring). Update the assertions in
test_non_http_scheme_rejected to assert endpoint is None and that error is not
None and that str(error) (or error.message) contains the scheme-rejection text
produced by match_endpoint to ensure the failure is due to scheme rejection.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/plugins/toolsets/http/test_http_toolset.py`:
- Around line 184-189: The test test_non_http_scheme_rejected currently only
checks that an error occurred; tighten it to assert the specific
scheme-rejection path by inspecting the returned error from
toolset.match_endpoint("file:///etc/passwd") and verifying it contains the
expected scheme-related message (e.g. includes "file" or an "unsupported
scheme"/"non-http" substring). Update the assertions in
test_non_http_scheme_rejected to assert endpoint is None and that error is not
None and that str(error) (or error.message) contains the scheme-rejection text
produced by match_endpoint to ensure the failure is due to scheme rejection.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 11a8ea01-db42-4928-8f53-cbf071e37545

📥 Commits

Reviewing files that changed from the base of the PR and between 6bf7acf and 8dc3138.

📒 Files selected for processing (2)

holmes/plugins/toolsets/http/http_toolset.py
tests/plugins/toolsets/http/test_http_toolset.py

claude Bot reviewed May 6, 2026

View reviewed changes

sebastiondev force-pushed the fix/cwe918-confluence-ssrf-e06a branch from 6bf7acf to 8dc3138 Compare May 6, 2026 16:06

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(http-toolset): reject path-traversal segments in endpoint whitelist (CWE-918/CWE-22)#2007

fix(http-toolset): reject path-traversal segments in endpoint whitelist (CWE-918/CWE-22)#2007
sebastiondev wants to merge 1 commit intoHolmesGPT:masterfrom
sebastiondev:fix/cwe918-confluence-ssrf-e06a

sebastiondev commented May 6, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

claude Bot left a comment

Uh oh!

linux-foundation-easycla Bot commented May 6, 2026 •

edited

Loading

Uh oh!

netlify Bot commented May 6, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 6, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

sebastiondev commented May 6, 2026

Uh oh!

coderabbitai Bot commented May 6, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

sebastiondev commented May 6, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Data flow

Fix

Tests

Security analysis

Adversarial review

Summary by CodeRabbit

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Claude Code Review

Uh oh!

linux-foundation-easycla Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

netlify Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

❌ Deploy Preview for holmes-docs failed. Why did it fail? →

Uh oh!

coderabbitai Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

sebastiondev commented May 6, 2026

Uh oh!

coderabbitai Bot commented May 6, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

sebastiondev commented May 6, 2026 •

edited by coderabbitai Bot

Loading

linux-foundation-easycla Bot commented May 6, 2026 •

edited

Loading

netlify Bot commented May 6, 2026 •

edited

Loading

coderabbitai Bot commented May 6, 2026 •

edited

Loading