feat(marketplace): add source parity for marketplace add#1739
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds Phase 1 source-shape parity to apm marketplace add, aligning with Anthropic-compatible inputs by supporting HTTPS git URLs with #ref, local directory/file sources, and direct hosted marketplace.json URLs, while carrying hosted JSON provenance into install/lockfile flows.
Changes:
- Extend marketplace source modeling + fetching to support direct HTTPS
marketplace.jsonretrieval with digest + ETag/Last-Modified caching. - Propagate hosted JSON provenance (
source_url,source_digest) through resolver/install validation intoapm.lock.yaml. - Add archive extraction safety helpers + new/updated unit tests and docs to cover the expanded
marketplace addsurface.
Show a summary per file
| File | Description |
|---|---|
src/apm_cli/commands/marketplace/__init__.py |
Adds source-shape discrimination (#ref, local file, hosted marketplace.json), cache clearing by source, and UX warnings for unpinned git URLs. |
src/apm_cli/marketplace/client.py |
Implements direct HTTPS fetch path for hosted marketplace.json with digest + conditional refresh metadata and URL-aware cache keys. |
src/apm_cli/marketplace/models.py |
Introduces url kind, carries source_url/source_digest on manifests, and threads provenance through parse_marketplace_json. |
src/apm_cli/marketplace/resolver.py |
Carries manifest provenance into MarketplacePluginResolution and centralizes provenance payload building. |
src/apm_cli/commands/install.py |
Uses resolver-provided provenance payload during marketplace install validation. |
src/apm_cli/deps/lockfile.py |
Adds source_url/source_digest fields to LockedDependency serialization/deserialization. |
src/apm_cli/install/phases/lockfile.py |
Attaches marketplace provenance (including hosted JSON provenance) onto lockfile dependencies. |
src/apm_cli/marketplace/archive.py |
Adds safe archive download/extraction helpers with traversal/link/size guards. |
tests/unit/marketplace/test_marketplace_commands.py |
Adds CLI tests for #ref parsing, unpinned warnings, local file add, and hosted marketplace.json add. |
tests/unit/marketplace/test_marketplace_client.py |
Adds tests for hosted marketplace.json fetch digest capture and conditional headers. |
tests/unit/marketplace/test_client_local.py |
Adds coverage for local direct-file manifest reads. |
tests/unit/marketplace/test_lockfile_provenance.py |
Adds round-trip tests for new lockfile provenance fields and builder attachment behavior. |
tests/unit/marketplace/test_marketplace_install_integration.py |
Verifies provenance propagation from resolver into install validation outcomes. |
tests/unit/marketplace/test_marketplace_archive.py |
Adds safety tests for zip/tar extraction helpers. |
docs/src/content/docs/reference/cli/marketplace.md |
Updates CLI reference to document new accepted source shapes and #ref behavior. |
docs/src/content/docs/consumer/installing-from-marketplaces.md |
Documents new marketplace source shapes and hosted JSON lockfile provenance behavior. |
packages/apm-guide/.apm/skills/apm-usage/commands.md |
Updates the guide skill command reference for the expanded marketplace add SOURCE forms. |
CHANGELOG.md |
Adds an Unreleased entry describing the new marketplace add source shapes and provenance behavior. |
Copilot's findings
- Files reviewed: 18/18 changed files
- Comments generated: 5
| def _detect_archive_format(content_type: str, url: str) -> str: | ||
| """Return ``tar.gz`` or ``zip`` from Content-Type or URL extension.""" | ||
| media_type = content_type.lower().split(";", 1)[0].strip() | ||
| if media_type in {"application/gzip", "application/x-gzip", "application/x-tar"}: | ||
| return "tar.gz" | ||
| if media_type in {"application/zip", "application/x-zip-compressed"}: | ||
| return "zip" |
| final_url = getattr(response, "url", url) | ||
| if isinstance(final_url, str) and urlparse(final_url).scheme.lower() != "https": | ||
| raise ArchiveError(f"Redirect to non-HTTPS URL rejected: {final_url!r}") | ||
|
|
||
| Path(dest_dir).mkdir(parents=True, exist_ok=True) | ||
| archive_format = _detect_archive_format(response.headers.get("Content-Type", ""), url) | ||
| if archive_format == "tar.gz": |
| return source.lower().startswith("https://") and kind in {"github", "gitlab", "git"} | ||
|
|
||
|
|
||
| def _local_source_points_to_file(source) -> bool: |
| For generic-git marketplaces, `marketplace.json` is fetched via a | ||
| sparse-cone clone (only the manifest path is downloaded); APM does | ||
| not forward `GITHUB_APM_PAT` or `GITLAB_APM_TOKEN` to non-GitHub / | ||
| non-GitLab hosts. Authentication falls through to the host's | ||
| `*_APM_PAT` (e.g. `ADO_APM_PAT`) or your local | ||
| `git credential-manager`. See | ||
| non-GitLab hosts. Authentication falls through to the host's local git | ||
| credential helper when one is configured. See |
| - `apm marketplace add` now accepts Anthropic-compatible git URL `#ref`, local | ||
| file, and hosted `marketplace.json` sources, recording hosted JSON provenance | ||
| in the lockfile. (closes #676) |
Relocate the generic download+extract helper from src/apm_cli/marketplace/archive.py to src/apm_cli/utils/archive.py so a future URL-package installer can reuse it without importing marketplace internals. The module was already marketplace-free; this makes the decoupling structural. Zip-slip / path-traversal / decompression-bomb guards stay in the helper so they apply to every caller. Honors #692 forward-compat constraint 3 (standalone archive extraction). No production caller imported the helper yet; the only importer (the safety test) moves to tests/unit/utils/test_archive.py. Refs #692 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Folds copilot-pull-request-reviewer follow-ups: x-tar format detection, post-redirect format detection, type hint, GITLAB_APM_PAT doc fix, changelog PR number. Refs #676 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 1 | 1 | Clean additive design; two separate remote-URL detection heuristics could converge. |
| CLI Logging Expert | 0 | 0 | 1 | Warnings are ASCII, symbol-tagged, and actionable. |
| DevX UX Expert | 0 | 0 | 1 | #ref parity + unpinned-ref nudge match the Anthropic mental model. |
| Supply Chain Security | 0 | 2 | 1 | Live fetch path is safe; archive bomb-guard trusts declared sizes (not yet wired). |
| OSS Growth Hacker | 0 | 0 | 1 | Anthropic parity is an adoption story; contributor credit is healthy. |
| Test Coverage Expert | 0 | 1 | 1 | 3 source shapes + provenance well covered; size-cap guard untested. |
| Auth Expert | 0 | 0 | 1 | #692 fence confirmed: no private-endpoint auth wired in Phase 1. |
| Doc Writer | 0 | 0 | 1 | CHANGELOG + consumer + reference + guide skill all updated consistently. |
| Performance Expert | 0 | 0 | 1 | ETag/304 conditional refresh is a sensible bandwidth win. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 4 follow-ups
- [Supply Chain Security] When
archive.pyis wired into install execution under [FEATURE] URL-based marketplace work Extension #692, enforce actual-bytes accounting (and prefer streaming) rather than trustingmember.size/info.file_size-- a malicious archive can under-declare its header sizes and slip past the decompression-bomb cap. Not blocking now because the helper is salvaged plumbing not yet on any execution path. - [Test Coverage Expert] Add a decompression-bomb test that drives
_extract_zip/_extract_tar_gzpast_MAX_UNCOMPRESSED_BYTES; the traversal, symlink, format, and redirect guards are tested but the size cap is the one guard with no assertion. - [Python Architect] Fold the CLI
_is_remote_marketplace_json_url(path endswith/marketplace.json) andMarketplaceSource.is_remote_manifest_url(path == "") into one shared classifier so the "is this a hosted JSON URL" decision has a single source of truth. - [Supply Chain Security]
_fetch_url_directalready acceptsexpected_digestbutfetch_marketplacenever passes it; threading a stored digest through refresh would give TOFU pinning -- defer to [FEATURE] URL-based marketplace work Extension #692 since Phase 1 is provenance-only.
Architecture
classDiagram
class MarketplaceSource {
+str name
+str url
+str ref
+str path
+kind() str
+is_remote_manifest_url() bool
+display_source() str
}
class FetchResult {
+dict data
+str digest
+str etag
+str last_modified
}
class MarketplaceManifest {
+str source_url
+str source_digest
+find_plugin() MarketplacePlugin
}
class MarketplacePluginResolution {
+str source_url
+str source_digest
+provenance(name, plugin) dict
}
class LockedDependency {
+str source_url
+str source_digest
+to_dict() dict
+from_dict() LockedDependency
}
MarketplaceSource --> FetchResult : url kind fetch
FetchResult --> MarketplaceManifest : parse_marketplace_json
MarketplaceManifest --> MarketplacePluginResolution : resolve
MarketplacePluginResolution --> LockedDependency : provenance()
class MarketplaceSource:::changed
class FetchResult:::added
class MarketplacePluginResolution:::changed
class LockedDependency:::changed
flowchart TD
A[apm marketplace add SOURCE] --> B[_split_source_fragment_ref]
B --> C{shape discriminator}
C -->|https git url| D[git probe + GitCache]
C -->|local dir or file| E[_fetch_local / _fetch_local_file]
C -->|hosted marketplace.json| F[_fetch_url_direct HTTPS-only]
F --> G[size cap + digest + redirect guard]
E --> H[ensure_path_within containment]
G --> I[parse_marketplace_json]
H --> I
D --> I
I --> J[MarketplaceManifest source_url/source_digest]
J --> K[MarketplacePluginResolution.provenance]
K --> L[apm.lock.yaml source_url/source_digest]
sequenceDiagram
participant CLI as marketplace add
participant C as client.fetch_marketplace
participant N as _fetch_url_direct
participant Cache as sidecar cache
CLI->>C: url-kind MarketplaceSource
C->>Cache: read stale meta (etag/last-modified)
C->>N: GET with conditional headers
alt 304 Not Modified
N-->>C: None
C->>Cache: serve stale + rewrite meta
else 200 OK
N-->>C: FetchResult(data, sha256 digest)
C->>Cache: write data + digest + validators
end
C-->>CLI: MarketplaceManifest(source_url, source_digest)
Recommendation
Ship with the four follow-ups tracked. Nothing blocks merge: CI is green, the live hosted-JSON fetch path is HTTPS-only and bounded, and every #692 forward-compat constraint holds. The single highest-signal item to carry into #692 is follow-up 1 -- harden archive.py's decompression-bomb accounting to use actual extracted bytes before that helper is wired onto a real install path.
Full per-persona findings
Python Architect
- [recommended] Two independent heuristics decide "is this a hosted marketplace.json URL":
_is_remote_marketplace_json_urlincommands/marketplace/__init__.pycheckspath.endswith("/marketplace.json"), whileMarketplaceSource.is_remote_manifest_urlinmarketplace/models.pycheckspath == "" and https. They agree today because the CLI setspath="", but the divergent predicates are a latent drift hazard.
Suggested: extract one shared classifier (or have the CLI defer entirely to the model property) so the url-kind decision has a single definition. - [nit]
MarketplacePluginResolution.provenance()centralizing the install-provenance payload is the right call -- it removes the inline dict fromcommands/install.pyand gives [FEATURE] URL-based marketplace work Extension #692 a single seam to extend.
CLI Logging Expert
- [nit] Output is consistent with the house style:
logger.warning(..., symbol="warning")for the unpinned-ref nudge,logger.start(..., symbol="gear")before the slow probe, andverbose_detailnow readsfetch_source.kindso url-kind sources report correctly. All ASCII, no Unicode.
DevX UX Expert
- [nit] The
#reffragment parsing plus the actionable unpinned-ref warning (Pin this git marketplace ... #v1.0.0) match the Anthropic mental model and steer users toward reproducible pins without hard-failing.--ref+#refconflict is a clean hard error rather than a silent pick.
Supply Chain Security Expert
- [recommended]
archive.pydecompression-bomb accounting sumsmember.size(tar) /info.file_size(zip), both header-declared values, and then doesdst.write(src.read())reading the whole member into memory. A crafted archive can under-declare sizes to pass the_MAX_UNCOMPRESSED_BYTEScheck. Not blocking: the helper is salvaged Feature/676 url based marketplace #691 plumbing and is not wired into any Phase 1 execution path.
Suggested: when [FEATURE] URL-based marketplace work Extension #692 wires extraction in, account actual bytes read (chunked) and abort mid-stream past the cap. - [recommended]
_fetch_url_directacceptsexpected_digestand verifies it, butfetch_marketplacenever supplies one, so refreshes are not TOFU-pinned to a previously seen digest. Acceptable for a provenance-only Phase 1; pinning belongs to [FEATURE] URL-based marketplace work Extension #692. - [nit] The live hosted-JSON path is solid: HTTPS-only scheme check, post-redirect non-HTTPS rejection, Content-Length and actual-length 10MB caps,
sha256:<hex>digest, JSON-root-is-object validation, andensure_path_withincontainment on the local-file read. Fails closed at each gate.
OSS Growth Hacker
- [nit] Exact
marketplace.jsonparity with Anthropic is a migration-unblocking adoption story worth surfacing in release notes, and the explicit @Vicente-Pastor credit for the Feature/676 url based marketplace #691 foundation is the kind of contributor-funnel signal that compounds.
Test Coverage Expert
- [recommended] The
_MAX_UNCOMPRESSED_BYTESguard inarchive.pyhas no test.test_archive.pycovers path traversal, symlink rejection, uncompressed-tar rejection, redirect-URL format detection, and safe extraction -- but nothing drives a member past the size cap.
Suggested: addtest_extract_zip_rejects_decompression_bombassertingArchiveErrorpast the limit. - [nit] The three board-mandated source shapes each have a direct test (
test_add_git_url_fragment_sets_ref_and_canonical_url,test_add_local_marketplace_json_file,test_add_remote_marketplace_json_url_fetches_directly) plus client-level digest/ETag/conditional-header coverage and lockfile provenance round-trip. Scenario coverage matches the brief.
Auth Expert
- [nit] [FEATURE] URL-based marketplace work Extension #692 fence confirmed to hold:
_fetch_url_directsends only{"User-Agent": "apm-cli"}-- no token, no AuthResolver. The url-kind branch infetch_marketplacedoes not passauth_resolveronward, and docs explicitly state hosted JSON URLs are public HTTPS only this phase. No private-endpoint credential wiring leaked into Phase 1.
Doc Writer
- [nit] Documentation is consistent and drift-free: CHANGELOG
[Unreleased]entry,consumer/installing-from-marketplaces.md,reference/cli/marketplace.md(flag table + trust-boundary note), and theapm-usage/commands.mdguide skill all describe the new shapes and the public-only Phase 1 fence.
Performance Expert
- [nit] ETag / Last-Modified conditional refresh with a 304 stale-serve path avoids re-downloading unchanged catalogs, and the url-kind cache key (
url__<sha256[:16]>) cleanly namespaces hosted sources. Sensible bandwidth win with no hot-path regression.
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
The decompression-bomb guard summed header-declared member sizes (member.size / info.file_size) and then buffered each member whole via dst.write(src.read()). A crafted archive can under-declare its header sizes to slip past a header-based cap, and a single huge member was read into memory regardless. Stream each member in chunks, count the ACTUAL bytes read, and abort mid-stream once the cumulative total exceeds the cap. Adds zip + tar.gz regression tests (mutation-break verified) exercising the byte-counting limit. Addresses apm-review-panel follow-ups 1 (Supply Chain) and 2 (Test Coverage) on #1739. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two predicates decided "is this a hosted marketplace.json URL": the CLI _is_remote_marketplace_json_url checked path endswith /marketplace.json, while MarketplaceSource.is_remote_manifest_url checked path == "" plus https. They agreed only because the CLI sets path="", a latent drift hazard. Both now delegate to one shared url_names_remote_manifest() in marketplace.models (HTTPS + host + path endswith /marketplace.json). The source-kind discriminator stays a derived, validated string (#692 constraint 1), not a closed enum. Also corrects a stray GITLAB_APM_TOKEN code comment to GITLAB_APM_PAT. Addresses apm-review-panel follow-up 3 (Python Architect) and the Copilot env-var-name signal on #1739. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Shepherd-driver follow-up pass: ready-to-merge (advisory)A fold-in convergence pass folded the apm-review-panel Folded in this run
Copilot signals reviewedThe five Copilot inline comments were all addressed in this PR:
Deferred (out-of-scope follow-up)
Regression-trap evidence (mutation-break gate)
#692 forward-compat constraintsAll six still hold after this pass: (1) source-kind discriminator is a derived Lint
CIGreen on head
Ready for maintainer review. No auto-merge. |
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 0 | 3 | Clean new archive helper and marketplace URL fetcher; three minor architectural nits on memory cap, containment semantics, and streaming. |
| CLI Logging Expert | 0 | 0 | 2 | No blocking CLI output concerns; two nits on verbose label naming and warning ordering. |
| DevX UX Expert | 0 | 0 | 3 | Hosted JSON URL support is ergonomic; three minor UX gaps around alias defaults, silent flag ignoring, and docs applicability. |
| Supply Chain Security Expert | 0 | 2 | 1 | HTTPS enforcement, traversal guards, and bomb limits are solid; missing test for archive HTTP rejection and unbounded response.content read in archive.py are the gaps. |
| OSS Growth Hacker | 0 | 0 | 2 | Marketplace source parity is a strong growth beat; CHANGELOG benefit framing and docs phase wording can be tightened. |
| Auth Expert | 0 | 0 | 0 | No auth surface touched. _fetch_url_direct sends only User-Agent over mandatory HTTPS with redirect-scheme validation. AuthResolver, tokens, and credential helpers are not in scope for this path. |
| Doc Writer | 0 | 2 | 1 | Docs are accurate and well-scoped; two drift gaps: lockfile spec omits new source_url/source_digest fields, and generic-git auth text drops the still-live ADO_APM_PAT path. |
| Test Coverage Expert | 0 | 0 | 0 | Schema failure -- see extras. |
| Performance Expert | 0 | 1 | 2 | ETag/304 conditional fetch is well-implemented; archive helper buffers full payload in RAM instead of streaming -- flag before hot-path adoption. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 4 follow-ups
- [Supply Chain Security Expert] Add unit test asserting download_and_extract_archive rejects non-HTTPS URLs -- The HTTPS-only guard at archive.py L158 is a secure-by-default promise with no regression trap. A one-line pytest.raises test closes the gap.
- [Doc Writer] Document source_url and source_digest in lockfile-spec.md per-entry field table -- The lockfile spec is the authoritative reference and the marketplace docs link readers there. Missing fields create silent drift on a portable-by-manifest contract.
- [Python Architect] Stream archive download to disk or add TODO for [FEATURE] URL-based marketplace work Extension #692 before utils/archive becomes a hot-path helper -- Three panelists flagged the same response.content double-buffer pattern. Not exploitable today but the helper is being extracted for reuse; fix before it propagates.
- [Doc Writer] Restore *_APM_PAT fallthrough wording in cli/marketplace.md generic-git auth text -- Edited docs dropped ADO_APM_PAT from the generic section while the ADO section still references it, creating an internal contradiction on a live auth mechanism.
Architecture
classDiagram
direction LR
class MarketplaceSource {
<<ValueObject>>
+name str
+owner str
+repo str
+host str
+kind str
+url str
+ref str
}
class FetchResult {
<<ValueObject>>
+data dict
+digest str
+etag str
+last_modified str
}
class MarketplaceManifest {
<<ValueObject>>
+plugins list
}
class ArchiveError {
<<Exception>>
}
class MarketplaceFetchError {
<<Exception>>
}
class AuthResolver {
<<Strategy>>
+try_with_fallback(host, fn)
+classify_host(host) HostInfo
}
class fetch_marketplace {
<<Facade>>
}
class download_and_extract_archive {
<<Facade>>
}
fetch_marketplace ..> MarketplaceSource : reads
fetch_marketplace ..> FetchResult : uses
fetch_marketplace ..> MarketplaceManifest : returns
fetch_marketplace ..> AuthResolver : delegates auth for git paths
download_and_extract_archive ..> ArchiveError : raises
note for FetchResult "Frozen dataclass value object"
note for fetch_marketplace "Fetcher dispatch keyed by source.kind"
class FetchResult:::touched
class download_and_extract_archive:::touched
classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
A["CLI: apm marketplace add"] --> B{"source.kind?"}
B -->|url| C["_fetch_url_direct - HTTPS GET with ETag and Last-Modified"]
B -->|github/gitlab| D["_fetch_file - API plus auth fallback"]
B -->|local| E{"path is file?"}
B -->|git| F["git-backed marketplace.json fetch"]
E -->|yes| G["_fetch_local_file - direct read plus containment guard"]
E -->|no| H["_fetch_local_direct_read - resolve marketplace.json candidates"]
C --> I{"HTTP 304?"}
I -->|yes| J["Serve stale cache"]
I -->|no| K["Validate digest and size cap"]
K --> L["_write_cache - data and meta"]
L --> M["parse_marketplace_json"]
G --> M
H --> M
D --> M
M --> P["MarketplaceManifest source_url/source_digest"]
subgraph Archive["utils/archive.py"]
Q["download_and_extract_archive"] --> R{"detect format"}
R -->|tar.gz| S["_extract_tar_gz guarded"]
R -->|zip| T["_extract_zip guarded"]
end
Recommendation
The PR is well-architected, correctly scoped, and all panelists agree on direction. The two highest-signal follow-ups (HTTPS-rejection test and lockfile-spec doc update) are small, bounded tasks that can land as a fast follow or be folded into this PR at the author's discretion. Neither represents a user-facing regression today. Ship and track the four follow-ups above.
Full per-persona findings
Python Architect
- [nit] response.content buffers entire archive into memory before streaming extraction at
src/apm_cli/utils/archive.py:163
download_and_extract_archive calls response.content which materializes the full body into RAM, then wraps it in io.BytesIO for streaming extraction. For large archives near the 512 MB cap this doubles peak memory. Using iter_content with a tempfile or requests stream=True piped directly into the tarfile/zipfile would bound RSS. Acceptable at current 512 MB ceiling but worth a TODO comment for [FEATURE] URL-based marketplace work Extension #692 forward-compat where archives grow.
Suggested: Add a # TODO([FEATURE] URL-based marketplace work Extension #692): stream response to disk to avoid 2x peak RSS for large archives comment, or switch to stream=True plus SpooledTemporaryFile now. - [nit] _fetch_local_file containment check is a tautology at
src/apm_cli/marketplace/client.py:595
ensure_path_within(manifest_file, manifest_file.parent) always passes for any file that exists because a file is always within its own parent directory. The guard was likely intended to validate that the user-supplied path does not escape the project root or a configured base directory, not just its own parent. As written it cannot catch path escapes.
Suggested: Pass the project root or APM workspace dir as the containment boundary instead of manifest_file.parent, or document why parent is the intended single-file trust boundary. - [nit] _MAX_MARKETPLACE_JSON_BYTES cap not enforced until after response.content allocation at
src/apm_cli/marketplace/client.py:258
_fetch_url_direct reads resp.content (full body in RAM) before comparing len(raw) > _MAX_MARKETPLACE_JSON_BYTES. An adversarial server ignoring Content-Length can force allocation before the check fires. Bounded at 10 MB when Content-Length is honest, but streaming would make the guard deterministic.
Suggested: Use stream=True and read bounded chunks up to _MAX_MARKETPLACE_JSON_BYTES before decoding.
CLI Logging Expert
- [nit] Raw 'Kind: url' verbose label is opaque to humans
The verbose output shows 'Kind: url' which is an internal enum value leak. CLI output should use user-facing language like 'Source: remote URL'.
Suggested: Map internal kind enum to user-facing labels before rendering, e.g. url -> remote URL, git -> git repo, local -> local path. - [nit] Unpinned git URL warning ordering reads slightly odd
The warning about unpinned git URLs appears around the action start rather than before work begins or in the final caveat section. This is cosmetic; the message is actionable and ASCII-safe.
Suggested: Emit the warning before logger.start or collect it for the post-resolution summary.
DevX UX Expert
- [nit] Hostname-only default alias may collide when multiple hosted JSON URLs share the same host
If a user registers both https://example.com/foo/marketplace.json and https://example.com/bar/marketplace.json, the default alias derived from hostname alone would collide. This only affects multi-registry power users but is surprising.
Suggested: Derive the default alias from hostname plus first meaningful path segment, or ask users to choose when a collision is detected. - [nit] --host flag is silently ignored for hosted JSON URLs without warning
When a user passes --host to a hosted JSON URL, the URL already carries the host, so the flag has no effect. Silent flag ignoring makes diagnosis harder.
Suggested: Emit a warning when --host is provided but the resolved source is a hosted marketplace.json URL. - [nit] Docs do not clarify that --ref applies to git sources only
The --ref flag is meaningful for git-backed sources but not for a static hosted JSON file. Flag applicability should be explicit to avoid user confusion.
Suggested: Add '(git sources only)' to the --ref docs/help text where practical.
Supply Chain Security Expert
- [recommended] download_and_extract_archive buffers entire response.content without Content-Length pre-check or streaming cap at
src/apm_cli/utils/archive.py:163
Unlike _fetch_url_direct in client.py which pre-checks Content-Length before accessing resp.content, archive.py calls response.content unconditionally. A malicious server can stream gigabytes into RAM before the decompression-bomb guard fires. The _MAX_UNCOMPRESSED_BYTES guard only applies post-decompress; the raw compressed payload has no cap. This is not blocking because it requires a malicious HTTPS server the user explicitly configured, but it weakens the fail-closed posture.
Suggested: Add a Content-Length pre-check and/or use iter_content with a running byte counter before buffering into memory. Cap at a reasonable compressed-payload limit. - [recommended] No test asserts download_and_extract_archive rejects non-HTTPS URLs at
tests/unit/utils/test_archive.py
The HTTPS-only guard at archive.py L158 is a critical security boundary preventing MITM archive injection but has no dedicated unit test. The traversal, symlink, and bomb guards all have tests. This is a regression-trap gap on a secure-by-default promise.
Suggested: Add test_download_rejects_http_url with pytest.raises(ArchiveError, match='HTTPS') for an http:// archive URL.
Proof (missing at): `` -- proves: Archive download rejects non-HTTPS URLs to prevent MITM archive injection [secure-by-default] - [nit] _fetch_local_file containment check needs intent documented or broader boundary at
src/apm_cli/marketplace/client.py:597
ensure_path_within(manifest_file, manifest_file.parent) will pass for any resolved Path because a file is within its own parent directory. This may still document symlink-resolution intent, but the trust boundary is not obvious.
OSS Growth Hacker
- [nit] CHANGELOG entry leads with implementation detail instead of user benefit at
CHANGELOG.md
Users scanning CHANGELOG decide whether to upgrade based on value framing, not protocol taxonomy. Lead with add any marketplace by URL, local path, or hosted JSON catalog.
Suggested: Rewrite lead sentence around user outcome: apm marketplace add now works with HTTPS git repos, local paths, and hosted marketplace.json catalogs. - [nit] Docs expose internal Phase 1 roadmap language to end users at
docs/src/content/docs/consumer/installing-from-marketplaces.md
Visitors do not know what Phase 1 means or how many phases exist. State the current public/local scope as a product fact and point to the tracked follow-up.
Suggested: Replace phase language with: Hosted marketplace.json URLs currently use public HTTPS without custom auth headers; private URL authentication is tracked separately.
Auth Expert
No findings.
Doc Writer
- [recommended] Lockfile spec reference does not document the new source_url / source_digest per-entry fields the PR adds. at
docs/src/content/docs/reference/lockfile-spec.md:143
lockfile-spec.md is the authoritative per-entry field table. The PR adds source_url and source_digest to LockedDependency, serializes them in to_dict/from_dict, and the marketplace docs point readers to the lockfile reference. The canonical reference never defines the new fields.
Suggested: Add two rows to the per-entry table after marketplace_plugin_name for source_url and source_digest. - [recommended] Generic-git auth fallthrough text omits ADO_APM_PAT / *_APM_PAT. at
docs/src/content/docs/reference/cli/marketplace.md:122
Edited docs replaced the host *_APM_PAT path with local git credential helper only, but ADO_APM_PAT is still a live mechanism and the same page's ADO section mentions it. The generic statement is incomplete and conflicts with the ADO guidance.
Suggested: Restore host PAT phrasing, e.g. authentication falls through to the host's *_APM_PAT (such as ADO_APM_PAT) or local git credential helper. Mirror the wording in installing-from-marketplaces.md. - [nit] Intro frames the feature as parity with another product rather than APM's own capability. at
docs/src/content/docs/consumer/installing-from-marketplaces.md:67
User-facing docs should be self-contained and durable. Comparative parity framing is okay in PR/release notes but less useful in task docs.
Suggested: Rephrase to state APM accepts several source shapes beyond GitHub repos.
Test Coverage Expert
No findings.
Performance Expert
- [recommended] Archive download buffers entire response in memory before extraction, risking high peak RSS. at
src/apm_cli/utils/archive.py:157
download_and_extract_archive calls response.content (full body in RAM), then wraps it for tarfile/zipfile. With _MAX_UNCOMPRESSED_BYTES at 512 MB, peak RSS can be high. The extraction guard is correct, but the initial download still materializes the compressed payload. Not blocking because this code path is not yet invoked on the install hot path.
Suggested: Use a streaming or bounded download path before extraction; for zip, use a bounded spool-to-file strategy because zip requires random access. - [nit] No requests.Session reuse across marketplace fetches. at
src/apm_cli/marketplace/client.py:243
_fetch_url_direct uses bare requests.get, creating a new connection per source. For the single-source add flow this is negligible; bulk operations would benefit from connection reuse. - [nit] Content-Length guard reads header but still downloads full body on oversized response when header is absent or wrong. at
src/apm_cli/marketplace/client.py:253
_fetch_url_direct checks Content-Length, then resp.content. A streaming read with a running counter would abort early. Marginal for the 10 MB marketplace.json cap but good hygiene.
This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ketplace-source-parity Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> # Conflicts: # CHANGELOG.md
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 0 | 0 | Strategy dispatch and standalone archive utility are ready. |
| CLI Logging Expert | 0 | 0 | 0 | Help, warning, and verbose output now explain the new source shapes. |
| DevX UX Expert | 0 | 0 | 0 | Source-shape UX is discoverable and scoped to git/local/hosted JSON behavior. |
| Supply Chain Security Expert | 0 | 0 | 0 | HTTPS-only, redirect rejection, cookie clearing, digest, and staging safeguards are covered. |
| OSS Growth Hacker | 0 | 0 | 0 | Changelog and docs lead with the user value. |
| Auth Expert | 0 | 0 | 0 | URL-kind remains credential-free; git-backed auth fallthrough is documented. |
| Doc Writer | 0 | 0 | 0 | Reference and consumer docs match the code surface. |
| Test Coverage Expert | 0 | 0 | 0 | Regression tests cover streaming, redirects, cookie clearing, staging, aliasing, and provenance units. |
| Performance Expert | 0 | 0 | 0 | Streaming and session reuse are appropriate for Phase 1 volume. |
Counts are signal strength, not gates. The maintainer ships.
Recommendation
Ship now. No in-scope foldable follow-ups remain for #676 Phase 1.
Full per-persona findings
Python Architect
No findings.
CLI Logging Expert
No findings.
DevX UX Expert
No findings.
Supply Chain Security Expert
No findings.
OSS Growth Hacker
No findings.
Auth Expert
No findings.
Doc Writer
No findings.
Test Coverage Expert
No findings.
Performance Expert
No findings.
This panel is advisory. Re-apply the panel-review label after new changes if another pass is needed.
feat(marketplace): add source parity for marketplace add
TL;DR
apm marketplace addnow accepts the three Anthropic-compatible source shapes missing from APM: HTTPS git URLs with#ref, local directory or file paths, and hostedmarketplace.jsonURLs. The implementation keeps Phase 1 public + local only, records hosted JSON provenance inapm.lock.yaml, and carries over the transferable safety plumbing from #691. Credit to @Vicente-Pastor for the foundational design and plumbing in #691.Note
Closes #676. AuthResolver wiring for private URL endpoints remains Phase 2 (#692), and
.well-known/agent-skillsstays out of scope for this slice.Problem (WHY)
addsource parity with the Anthropic spec #676 says to accept the missing shapes: “git URL with#ref, local path to a dir or file, and a remotemarketplace.jsonURL.”addsource parity with the Anthropic spec #676 explicitly fences auth and.well-knownwork, aligning with Agent Skills guidance to “Add what the agent lacks, omit what it knows.”Approach (WHAT)
apm marketplace addthrough a source-shape discriminator for git URL fragments, local file paths, and hosted JSON URLs.marketplace.jsondirectly over HTTPS with digest calculation, ETag / Last-Modified metadata, and stale-cache fallback.source_urlandsource_digestto marketplace provenance so installed packages can record it inapm.lock.yaml.assets/and reference them fromSKILL.mdso they only load when needed.”Implementation (HOW)
src/apm_cli/commands/marketplace/__init__.pynow strips HTTPS#reffragments intoref, detects direct hostedmarketplace.jsonURLs, skips git path probing for direct JSON/local files, and warns when HTTPS git URLs use the implicit mutable default ref.src/apm_cli/marketplace/client.pyaddsFetchResult, HTTPS-only direct JSON fetch, digest calculation, ETag / Last-Modified conditional refresh, direct local-file reads guarded byensure_path_within(), and URL-aware cache keys.MarketplaceSourcederives aurlkind for hosted JSON documents;MarketplaceManifestcarriessource_url/source_digest;MarketplacePluginResolution.provenance()centralizes the install provenance payload.LockedDependencyserializes additivesource_url/source_digestfields, andLockfileBuilderattaches them when packages originate from a hosted JSON marketplace.src/apm_cli/utils/archive.pyadds reusable zip / tar.gz extraction helpers with path traversal, absolute path, symlink / hardlink, redirect, and decompression-size guards. Lives underutils/(notmarketplace/) so a future URL-package installer reuses it without importing marketplace internals (#692 constraint 3).apm marketplace addsurface.Diagrams
Legend: The diagram shows how the new
apm marketplace addsource shapes flow into existing manifest parsing and lockfile provenance.flowchart LR subgraph AddCommand[apm marketplace add] SourceArg[SOURCE argument] Shape[shape discriminator] end subgraph Fetch[Fetch path] Git[git repo marketplace.json] Local[local dir or file] Remote[hosted marketplace.json] end subgraph Provenance[Install provenance] Manifest[MarketplaceManifest] Lock[apm.lock.yaml] end SourceArg --> Shape Shape --> Git Shape --> Local Shape --> Remote Git --> Manifest Local --> Manifest Remote --> Manifest Manifest --> Lock classDef new stroke-dasharray: 5 5; class Shape,Remote,Lock new;Trade-offs
#refis only parsed for HTTPS git URLs. Chose a narrow discriminator; rejected fragment parsing for local paths or hosted JSON where fragments are not a git ref contract.path == "". Chose an additive sentinel on the existingMarketplaceSource; rejected a larger registry schema migration for this phase.addsource parity with the Anthropic spec #676.Benefits
apm marketplace add https://gitlab.com/acme/catalog.git#v1.2.3stores the canonical URL without the fragment and usesv1.2.3as the ref.apm marketplace add ./vendor/marketplace.json --name vendorregisters a local file without treating it as a git repo.apm marketplace add https://catalog.example.com/marketplace.json --name catalogfetches the JSON directly over HTTPS and caches validator metadata.source_urlandsource_digestinapm.lock.yaml.../escape.txtand symlinks fail closed before extraction.Validation
uv run --extra dev pytest tests/unit -q:Unit test output
uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/ && uv run --extra dev python -m pylint --disable=all --enable=R0801 --min-similarity-lines=10 --fail-on=R0801 src/apm_cli/ && bash scripts/lint-auth-signals.sh:Portable CI grep guards for YAML I/O, file length, and raw
str(path.relative_to(...)):npx --yes -p @mermaid-js/mermaid-cli mmdc -i .mermaid-check/diag1.mmd -o .mermaid-check/diag1.svg --quiet:Scenario Evidence
#refand get a stable ref instead of a fragment-bearing URL.tests/unit/marketplace/test_marketplace_commands.py::TestMarketplaceAdd::test_add_git_url_fragment_sets_ref_and_canonical_urlmarketplace.jsonURL without publishing a git repo.tests/unit/marketplace/test_marketplace_commands.py::TestMarketplaceAdd::test_add_remote_marketplace_json_url_fetches_directlytests/unit/marketplace/test_marketplace_client.py::TestFetchMarketplace::test_fetch_remote_marketplace_url_records_digest_and_etagmarketplace.jsonfile or directory without escaping the allowed filesystem root.tests/unit/marketplace/test_marketplace_commands.py::TestMarketplaceAdd::test_add_local_marketplace_json_filetests/unit/marketplace/test_client_local.py::test_fetch_local_file_direct_readtests/unit/marketplace/test_lockfile_provenance.py::TestLockedDependencyProvenance::test_lockfile_builder_attaches_marketplace_source_provenancetests/unit/marketplace/test_marketplace_install_integration.py::TestMarketplaceResolutionProvenance::test_resolution_includes_manifest_source_url_and_digesttests/unit/utils/test_archive.py::test_extract_zip_rejects_path_traversaltests/unit/utils/test_archive.py::test_extract_tar_gz_rejects_symlinkForward-compat constraints (#692)
Important
#692 (URL-source package install:
apm install <full-package-URL>with digest pinning + private-endpoint auth) is deferred Phase 2 work, not implemented here. The board bound its 6 forward-compat constraints onto this PR so #676's primitives do not foreclose it. Each is honored:source_typediscriminator is an open-ended validated string, not a closedgithub/urlenum.MarketplaceSource.kindis a derived host-classified string (local/url/github/gitlab/git);to_dict/from_dictpersisturl/ref/path/owner/repo/host(neverkind), sourl-package/ OCI / ADO add a value without a schema migration._fetch_url_direct()is HTTPS-only, returns raw bytes +sha256:digest + ETag/Last-Modified, and bakes in no Agent-Skills / marketplace-index parsing; parsing happens separately inparse_marketplace_json.marketplace/archive.py->utils/archive.py; zip-slip / traversal / decompression-bomb guards stay in the helper; no marketplace import.sha256:<hex>.LockedDependency/MarketplaceManifestcarrysource_url/source_digest; digest computed as"sha256:" + sha256(raw).hexdigest()._fetch_url_directand the parser; the current per-entry parser already dispatches on inspected keys.resolve_url_package()is a parallel path, not a fork.url-kind sources resolve through the sameMarketplacePluginResolution.provenance()surface as other kinds; a future package installer is a parallel function, not a special-case branch insideresolve_marketplace_plugin.Re-validated after the archive move:
uv run --extra dev pytest tests/unit/utils/test_archive.py tests/unit/marketplace -q->1512 passed; full CI-mirror lint chain green.How to test
apm marketplace add https://gitlab.com/acme/catalog.git#v1.2.3 --name catalogand confirmapm marketplace listshows refv1.2.3.apm marketplace add ./vendor/marketplace.json --name vendorand confirmapm marketplace browse vendorlists plugins from the file.apm marketplace add https://catalog.example.com/marketplace.json --name catalogagainst a public test catalog and confirm it registers without cloning git.tool@catalogand inspectapm.lock.yamlforsource_urlandsource_digeston the locked dependency.apm marketplace update catalogand confirm cached hosted JSON sources refresh through the same command path.Closes #676
Refs #692
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com