Skip to content

feat(marketplace): add source parity for marketplace add#1739

Merged
danielmeppiel merged 9 commits into
mainfrom
danielmeppiel/676-marketplace-source-parity
Jun 11, 2026
Merged

feat(marketplace): add source parity for marketplace add#1739
danielmeppiel merged 9 commits into
mainfrom
danielmeppiel/676-marketplace-source-parity

Conversation

@danielmeppiel

@danielmeppiel danielmeppiel commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

feat(marketplace): add source parity for marketplace add

TL;DR

apm marketplace add now accepts the three Anthropic-compatible source shapes missing from APM: HTTPS git URLs with #ref, local directory or file paths, and hosted marketplace.json URLs. The implementation keeps Phase 1 public + local only, records hosted JSON provenance in apm.lock.yaml, and carries over the transferable safety plumbing from #691. Credit to @Vicente-Pastor for the foundational design and plumbing in #691.

Note

Closes #676. AuthResolver wiring for private URL endpoints remains Phase 2 (#692), and .well-known/agent-skills stays out of scope for this slice.

Problem (WHY)

  • APM only registered repo-shaped marketplaces; issue Marketplace add source parity with the Anthropic spec #676 says to accept the missing shapes: “git URL with #ref, local path to a dir or file, and a remote marketplace.json URL.”
  • Direct hosted JSON sources had no source digest or canonical source provenance to carry into installed dependencies; PROSE’s validation standard says “Grounding outputs in deterministic tool execution transforms probabilistic generation into verifiable action.”
  • Local path and archive inputs expand the user-controlled filesystem surface, so path containment and zip-slip guards need to be concrete rather than implicit; Agent Skills guidance says agents “pattern-match well against concrete structures.”
  • [!] Phase 1 must stay bounded: issue Marketplace add source parity with the Anthropic spec #676 explicitly fences auth and .well-known work, aligning with Agent Skills guidance to “Add what the agent lacks, omit what it knows.”

Approach (WHAT)

# Fix Principle
1 Route apm marketplace add through a source-shape discriminator for git URL fragments, local file paths, and hosted JSON URLs. “agents pattern-match well against concrete structures”
2 Fetch hosted marketplace.json directly over HTTPS with digest calculation, ETag / Last-Modified metadata, and stale-cache fallback. “Grounding outputs in deterministic tool execution transforms probabilistic generation into verifiable action.”
3 Keep auth out of direct hosted JSON fetches and preserve existing git-host credential behavior. “Add what the agent lacks, omit what it knows”
4 Attach hosted JSON source_url and source_digest to marketplace provenance so installed packages can record it in apm.lock.yaml. “store them in assets/ and reference them from SKILL.md so they only load when needed.”

Implementation (HOW)

Area What changed
CLI source parsing src/apm_cli/commands/marketplace/__init__.py now strips HTTPS #ref fragments into ref, detects direct hosted marketplace.json URLs, skips git path probing for direct JSON/local files, and warns when HTTPS git URLs use the implicit mutable default ref.
Marketplace fetching src/apm_cli/marketplace/client.py adds FetchResult, HTTPS-only direct JSON fetch, digest calculation, ETag / Last-Modified conditional refresh, direct local-file reads guarded by ensure_path_within(), and URL-aware cache keys.
Models and resolver MarketplaceSource derives a url kind for hosted JSON documents; MarketplaceManifest carries source_url / source_digest; MarketplacePluginResolution.provenance() centralizes the install provenance payload.
Lockfile LockedDependency serializes additive source_url / source_digest fields, and LockfileBuilder attaches them when packages originate from a hosted JSON marketplace.
Archive safety src/apm_cli/utils/archive.py adds reusable zip / tar.gz extraction helpers with path traversal, absolute path, symlink / hardlink, redirect, and decompression-size guards. Lives under utils/ (not marketplace/) so a future URL-package installer reuses it without importing marketplace internals (#692 constraint 3).
Tests and docs Marketplace command, client, local-path, archive, resolver, lockfile, and install integration tests cover the new source shapes; docs and the guide skill now describe the expanded apm marketplace add surface.

Diagrams

Legend: The diagram shows how the new apm marketplace add source shapes flow into existing manifest parsing and lockfile provenance.

flowchart LR
    subgraph AddCommand[apm marketplace add]
        SourceArg[SOURCE argument]
        Shape[shape discriminator]
    end
    subgraph Fetch[Fetch path]
        Git[git repo marketplace.json]
        Local[local dir or file]
        Remote[hosted marketplace.json]
    end
    subgraph Provenance[Install provenance]
        Manifest[MarketplaceManifest]
        Lock[apm.lock.yaml]
    end
    SourceArg --> Shape
    Shape --> Git
    Shape --> Local
    Shape --> Remote
    Git --> Manifest
    Local --> Manifest
    Remote --> Manifest
    Manifest --> Lock
    classDef new stroke-dasharray: 5 5;
    class Shape,Remote,Lock new;
Loading

Trade-offs

  • Direct hosted JSON is public-only. Chose no custom auth headers in Phase 1; rejected early AuthResolver wiring because [FEATURE] URL-based marketplace work Extension #692 owns that surface.
  • #ref is only parsed for HTTPS git URLs. Chose a narrow discriminator; rejected fragment parsing for local paths or hosted JSON where fragments are not a git ref contract.
  • Hosted JSON source uses path == "". Chose an additive sentinel on the existing MarketplaceSource; rejected a larger registry schema migration for this phase.
  • Archive helper is reusable but not wired into install execution here. Chose to salvage the zip-slip-safe plumbing from Feature/676 url based marketplace #691; rejected broad package-download behavior changes outside Marketplace add source parity with the Anthropic spec #676.

Benefits

  1. apm marketplace add https://gitlab.com/acme/catalog.git#v1.2.3 stores the canonical URL without the fragment and uses v1.2.3 as the ref.
  2. apm marketplace add ./vendor/marketplace.json --name vendor registers a local file without treating it as a git repo.
  3. apm marketplace add https://catalog.example.com/marketplace.json --name catalog fetches the JSON directly over HTTPS and caches validator metadata.
  4. Installs from hosted JSON marketplaces can now persist source_url and source_digest in apm.lock.yaml.
  5. Unsafe archive members such as ../escape.txt and symlinks fail closed before extraction.

Validation

uv run --extra dev pytest tests/unit -q:

Unit test output
MCPIntegrator._apply_overlay({"srv": info}, dep)

tests/unit/test_mcp_integrator_phase3w4.py::TestApplyOverlay::test_args_overlay_dict
  /Users/danielmeppiel/Repos/copilot-worktrees/awd-cli/auto-676/tests/unit/test_mcp_integrator_phase3w4.py:266: UserWarning: MCP overlay field 'version' on 'srv' is not yet applied at install time and will be ignored.
    MCPIntegrator._apply_overlay({"srv": info}, dep)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
16869 passed, 2 skipped, 21 xfailed, 19 warnings, 40 subtests passed in 190.45s (0:03:10)

uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/ && uv run --extra dev python -m pylint --disable=all --enable=R0801 --min-similarity-lines=10 --fail-on=R0801 src/apm_cli/ && bash scripts/lint-auth-signals.sh:

All checks passed!
1239 files already formatted

--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

[*] Rule A: get_bearer_provider boundary (any reference)
[*] Rule B: git ls-remote auth-delegated annotation
[+] auth-signal lint clean

Portable CI grep guards for YAML I/O, file length, and raw str(path.relative_to(...)):

(exit 0, no output)

npx --yes -p @mermaid-js/mermaid-cli mmdc -i .mermaid-check/diag1.mmd -o .mermaid-check/diag1.svg --quiet:

(exit 0, no output)

Scenario Evidence

# Scenario (user promise) Principle(s) Test(s) proving it Type
1 Register a git marketplace URL with #ref and get a stable ref instead of a fragment-bearing URL. Vendor-neutral, DevX tests/unit/marketplace/test_marketplace_commands.py::TestMarketplaceAdd::test_add_git_url_fragment_sets_ref_and_canonical_url unit
2 Register a hosted marketplace.json URL without publishing a git repo. Vendor-neutral, DevX tests/unit/marketplace/test_marketplace_commands.py::TestMarketplaceAdd::test_add_remote_marketplace_json_url_fetches_directly
tests/unit/marketplace/test_marketplace_client.py::TestFetchMarketplace::test_fetch_remote_marketplace_url_records_digest_and_etag
unit
3 Register a local marketplace.json file or directory without escaping the allowed filesystem root. Secure by default, DevX tests/unit/marketplace/test_marketplace_commands.py::TestMarketplaceAdd::test_add_local_marketplace_json_file
tests/unit/marketplace/test_client_local.py::test_fetch_local_file_direct_read
unit
4 Preserve hosted JSON provenance in the lockfile after marketplace install resolution. Governed by policy, Secure by default tests/unit/marketplace/test_lockfile_provenance.py::TestLockedDependencyProvenance::test_lockfile_builder_attaches_marketplace_source_provenance
tests/unit/marketplace/test_marketplace_install_integration.py::TestMarketplaceResolutionProvenance::test_resolution_includes_manifest_source_url_and_digest
unit
5 Reject unsafe archive entries before extraction. Secure by default tests/unit/utils/test_archive.py::test_extract_zip_rejects_path_traversal
tests/unit/utils/test_archive.py::test_extract_tar_gz_rejects_symlink
unit

Forward-compat constraints (#692)

Important

#692 (URL-source package install: apm install <full-package-URL> with digest pinning + private-endpoint auth) is deferred Phase 2 work, not implemented here. The board bound its 6 forward-compat constraints onto this PR so #676's primitives do not foreclose it. Each is honored:

# #692 constraint Status in this PR
1 source_type discriminator is an open-ended validated string, not a closed github/url enum. Satisfied. MarketplaceSource.kind is a derived host-classified string (local/url/github/gitlab/git); to_dict/from_dict persist url/ref/path/owner/repo/host (never kind), so url-package / OCI / ADO add a value without a schema migration.
2 HTTPS fetch helper stays generic (plain GET, no index-format assumptions). Satisfied. _fetch_url_direct() is HTTPS-only, returns raw bytes + sha256: digest + ETag/Last-Modified, and bakes in no Agent-Skills / marketplace-index parsing; parsing happens separately in parse_marketplace_json.
3 Archive extraction is standalone, decoupled from the marketplace layer. Satisfied (this commit). Moved marketplace/archive.py -> utils/archive.py; zip-slip / traversal / decompression-bomb guards stay in the helper; no marketplace import.
4 Neutral lockfile provenance names; digest = sha256:<hex>. Satisfied. LockedDependency / MarketplaceManifest carry source_url / source_digest; digest computed as "sha256:" + sha256(raw).hexdigest().
5 Index-format detection stays extensible (apm.yml vs Agent-Skills vs marketplace.json), not a hardcoded branch. Satisfied (structural). Fetch and parse are decoupled, so a future format dispatcher inserts cleanly between _fetch_url_direct and the parser; the current per-entry parser already dispatches on inspected keys.
6 Resolver designed so a future resolve_url_package() is a parallel path, not a fork. Satisfied (structural). url-kind sources resolve through the same MarketplacePluginResolution.provenance() surface as other kinds; a future package installer is a parallel function, not a special-case branch inside resolve_marketplace_plugin.

Re-validated after the archive move: uv run --extra dev pytest tests/unit/utils/test_archive.py tests/unit/marketplace -q -> 1512 passed; full CI-mirror lint chain green.

How to test

  • Run apm marketplace add https://gitlab.com/acme/catalog.git#v1.2.3 --name catalog and confirm apm marketplace list shows ref v1.2.3.
  • Run apm marketplace add ./vendor/marketplace.json --name vendor and confirm apm marketplace browse vendor lists plugins from the file.
  • Run apm marketplace add https://catalog.example.com/marketplace.json --name catalog against a public test catalog and confirm it registers without cloning git.
  • Install tool@catalog and inspect apm.lock.yaml for source_url and source_digest on the locked dependency.
  • Re-run apm marketplace update catalog and confirm cached hosted JSON sources refresh through the same command path.

Closes #676
Refs #692

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 10, 2026 22:37

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Phase 1 source-shape parity to apm marketplace add, aligning with Anthropic-compatible inputs by supporting HTTPS git URLs with #ref, local directory/file sources, and direct hosted marketplace.json URLs, while carrying hosted JSON provenance into install/lockfile flows.

Changes:

  • Extend marketplace source modeling + fetching to support direct HTTPS marketplace.json retrieval with digest + ETag/Last-Modified caching.
  • Propagate hosted JSON provenance (source_url, source_digest) through resolver/install validation into apm.lock.yaml.
  • Add archive extraction safety helpers + new/updated unit tests and docs to cover the expanded marketplace add surface.
Show a summary per file
File Description
src/apm_cli/commands/marketplace/__init__.py Adds source-shape discrimination (#ref, local file, hosted marketplace.json), cache clearing by source, and UX warnings for unpinned git URLs.
src/apm_cli/marketplace/client.py Implements direct HTTPS fetch path for hosted marketplace.json with digest + conditional refresh metadata and URL-aware cache keys.
src/apm_cli/marketplace/models.py Introduces url kind, carries source_url/source_digest on manifests, and threads provenance through parse_marketplace_json.
src/apm_cli/marketplace/resolver.py Carries manifest provenance into MarketplacePluginResolution and centralizes provenance payload building.
src/apm_cli/commands/install.py Uses resolver-provided provenance payload during marketplace install validation.
src/apm_cli/deps/lockfile.py Adds source_url/source_digest fields to LockedDependency serialization/deserialization.
src/apm_cli/install/phases/lockfile.py Attaches marketplace provenance (including hosted JSON provenance) onto lockfile dependencies.
src/apm_cli/marketplace/archive.py Adds safe archive download/extraction helpers with traversal/link/size guards.
tests/unit/marketplace/test_marketplace_commands.py Adds CLI tests for #ref parsing, unpinned warnings, local file add, and hosted marketplace.json add.
tests/unit/marketplace/test_marketplace_client.py Adds tests for hosted marketplace.json fetch digest capture and conditional headers.
tests/unit/marketplace/test_client_local.py Adds coverage for local direct-file manifest reads.
tests/unit/marketplace/test_lockfile_provenance.py Adds round-trip tests for new lockfile provenance fields and builder attachment behavior.
tests/unit/marketplace/test_marketplace_install_integration.py Verifies provenance propagation from resolver into install validation outcomes.
tests/unit/marketplace/test_marketplace_archive.py Adds safety tests for zip/tar extraction helpers.
docs/src/content/docs/reference/cli/marketplace.md Updates CLI reference to document new accepted source shapes and #ref behavior.
docs/src/content/docs/consumer/installing-from-marketplaces.md Documents new marketplace source shapes and hosted JSON lockfile provenance behavior.
packages/apm-guide/.apm/skills/apm-usage/commands.md Updates the guide skill command reference for the expanded marketplace add SOURCE forms.
CHANGELOG.md Adds an Unreleased entry describing the new marketplace add source shapes and provenance behavior.

Copilot's findings

  • Files reviewed: 18/18 changed files
  • Comments generated: 5

Comment on lines +44 to +50
def _detect_archive_format(content_type: str, url: str) -> str:
"""Return ``tar.gz`` or ``zip`` from Content-Type or URL extension."""
media_type = content_type.lower().split(";", 1)[0].strip()
if media_type in {"application/gzip", "application/x-gzip", "application/x-tar"}:
return "tar.gz"
if media_type in {"application/zip", "application/x-zip-compressed"}:
return "zip"
Comment thread src/apm_cli/utils/archive.py Outdated
Comment on lines +143 to +149
final_url = getattr(response, "url", url)
if isinstance(final_url, str) and urlparse(final_url).scheme.lower() != "https":
raise ArchiveError(f"Redirect to non-HTTPS URL rejected: {final_url!r}")

Path(dest_dir).mkdir(parents=True, exist_ok=True)
archive_format = _detect_archive_format(response.headers.get("Content-Type", ""), url)
if archive_format == "tar.gz":
return source.lower().startswith("https://") and kind in {"github", "gitlab", "git"}


def _local_source_points_to_file(source) -> bool:
Comment on lines +100 to +104
For generic-git marketplaces, `marketplace.json` is fetched via a
sparse-cone clone (only the manifest path is downloaded); APM does
not forward `GITHUB_APM_PAT` or `GITLAB_APM_TOKEN` to non-GitHub /
non-GitLab hosts. Authentication falls through to the host's
`*_APM_PAT` (e.g. `ADO_APM_PAT`) or your local
`git credential-manager`. See
non-GitLab hosts. Authentication falls through to the host's local git
credential helper when one is configured. See
Comment thread CHANGELOG.md Outdated
Comment on lines +20 to +22
- `apm marketplace add` now accepts Anthropic-compatible git URL `#ref`, local
file, and hosted `marketplace.json` sources, recording hosted JSON provenance
in the lockfile. (closes #676)
danielmeppiel and others added 2 commits June 11, 2026 00:44
Relocate the generic download+extract helper from
src/apm_cli/marketplace/archive.py to src/apm_cli/utils/archive.py so a
future URL-package installer can reuse it without importing marketplace
internals. The module was already marketplace-free; this makes the
decoupling structural. Zip-slip / path-traversal / decompression-bomb
guards stay in the helper so they apply to every caller.

Honors #692 forward-compat constraint 3 (standalone archive extraction).
No production caller imported the helper yet; the only importer (the
safety test) moves to tests/unit/utils/test_archive.py.

Refs #692

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Folds copilot-pull-request-reviewer follow-ups: x-tar format detection, post-redirect format detection, type hint, GITLAB_APM_PAT doc fix, changelog PR number.

Refs #676

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel

Copy link
Copy Markdown
Collaborator Author

APM Review Panel: ship_with_followups

Phase 1 source parity teaches apm marketplace add the three Anthropic source shapes (git URL #ref, local path/file, hosted marketplace.json) with HTTPS-only fetch, path containment, and digest provenance -- no blocking findings.

cc @danielmeppiel @sergio-sisternes-epam -- a fresh advisory pass is ready for your review.

This PR lands a clean, well-fenced slice of #676. The shape discriminator routes each Anthropic source into existing manifest parsing without forking the resolver; hosted JSON fetch is HTTPS-only with a redirect-to-non-HTTPS guard, a 10MB cap, ETag/Last-Modified conditional refresh, and sha256 provenance recorded in the lockfile. The salvaged utils/archive.py carries zip-slip / traversal / symlink / decompression-bomb guards and lives decoupled from the marketplace layer, exactly as #692 requires. The panel converges on ship: the supply-chain lens (weighted heaviest here) finds the live fetch path safe, and the auth lens confirms the Phase 2 fence holds -- no private-endpoint credentials are wired. The recommended follow-ups are all forward-looking (they harden archive.py for when #692 actually wires it into install execution) and none gate this merge.

All six #692 forward-compat constraints hold: (1) kind is a derived validated string, not a closed enum, and persistence stores url/ref/path not kind; (2) _fetch_url_direct is a generic HTTPS GET returning raw bytes + digest + validators with no index-format assumptions; (3) archive.py is standalone under utils/ with the safety guards and no marketplace import; (4) lockfile uses neutral source_url / source_digest with sha256:<hex>; (5) fetch and parse are decoupled so a format dispatcher inserts cleanly; (6) resolution provenance is centralized so a future resolve_url_package() is a parallel path.

Aligned with: Secure by default -- HTTPS-only fetch, redirect guard, size caps, path containment, and archive zip-slip/bomb guards fail closed. Portable by manifest -- hosted JSON provenance (source_url + source_digest) makes installs reproducible and auditable. Pragmatic as npm -- matches Anthropic's marketplace.json shape exactly so the mental model transfers.

Growth signal. Exact parity with Anthropic's marketplace.json source shapes removes a concrete adoption blocker for teams migrating catalogs, and crediting @Vicente-Pastor's #691 plumbing models the contributor-funnel behavior worth amplifying.

Panel summary

Persona B R N Takeaway
Python Architect 0 1 1 Clean additive design; two separate remote-URL detection heuristics could converge.
CLI Logging Expert 0 0 1 Warnings are ASCII, symbol-tagged, and actionable.
DevX UX Expert 0 0 1 #ref parity + unpinned-ref nudge match the Anthropic mental model.
Supply Chain Security 0 2 1 Live fetch path is safe; archive bomb-guard trusts declared sizes (not yet wired).
OSS Growth Hacker 0 0 1 Anthropic parity is an adoption story; contributor credit is healthy.
Test Coverage Expert 0 1 1 3 source shapes + provenance well covered; size-cap guard untested.
Auth Expert 0 0 1 #692 fence confirmed: no private-endpoint auth wired in Phase 1.
Doc Writer 0 0 1 CHANGELOG + consumer + reference + guide skill all updated consistently.
Performance Expert 0 0 1 ETag/304 conditional refresh is a sensible bandwidth win.

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 4 follow-ups

  1. [Supply Chain Security] When archive.py is wired into install execution under [FEATURE] URL-based marketplace work Extension #692, enforce actual-bytes accounting (and prefer streaming) rather than trusting member.size / info.file_size -- a malicious archive can under-declare its header sizes and slip past the decompression-bomb cap. Not blocking now because the helper is salvaged plumbing not yet on any execution path.
  2. [Test Coverage Expert] Add a decompression-bomb test that drives _extract_zip / _extract_tar_gz past _MAX_UNCOMPRESSED_BYTES; the traversal, symlink, format, and redirect guards are tested but the size cap is the one guard with no assertion.
  3. [Python Architect] Fold the CLI _is_remote_marketplace_json_url (path endswith /marketplace.json) and MarketplaceSource.is_remote_manifest_url (path == "") into one shared classifier so the "is this a hosted JSON URL" decision has a single source of truth.
  4. [Supply Chain Security] _fetch_url_direct already accepts expected_digest but fetch_marketplace never passes it; threading a stored digest through refresh would give TOFU pinning -- defer to [FEATURE] URL-based marketplace work Extension #692 since Phase 1 is provenance-only.

Architecture

classDiagram
    class MarketplaceSource {
        +str name
        +str url
        +str ref
        +str path
        +kind() str
        +is_remote_manifest_url() bool
        +display_source() str
    }
    class FetchResult {
        +dict data
        +str digest
        +str etag
        +str last_modified
    }
    class MarketplaceManifest {
        +str source_url
        +str source_digest
        +find_plugin() MarketplacePlugin
    }
    class MarketplacePluginResolution {
        +str source_url
        +str source_digest
        +provenance(name, plugin) dict
    }
    class LockedDependency {
        +str source_url
        +str source_digest
        +to_dict() dict
        +from_dict() LockedDependency
    }
    MarketplaceSource --> FetchResult : url kind fetch
    FetchResult --> MarketplaceManifest : parse_marketplace_json
    MarketplaceManifest --> MarketplacePluginResolution : resolve
    MarketplacePluginResolution --> LockedDependency : provenance()
    class MarketplaceSource:::changed
    class FetchResult:::added
    class MarketplacePluginResolution:::changed
    class LockedDependency:::changed
Loading
flowchart TD
    A[apm marketplace add SOURCE] --> B[_split_source_fragment_ref]
    B --> C{shape discriminator}
    C -->|https git url| D[git probe + GitCache]
    C -->|local dir or file| E[_fetch_local / _fetch_local_file]
    C -->|hosted marketplace.json| F[_fetch_url_direct HTTPS-only]
    F --> G[size cap + digest + redirect guard]
    E --> H[ensure_path_within containment]
    G --> I[parse_marketplace_json]
    H --> I
    D --> I
    I --> J[MarketplaceManifest source_url/source_digest]
    J --> K[MarketplacePluginResolution.provenance]
    K --> L[apm.lock.yaml source_url/source_digest]
Loading
sequenceDiagram
    participant CLI as marketplace add
    participant C as client.fetch_marketplace
    participant N as _fetch_url_direct
    participant Cache as sidecar cache
    CLI->>C: url-kind MarketplaceSource
    C->>Cache: read stale meta (etag/last-modified)
    C->>N: GET with conditional headers
    alt 304 Not Modified
        N-->>C: None
        C->>Cache: serve stale + rewrite meta
    else 200 OK
        N-->>C: FetchResult(data, sha256 digest)
        C->>Cache: write data + digest + validators
    end
    C-->>CLI: MarketplaceManifest(source_url, source_digest)
Loading

Recommendation

Ship with the four follow-ups tracked. Nothing blocks merge: CI is green, the live hosted-JSON fetch path is HTTPS-only and bounded, and every #692 forward-compat constraint holds. The single highest-signal item to carry into #692 is follow-up 1 -- harden archive.py's decompression-bomb accounting to use actual extracted bytes before that helper is wired onto a real install path.


Full per-persona findings

Python Architect

  • [recommended] Two independent heuristics decide "is this a hosted marketplace.json URL": _is_remote_marketplace_json_url in commands/marketplace/__init__.py checks path.endswith("/marketplace.json"), while MarketplaceSource.is_remote_manifest_url in marketplace/models.py checks path == "" and https. They agree today because the CLI sets path="", but the divergent predicates are a latent drift hazard.
    Suggested: extract one shared classifier (or have the CLI defer entirely to the model property) so the url-kind decision has a single definition.
  • [nit] MarketplacePluginResolution.provenance() centralizing the install-provenance payload is the right call -- it removes the inline dict from commands/install.py and gives [FEATURE] URL-based marketplace work Extension #692 a single seam to extend.

CLI Logging Expert

  • [nit] Output is consistent with the house style: logger.warning(..., symbol="warning") for the unpinned-ref nudge, logger.start(..., symbol="gear") before the slow probe, and verbose_detail now reads fetch_source.kind so url-kind sources report correctly. All ASCII, no Unicode.

DevX UX Expert

  • [nit] The #ref fragment parsing plus the actionable unpinned-ref warning (Pin this git marketplace ... #v1.0.0) match the Anthropic mental model and steer users toward reproducible pins without hard-failing. --ref + #ref conflict is a clean hard error rather than a silent pick.

Supply Chain Security Expert

  • [recommended] archive.py decompression-bomb accounting sums member.size (tar) / info.file_size (zip), both header-declared values, and then does dst.write(src.read()) reading the whole member into memory. A crafted archive can under-declare sizes to pass the _MAX_UNCOMPRESSED_BYTES check. Not blocking: the helper is salvaged Feature/676 url based marketplace #691 plumbing and is not wired into any Phase 1 execution path.
    Suggested: when [FEATURE] URL-based marketplace work Extension #692 wires extraction in, account actual bytes read (chunked) and abort mid-stream past the cap.
  • [recommended] _fetch_url_direct accepts expected_digest and verifies it, but fetch_marketplace never supplies one, so refreshes are not TOFU-pinned to a previously seen digest. Acceptable for a provenance-only Phase 1; pinning belongs to [FEATURE] URL-based marketplace work Extension #692.
  • [nit] The live hosted-JSON path is solid: HTTPS-only scheme check, post-redirect non-HTTPS rejection, Content-Length and actual-length 10MB caps, sha256:<hex> digest, JSON-root-is-object validation, and ensure_path_within containment on the local-file read. Fails closed at each gate.

OSS Growth Hacker

  • [nit] Exact marketplace.json parity with Anthropic is a migration-unblocking adoption story worth surfacing in release notes, and the explicit @Vicente-Pastor credit for the Feature/676 url based marketplace #691 foundation is the kind of contributor-funnel signal that compounds.

Test Coverage Expert

  • [recommended] The _MAX_UNCOMPRESSED_BYTES guard in archive.py has no test. test_archive.py covers path traversal, symlink rejection, uncompressed-tar rejection, redirect-URL format detection, and safe extraction -- but nothing drives a member past the size cap.
    Suggested: add test_extract_zip_rejects_decompression_bomb asserting ArchiveError past the limit.
  • [nit] The three board-mandated source shapes each have a direct test (test_add_git_url_fragment_sets_ref_and_canonical_url, test_add_local_marketplace_json_file, test_add_remote_marketplace_json_url_fetches_directly) plus client-level digest/ETag/conditional-header coverage and lockfile provenance round-trip. Scenario coverage matches the brief.

Auth Expert

  • [nit] [FEATURE] URL-based marketplace work Extension #692 fence confirmed to hold: _fetch_url_direct sends only {"User-Agent": "apm-cli"} -- no token, no AuthResolver. The url-kind branch in fetch_marketplace does not pass auth_resolver onward, and docs explicitly state hosted JSON URLs are public HTTPS only this phase. No private-endpoint credential wiring leaked into Phase 1.

Doc Writer

  • [nit] Documentation is consistent and drift-free: CHANGELOG [Unreleased] entry, consumer/installing-from-marketplaces.md, reference/cli/marketplace.md (flag table + trust-boundary note), and the apm-usage/commands.md guide skill all describe the new shapes and the public-only Phase 1 fence.

Performance Expert

  • [nit] ETag / Last-Modified conditional refresh with a 304 stale-serve path avoids re-downloading unchanged catalogs, and the url-kind cache key (url__<sha256[:16]>) cleanly namespaces hosted sources. Sensible bandwidth win with no hot-path regression.

This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.

danielmeppiel and others added 2 commits June 11, 2026 09:54
The decompression-bomb guard summed header-declared member sizes
(member.size / info.file_size) and then buffered each member whole
via dst.write(src.read()). A crafted archive can under-declare its
header sizes to slip past a header-based cap, and a single huge
member was read into memory regardless. Stream each member in
chunks, count the ACTUAL bytes read, and abort mid-stream once the
cumulative total exceeds the cap. Adds zip + tar.gz regression
tests (mutation-break verified) exercising the byte-counting limit.

Addresses apm-review-panel follow-ups 1 (Supply Chain) and 2
(Test Coverage) on #1739.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two predicates decided "is this a hosted marketplace.json URL":
the CLI _is_remote_marketplace_json_url checked path endswith
/marketplace.json, while MarketplaceSource.is_remote_manifest_url
checked path == "" plus https. They agreed only because the CLI
sets path="", a latent drift hazard. Both now delegate to one
shared url_names_remote_manifest() in marketplace.models (HTTPS +
host + path endswith /marketplace.json). The source-kind
discriminator stays a derived, validated string (#692 constraint
1), not a closed enum. Also corrects a stray GITLAB_APM_TOKEN code
comment to GITLAB_APM_PAT.

Addresses apm-review-panel follow-up 3 (Python Architect) and the
Copilot env-var-name signal on #1739.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel

Copy link
Copy Markdown
Collaborator Author

Shepherd-driver follow-up pass: ready-to-merge (advisory)

A fold-in convergence pass folded the apm-review-panel ship_with_followups
follow-ups that fall inside this PR's stated scope (Phase 1 marketplace-add
source parity, #676). The prior full-panel advisory comment stands; this is the
shepherd's terminal fold record, not a second panel run.

Folded in this run

  • (panel) Supply Chain FU-1 + FU-2: utils/archive.py decompression-bomb guard
    now counts ACTUAL extracted bytes via a chunked streaming copy
    (_copy_member_within_limit) instead of trusting header-declared
    member.size / info.file_size, and aborts mid-stream past the cap so a
    single huge member is never buffered whole. Added zip + tar.gz size-cap
    regression tests, mutation-break verified -- resolved in 50120dc0.
  • (panel) Python Architect FU-3: the two hosted-marketplace.json URL
    predicates (_is_remote_marketplace_json_url and
    MarketplaceSource.is_remote_manifest_url) now delegate to one shared
    url_names_remote_manifest() classifier in marketplace/models.py, removing
    the latent drift. The source-kind discriminator stays a derived validated
    string ([FEATURE] URL-based marketplace work Extension #692 constraint 1), not a closed enum -- resolved in a92e7235.

Copilot signals reviewed

The five Copilot inline comments were all addressed in this PR:

  • src/apm_cli/utils/archive.py (x-tar handling) -- LEGIT: resolved before this
    run; uncompressed .tar is now explicitly rejected with a clear message.
  • src/apm_cli/utils/archive.py (detect format from redirect target) -- LEGIT:
    resolved before this run; _detect_archive_format uses the final (post-redirect) URL.
  • src/apm_cli/commands/marketplace/__init__.py (_local_source_points_to_file
    type hint) -- LEGIT: resolved before this run; helper is typed.
  • docs/.../installing-from-marketplaces.md (GITLAB_APM_TOKEN) -- LEGIT:
    doc resolved before this run; the matching stray code comment is corrected to
    GITLAB_APM_PAT in a92e7235.
  • CHANGELOG.md (PR number) -- LEGIT: resolved before this run; entry ends (#1739).

Deferred (out-of-scope follow-up)

Regression-trap evidence (mutation-break gate)

  • test_extract_zip_rejects_decompression_bomb and
    test_extract_tar_gz_rejects_decompression_bomb -- deleted the
    _MAX_UNCOMPRESSED_BYTES running-total raise in _copy_member_within_limit;
    both tests FAILED as expected (DID NOT RAISE); guard restored.

#692 forward-compat constraints

All six still hold after this pass: (1) source-kind discriminator is a derived
validated string, not a closed enum; (2) _fetch_url_direct stays a generic
HTTPS GET, untouched; (3) archive guards live in standalone utils/archive.py,
decoupled from the marketplace layer; (4) lockfile provenance keeps neutral
source_url / source_digest with sha256:<hex>; (5) fetch/parse stay
decoupled so a format dispatcher inserts cleanly; (6) resolution provenance
stays centralized for a future parallel resolve_url_package().

Lint

uv run --extra dev ruff check src/ tests/ and ruff format --check src/ tests/
both silent; pylint R0801 10.00/10; auth-signal lint clean.

CI

Green on head a92e7235: 14 checks SUCCESS, 1 SKIPPED (deploy), 0 CI fix
iterations needed. gh pr view reports mergeable=MERGEABLE,
mergeStateStatus=BLOCKED (awaiting required maintainer review, not a CI gate).

PR head SHA CEO stance iters folds defers Copilot rounds CI mergeable mergeStateStatus notes
#1739 a92e723 ship_with_followups 1 4 1 1 green MERGEABLE BLOCKED awaiting maintainer review

Ready for maintainer review. No auto-merge.

@danielmeppiel

Copy link
Copy Markdown
Collaborator Author

APM Review Panel: ship_with_followups

Marketplace source parity lands: apm marketplace add now accepts HTTPS git URLs, local paths, and hosted marketplace.json catalogs with provenance tracking.

cc @sergio-sisternes-epam -- a fresh advisory pass is ready for your review.

All eight panelists converge on the same signal: this PR is well-scoped, correctly defers private-endpoint auth to Phase 2 (#692), and ships solid defensive guards (HTTPS-only, traversal prevention, bomb limits, ETag-based conditional fetch). The strategic value is clear: marketplace add becoming source-agnostic removes one of the last ergonomic gaps that made APM look single-host-only.

The substantive gaps cluster around one theme: the archive download helper buffers the full compressed payload into RAM before any guard fires. Python-architect, supply-chain-security, and performance-expert all flagged the same response.content pattern in archive.py L163. This is not exploitable in the current add-source flow (user-configured HTTPS endpoints, no install hot path yet), but the code is being extracted as a reusable utils/archive helper, so the lax memory posture will propagate. A streaming-read refactor or at minimum a TODO linking to #692 is appropriate before this helper sees wider use.

The doc-writer's lockfile-spec gap (source_url / source_digest undocumented in the canonical field table) and the supply-chain expert's missing HTTPS-rejection test are the highest-signal follow-ups because they affect durable contracts: the spec page is the authoritative reference, and the HTTPS guard is a secure-by-default promise with no regression trap today.

Dissent. The test-coverage-expert panelist returned malformed schema after two respawns. Its substance (missing tests for fragment_ref, --ref error, and archive non-HTTPS rejection) overlaps with supply-chain-security's evidence-backed finding on the HTTPS test gap. I weight the supply-chain finding as authoritative and do not discount the gap because the coverage panelist failed to deliver valid JSON.

Aligned with: secure_by_default: HTTPS-only enforcement, redirect-scheme validation, traversal and symlink guards, and decompression-bomb limits are all present. One regression-trap gap: no unit test asserts the HTTPS-only archive boundary. portable_by_manifest: source_url and source_digest in the lockfile entry capture provenance, making marketplace sources reproducible across environments. The lockfile spec doc needs the new fields added. pragmatic_as_npm: Accept any URL shape a user already has: git URL with #ref, local path, hosted JSON, without ceremony. This is the npm add-anything ergonomic bar.

Growth signal. Marketplace source parity is a strong release story: "if you already have a marketplace.json hosted anywhere, apm marketplace add just works." The CHANGELOG entry should lead with that user benefit rather than protocol taxonomy, per the growth-hacker's suggestion.

Panel summary

Persona B R N Takeaway
Python Architect 0 0 3 Clean new archive helper and marketplace URL fetcher; three minor architectural nits on memory cap, containment semantics, and streaming.
CLI Logging Expert 0 0 2 No blocking CLI output concerns; two nits on verbose label naming and warning ordering.
DevX UX Expert 0 0 3 Hosted JSON URL support is ergonomic; three minor UX gaps around alias defaults, silent flag ignoring, and docs applicability.
Supply Chain Security Expert 0 2 1 HTTPS enforcement, traversal guards, and bomb limits are solid; missing test for archive HTTP rejection and unbounded response.content read in archive.py are the gaps.
OSS Growth Hacker 0 0 2 Marketplace source parity is a strong growth beat; CHANGELOG benefit framing and docs phase wording can be tightened.
Auth Expert 0 0 0 No auth surface touched. _fetch_url_direct sends only User-Agent over mandatory HTTPS with redirect-scheme validation. AuthResolver, tokens, and credential helpers are not in scope for this path.
Doc Writer 0 2 1 Docs are accurate and well-scoped; two drift gaps: lockfile spec omits new source_url/source_digest fields, and generic-git auth text drops the still-live ADO_APM_PAT path.
Test Coverage Expert 0 0 0 Schema failure -- see extras.
Performance Expert 0 1 2 ETag/304 conditional fetch is well-implemented; archive helper buffers full payload in RAM instead of streaming -- flag before hot-path adoption.

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 4 follow-ups

  1. [Supply Chain Security Expert] Add unit test asserting download_and_extract_archive rejects non-HTTPS URLs -- The HTTPS-only guard at archive.py L158 is a secure-by-default promise with no regression trap. A one-line pytest.raises test closes the gap.
  2. [Doc Writer] Document source_url and source_digest in lockfile-spec.md per-entry field table -- The lockfile spec is the authoritative reference and the marketplace docs link readers there. Missing fields create silent drift on a portable-by-manifest contract.
  3. [Python Architect] Stream archive download to disk or add TODO for [FEATURE] URL-based marketplace work Extension #692 before utils/archive becomes a hot-path helper -- Three panelists flagged the same response.content double-buffer pattern. Not exploitable today but the helper is being extracted for reuse; fix before it propagates.
  4. [Doc Writer] Restore *_APM_PAT fallthrough wording in cli/marketplace.md generic-git auth text -- Edited docs dropped ADO_APM_PAT from the generic section while the ADO section still references it, creating an internal contradiction on a live auth mechanism.

Architecture

classDiagram
    direction LR
    class MarketplaceSource {
      <<ValueObject>>
      +name str
      +owner str
      +repo str
      +host str
      +kind str
      +url str
      +ref str
    }
    class FetchResult {
      <<ValueObject>>
      +data dict
      +digest str
      +etag str
      +last_modified str
    }
    class MarketplaceManifest {
      <<ValueObject>>
      +plugins list
    }
    class ArchiveError {
      <<Exception>>
    }
    class MarketplaceFetchError {
      <<Exception>>
    }
    class AuthResolver {
      <<Strategy>>
      +try_with_fallback(host, fn)
      +classify_host(host) HostInfo
    }
    class fetch_marketplace {
      <<Facade>>
    }
    class download_and_extract_archive {
      <<Facade>>
    }
    fetch_marketplace ..> MarketplaceSource : reads
    fetch_marketplace ..> FetchResult : uses
    fetch_marketplace ..> MarketplaceManifest : returns
    fetch_marketplace ..> AuthResolver : delegates auth for git paths
    download_and_extract_archive ..> ArchiveError : raises
    note for FetchResult "Frozen dataclass value object"
    note for fetch_marketplace "Fetcher dispatch keyed by source.kind"
    class FetchResult:::touched
    class download_and_extract_archive:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
Loading
flowchart TD
    A["CLI: apm marketplace add"] --> B{"source.kind?"}
    B -->|url| C["_fetch_url_direct - HTTPS GET with ETag and Last-Modified"]
    B -->|github/gitlab| D["_fetch_file - API plus auth fallback"]
    B -->|local| E{"path is file?"}
    B -->|git| F["git-backed marketplace.json fetch"]
    E -->|yes| G["_fetch_local_file - direct read plus containment guard"]
    E -->|no| H["_fetch_local_direct_read - resolve marketplace.json candidates"]
    C --> I{"HTTP 304?"}
    I -->|yes| J["Serve stale cache"]
    I -->|no| K["Validate digest and size cap"]
    K --> L["_write_cache - data and meta"]
    L --> M["parse_marketplace_json"]
    G --> M
    H --> M
    D --> M
    M --> P["MarketplaceManifest source_url/source_digest"]
    subgraph Archive["utils/archive.py"]
      Q["download_and_extract_archive"] --> R{"detect format"}
      R -->|tar.gz| S["_extract_tar_gz guarded"]
      R -->|zip| T["_extract_zip guarded"]
    end
Loading

Recommendation

The PR is well-architected, correctly scoped, and all panelists agree on direction. The two highest-signal follow-ups (HTTPS-rejection test and lockfile-spec doc update) are small, bounded tasks that can land as a fast follow or be folded into this PR at the author's discretion. Neither represents a user-facing regression today. Ship and track the four follow-ups above.


Full per-persona findings

Python Architect

  • [nit] response.content buffers entire archive into memory before streaming extraction at src/apm_cli/utils/archive.py:163
    download_and_extract_archive calls response.content which materializes the full body into RAM, then wraps it in io.BytesIO for streaming extraction. For large archives near the 512 MB cap this doubles peak memory. Using iter_content with a tempfile or requests stream=True piped directly into the tarfile/zipfile would bound RSS. Acceptable at current 512 MB ceiling but worth a TODO comment for [FEATURE] URL-based marketplace work Extension #692 forward-compat where archives grow.
    Suggested: Add a # TODO([FEATURE] URL-based marketplace work Extension #692): stream response to disk to avoid 2x peak RSS for large archives comment, or switch to stream=True plus SpooledTemporaryFile now.
  • [nit] _fetch_local_file containment check is a tautology at src/apm_cli/marketplace/client.py:595
    ensure_path_within(manifest_file, manifest_file.parent) always passes for any file that exists because a file is always within its own parent directory. The guard was likely intended to validate that the user-supplied path does not escape the project root or a configured base directory, not just its own parent. As written it cannot catch path escapes.
    Suggested: Pass the project root or APM workspace dir as the containment boundary instead of manifest_file.parent, or document why parent is the intended single-file trust boundary.
  • [nit] _MAX_MARKETPLACE_JSON_BYTES cap not enforced until after response.content allocation at src/apm_cli/marketplace/client.py:258
    _fetch_url_direct reads resp.content (full body in RAM) before comparing len(raw) > _MAX_MARKETPLACE_JSON_BYTES. An adversarial server ignoring Content-Length can force allocation before the check fires. Bounded at 10 MB when Content-Length is honest, but streaming would make the guard deterministic.
    Suggested: Use stream=True and read bounded chunks up to _MAX_MARKETPLACE_JSON_BYTES before decoding.

CLI Logging Expert

  • [nit] Raw 'Kind: url' verbose label is opaque to humans
    The verbose output shows 'Kind: url' which is an internal enum value leak. CLI output should use user-facing language like 'Source: remote URL'.
    Suggested: Map internal kind enum to user-facing labels before rendering, e.g. url -> remote URL, git -> git repo, local -> local path.
  • [nit] Unpinned git URL warning ordering reads slightly odd
    The warning about unpinned git URLs appears around the action start rather than before work begins or in the final caveat section. This is cosmetic; the message is actionable and ASCII-safe.
    Suggested: Emit the warning before logger.start or collect it for the post-resolution summary.

DevX UX Expert

  • [nit] Hostname-only default alias may collide when multiple hosted JSON URLs share the same host
    If a user registers both https://example.com/foo/marketplace.json and https://example.com/bar/marketplace.json, the default alias derived from hostname alone would collide. This only affects multi-registry power users but is surprising.
    Suggested: Derive the default alias from hostname plus first meaningful path segment, or ask users to choose when a collision is detected.
  • [nit] --host flag is silently ignored for hosted JSON URLs without warning
    When a user passes --host to a hosted JSON URL, the URL already carries the host, so the flag has no effect. Silent flag ignoring makes diagnosis harder.
    Suggested: Emit a warning when --host is provided but the resolved source is a hosted marketplace.json URL.
  • [nit] Docs do not clarify that --ref applies to git sources only
    The --ref flag is meaningful for git-backed sources but not for a static hosted JSON file. Flag applicability should be explicit to avoid user confusion.
    Suggested: Add '(git sources only)' to the --ref docs/help text where practical.

Supply Chain Security Expert

  • [recommended] download_and_extract_archive buffers entire response.content without Content-Length pre-check or streaming cap at src/apm_cli/utils/archive.py:163
    Unlike _fetch_url_direct in client.py which pre-checks Content-Length before accessing resp.content, archive.py calls response.content unconditionally. A malicious server can stream gigabytes into RAM before the decompression-bomb guard fires. The _MAX_UNCOMPRESSED_BYTES guard only applies post-decompress; the raw compressed payload has no cap. This is not blocking because it requires a malicious HTTPS server the user explicitly configured, but it weakens the fail-closed posture.
    Suggested: Add a Content-Length pre-check and/or use iter_content with a running byte counter before buffering into memory. Cap at a reasonable compressed-payload limit.
  • [recommended] No test asserts download_and_extract_archive rejects non-HTTPS URLs at tests/unit/utils/test_archive.py
    The HTTPS-only guard at archive.py L158 is a critical security boundary preventing MITM archive injection but has no dedicated unit test. The traversal, symlink, and bomb guards all have tests. This is a regression-trap gap on a secure-by-default promise.
    Suggested: Add test_download_rejects_http_url with pytest.raises(ArchiveError, match='HTTPS') for an http:// archive URL.
    Proof (missing at): `` -- proves: Archive download rejects non-HTTPS URLs to prevent MITM archive injection [secure-by-default]
  • [nit] _fetch_local_file containment check needs intent documented or broader boundary at src/apm_cli/marketplace/client.py:597
    ensure_path_within(manifest_file, manifest_file.parent) will pass for any resolved Path because a file is within its own parent directory. This may still document symlink-resolution intent, but the trust boundary is not obvious.

OSS Growth Hacker

  • [nit] CHANGELOG entry leads with implementation detail instead of user benefit at CHANGELOG.md
    Users scanning CHANGELOG decide whether to upgrade based on value framing, not protocol taxonomy. Lead with add any marketplace by URL, local path, or hosted JSON catalog.
    Suggested: Rewrite lead sentence around user outcome: apm marketplace add now works with HTTPS git repos, local paths, and hosted marketplace.json catalogs.
  • [nit] Docs expose internal Phase 1 roadmap language to end users at docs/src/content/docs/consumer/installing-from-marketplaces.md
    Visitors do not know what Phase 1 means or how many phases exist. State the current public/local scope as a product fact and point to the tracked follow-up.
    Suggested: Replace phase language with: Hosted marketplace.json URLs currently use public HTTPS without custom auth headers; private URL authentication is tracked separately.

Auth Expert

No findings.

Doc Writer

  • [recommended] Lockfile spec reference does not document the new source_url / source_digest per-entry fields the PR adds. at docs/src/content/docs/reference/lockfile-spec.md:143
    lockfile-spec.md is the authoritative per-entry field table. The PR adds source_url and source_digest to LockedDependency, serializes them in to_dict/from_dict, and the marketplace docs point readers to the lockfile reference. The canonical reference never defines the new fields.
    Suggested: Add two rows to the per-entry table after marketplace_plugin_name for source_url and source_digest.
  • [recommended] Generic-git auth fallthrough text omits ADO_APM_PAT / *_APM_PAT. at docs/src/content/docs/reference/cli/marketplace.md:122
    Edited docs replaced the host *_APM_PAT path with local git credential helper only, but ADO_APM_PAT is still a live mechanism and the same page's ADO section mentions it. The generic statement is incomplete and conflicts with the ADO guidance.
    Suggested: Restore host PAT phrasing, e.g. authentication falls through to the host's *_APM_PAT (such as ADO_APM_PAT) or local git credential helper. Mirror the wording in installing-from-marketplaces.md.
  • [nit] Intro frames the feature as parity with another product rather than APM's own capability. at docs/src/content/docs/consumer/installing-from-marketplaces.md:67
    User-facing docs should be self-contained and durable. Comparative parity framing is okay in PR/release notes but less useful in task docs.
    Suggested: Rephrase to state APM accepts several source shapes beyond GitHub repos.

Test Coverage Expert

No findings.

Performance Expert

  • [recommended] Archive download buffers entire response in memory before extraction, risking high peak RSS. at src/apm_cli/utils/archive.py:157
    download_and_extract_archive calls response.content (full body in RAM), then wraps it for tarfile/zipfile. With _MAX_UNCOMPRESSED_BYTES at 512 MB, peak RSS can be high. The extraction guard is correct, but the initial download still materializes the compressed payload. Not blocking because this code path is not yet invoked on the install hot path.
    Suggested: Use a streaming or bounded download path before extraction; for zip, use a bounded spool-to-file strategy because zip requires random access.
  • [nit] No requests.Session reuse across marketplace fetches. at src/apm_cli/marketplace/client.py:243
    _fetch_url_direct uses bare requests.get, creating a new connection per source. For the single-source add flow this is negligible; bulk operations would benefit from connection reuse.
  • [nit] Content-Length guard reads header but still downloads full body on oversized response when header is absent or wrong. at src/apm_cli/marketplace/client.py:253
    _fetch_url_direct checks Content-Length, then resp.content. A streaming read with a running counter would abort early. Marginal for the 10 MB marketplace.json cap but good hygiene.

This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.

danielmeppiel and others added 4 commits June 11, 2026 13:56
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ketplace-source-parity

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

# Conflicts:
#	CHANGELOG.md
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danielmeppiel

Copy link
Copy Markdown
Collaborator Author

APM Review Panel: ship_now

Source parity is now folded to the quality bar for #676: the PR supports git #ref, local, and hosted marketplace.json sources with secure fetch boundaries, provenance, docs, and regression coverage.

cc @sergio-sisternes-epam -- a fresh advisory pass is ready for your review.

The follow-up pass found only small polish items, and those were folded: archive and hosted-JSON fetches stream with bounded reads, reject HTTPS downgrades, avoid cross-host cookie persistence, keep raw archive staging outside the destination tree, expose the new source forms in help text, and align docs on --host, --ref, public hosted JSON auth, lockfile provenance, and ADO_APM_PAT fallback. The branch has also been merged with current main, resolving the earlier conflict.

Aligned with: portable manifests through source_url / source_digest, secure defaults through HTTPS-only and redirect rejection, and pragmatic CLI UX through discoverable examples and scoped flag help.

Panel summary

Persona B R N Takeaway
Python Architect 0 0 0 Strategy dispatch and standalone archive utility are ready.
CLI Logging Expert 0 0 0 Help, warning, and verbose output now explain the new source shapes.
DevX UX Expert 0 0 0 Source-shape UX is discoverable and scoped to git/local/hosted JSON behavior.
Supply Chain Security Expert 0 0 0 HTTPS-only, redirect rejection, cookie clearing, digest, and staging safeguards are covered.
OSS Growth Hacker 0 0 0 Changelog and docs lead with the user value.
Auth Expert 0 0 0 URL-kind remains credential-free; git-backed auth fallthrough is documented.
Doc Writer 0 0 0 Reference and consumer docs match the code surface.
Test Coverage Expert 0 0 0 Regression tests cover streaming, redirects, cookie clearing, staging, aliasing, and provenance units.
Performance Expert 0 0 0 Streaming and session reuse are appropriate for Phase 1 volume.

Counts are signal strength, not gates. The maintainer ships.

Recommendation

Ship now. No in-scope foldable follow-ups remain for #676 Phase 1.


Full per-persona findings

Python Architect

No findings.

CLI Logging Expert

No findings.

DevX UX Expert

No findings.

Supply Chain Security Expert

No findings.

OSS Growth Hacker

No findings.

Auth Expert

No findings.

Doc Writer

No findings.

Test Coverage Expert

No findings.

Performance Expert

No findings.

This panel is advisory. Re-apply the panel-review label after new changes if another pass is needed.

@danielmeppiel danielmeppiel merged commit 82c8509 into main Jun 11, 2026
15 checks passed
@danielmeppiel danielmeppiel deleted the danielmeppiel/676-marketplace-source-parity branch June 11, 2026 12:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Marketplace add source parity with the Anthropic spec

2 participants