
Backport smolvla grpo API hooks #3509

Draft
kufupa wants to merge 1 commit into huggingface:main from kufupa:pr/port-smolvla-grpo-env-patch

kufupa commented May 5, 2026

Title

feat(policies): backport smolvla grpo api hooks

Summary / Motivation

This PR ports SmolVLA GRPO API support from a local, working lerobot runtime patch into the upstream huggingface/lerobot repository as a minimal, isolated change. It updates SmolVLAPolicy to expose distribution parameters for GRPO-style training paths, and it aligns the file's imports with the current in-repo policy module paths. This unblocks GRPO experiments that rely on probability-distribution parameters (via _get_distr_params_chunk) without altering environment setup or runtime behavior outside this policy.

Related issues

  • Fixes / Closes: # (if any)
  • Related: # (if any)

What changed

  • Backported src/lerobot/policies/smolvla/modeling_smolvla.py from the local patched environment runtime into this repo:
    • Added _get_distr_params_chunk for the GRPO distribution-parameter path.
    • Added distr_queue bookkeeping setup in SmolVLAPolicy.
    • Updated select_action flow to handle the new model sampling tuple shape (actions, _ = model.sample_actions(...)).
    • Updated internal imports in modeling_smolvla.py to use the current in-repo policy module paths.
  • Added GRPO-specific API hooks required by the smolvla-grpo pipeline while keeping change scope limited to the policy implementation.
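
For reviewers unfamiliar with the pattern, the tuple-return handling described above can be sketched roughly as follows. This is an illustrative stand-in only, not the code in this PR: StubModel, StubPolicy, the chunk_size parameter, the reset method, and the contents of the distribution parameters are all hypothetical, and plain Python lists stand in for tensors. The PR description does not say how distr_queue is filled, so the sketch only creates and clears it.

```python
from collections import deque


class StubModel:
    """Hypothetical stand-in for the underlying model (not the lerobot class)."""

    def sample_actions(self, observation, chunk_size=3):
        # Assumed shape of the new return value: (actions, distribution params).
        actions = [[observation + i] for i in range(chunk_size)]
        distr_params = [{"mean": a[0], "log_std": 0.0} for a in actions]
        return actions, distr_params


class StubPolicy:
    """Hypothetical policy wrapper mirroring the hooks named in this PR."""

    def __init__(self, model, chunk_size=3):
        self.model = model
        self.chunk_size = chunk_size
        self.action_queue = deque(maxlen=chunk_size)
        # Bookkeeping queue for distribution parameters; only set up here,
        # because the PR text does not describe how it is populated.
        self.distr_queue = deque(maxlen=chunk_size)

    def reset(self):
        # Clear both queues together so they cannot drift apart.
        self.action_queue.clear()
        self.distr_queue.clear()

    def _get_distr_params_chunk(self, observation):
        # GRPO-style path: keep the distribution parameters, drop the actions.
        _, distr_params = self.model.sample_actions(observation, self.chunk_size)
        return distr_params

    def select_action(self, observation):
        # Inference path: unpack the tuple and ignore the second element.
        if not self.action_queue:
            actions, _ = self.model.sample_actions(observation, self.chunk_size)
            self.action_queue.extend(actions)
        return self.action_queue.popleft()


policy = StubPolicy(StubModel())
first_action = policy.select_action(10)
params_chunk = policy._get_distr_params_chunk(5)
```

In this sketch select_action only calls the model when its queue is empty, so the tuple unpacking is the only change an inference caller would see, while a GRPO caller reads the second element through _get_distr_params_chunk.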

No breaking changes expected. No migration required beyond updating downstream GRPO callers to use the available _get_distr_params_chunk path when needed.

How was this tested (or how to run locally)

  • Validation attempted in this environment:
    • pre-commit run -a (failed: pre-commit not installed in this environment).
    • python -m pytest could not run because ML dependencies are missing (torch is not installed in this VM), so the full local test run did not complete.
  • Repro instructions for reviewer:
    1. Clone the fork and check out the branch:
      • git checkout pr/port-smolvla-grpo-env-patch
    2. Install project/deps per repository instructions.
    3. Run targeted tests for smolvla policy behavior:
      • python -m pytest -q tests/policies/smolvla/test_smolvla_rtc.py
    4. Run lint:
      • pre-commit install
      • pre-commit run -a

Checklist (required before merge)

  • Linting/formatting run (pre-commit run -a)
  • All tests pass locally (pytest)
  • Documentation updated
  • CI is green
  • Community Review: I have reviewed another contributor's open PR and linked it here: # (insert PR number/link)

Reviewer notes

  • Focus review on src/lerobot/policies/smolvla/modeling_smolvla.py, especially:
    • _get_distr_params_chunk
    • select_action return-shape handling
    • import path updates to policy modules
  • Confirm no environment runtime files are changed and this PR includes only intended policy API backport changes.
  • Diff scope is intentionally minimal: one file changed (src/lerobot/policies/smolvla/modeling_smolvla.py).
