Backport smolvla grpo API hooks by kufupa · Pull Request #3509 · huggingface/lerobot

kufupa · 2026-05-05T10:14:59Z

Title

feat(policies): backport smolvla grpo api hooks

Summary / Motivation

This PR ports SmolVLA GRPO API support from a locally patched, working lerobot runtime into the upstream huggingface/lerobot repository as a minimal, isolated change. It updates SmolVLAPolicy to expose distribution parameters for GRPO-style training paths and aligns its call patterns with the policy module imports currently used in the runtime. This unblocks GRPO experiments that rely on probability-distribution parameters (via _get_distr_params_chunk) without altering environment setup or runtime behavior outside this policy surface.

Related issues

Fixes / Closes: # (if any)
Related: # (if any)

What changed

Backported src/lerobot/policies/smolvla/modeling_smolvla.py from the locally patched runtime environment into this repo:
- Added _get_distr_params_chunk for the GRPO distribution-parameter path.
- Added distr_queue bookkeeping setup in SmolVLAPolicy.
- Updated the select_action flow to unpack the new tuple returned by model sampling (actions, _ = model.sample_actions(...)).
- Updated internal imports in modeling_smolvla.py to use the current in-repo policy module paths.

Together these are the GRPO-specific API hooks required by the smolvla-grpo pipeline; the change scope stays limited to the policy implementation.
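
For reviewers who want a mental model of the list above, here is a minimal, torch-free sketch of the pattern. It is not the code in modeling_smolvla.py: ToyModel, ToyPolicy, the (mean, std) payload, and the choice to fill distr_queue inside _get_distr_params_chunk are all assumptions made for illustration. Only the names _get_distr_params_chunk, distr_queue, select_action, and the `actions, _ = model.sample_actions(...)` unpacking come from this PR.

```python
from collections import deque


class ToyModel:
    """Hypothetical stand-in for the real model."""

    def sample_actions(self, batch, chunk_size=3):
        # Returns a 2-tuple: (actions, distribution parameters).
        actions = [[float(i)] for i in range(chunk_size)]
        distr_params = [(a[0], 1.0) for a in actions]  # assumed (mean, std) pairs
        return actions, distr_params


class ToyPolicy:
    """Hypothetical policy showing the queue bookkeeping and tuple unpacking."""

    def __init__(self, chunk_size=3):
        self.model = ToyModel()
        self.chunk_size = chunk_size
        self.reset()

    def reset(self):
        # One queue for actions, one for distribution parameters.
        self.action_queue = deque(maxlen=self.chunk_size)
        self.distr_queue = deque(maxlen=self.chunk_size)

    def select_action(self, batch):
        if not self.action_queue:
            # The second tuple element is ignored on the inference path.
            actions, _ = self.model.sample_actions(batch, self.chunk_size)
            self.action_queue.extend(actions)
        return self.action_queue.popleft()

    def _get_distr_params_chunk(self, batch):
        # Assumed behavior: return the parameters for a whole chunk
        # and remember them in distr_queue.
        _, distr_params = self.model.sample_actions(batch, self.chunk_size)
        self.distr_queue.extend(distr_params)
        return distr_params
```

In this sketch, select_action keeps returning one action per call and never exposes the extra tuple element, while a GRPO training loop would call _get_distr_params_chunk separately.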

No breaking changes expected. No migration required beyond updating downstream GRPO callers to use the available _get_distr_params_chunk path when needed.
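
One way a downstream GRPO caller might adopt the hook defensively is sketched below. The helper name get_distr_params, the single `batch` argument, and the returned payload are assumptions for illustration; the actual signature of _get_distr_params_chunk should be taken from the diff.

```python
def get_distr_params(policy, batch):
    """Call the GRPO hook if the policy exposes it, otherwise fail loudly."""
    hook = getattr(policy, "_get_distr_params_chunk", None)
    if not callable(hook):
        raise NotImplementedError(
            f"{type(policy).__name__} does not expose _get_distr_params_chunk"
        )
    return hook(batch)


class PatchedPolicy:
    # Hypothetical stand-in for a policy that has the hook.
    def _get_distr_params_chunk(self, batch):
        return {"mean": [0.0], "std": [1.0]}  # made-up payload


class LegacyPolicy:
    # Hypothetical stand-in for a policy without the hook.
    pass
```

Checking for the attribute at call time lets the same training script run against both patched and unpatched installs and report a clear error on the latter.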

How was this tested (or how to run locally)

Validation attempted in this environment:
- pre-commit run -a (failed: pre-commit not installed in this environment).
- python -m pytest attempts were blocked by missing ML dependencies (torch is not installed in this VM), so the full local test run could not complete.
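
Since both validation attempts failed on missing tooling, a quick preflight check can confirm the environment before running the steps below. This is a small sketch using only the standard library; the helper name missing_modules and the example module list are assumptions, not part of the repository.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Example module names; adjust to whatever the test run needs.
    missing = missing_modules(["torch", "pytest"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```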
Repro instructions for reviewer:
1. Clone the fork and check out the branch:
  - git checkout pr/port-smolvla-grpo-env-patch
2. Install the project and its dependencies per the repository instructions.
3. Run targeted tests for smolvla policy behavior:
  - python -m pytest -q tests/policies/smolvla/test_smolvla_rtc.py
4. Run lint:
  - pre-commit install
  - pre-commit run -a

Checklist (required before merge)

Linting/formatting run (pre-commit run -a)
All tests pass locally (pytest)
Documentation updated
CI is green
Community Review: I have reviewed another contributor's open PR and linked it here: # (insert PR number/link)

Reviewer notes

Focus review on src/lerobot/policies/smolvla/modeling_smolvla.py, especially:
- _get_distr_params_chunk
- select_action return-shape handling
- import path updates to policy modules
Confirm no environment runtime files are changed and this PR includes only intended policy API backport changes.
Diff scope is intentionally minimal: one file changed (src/lerobot/policies/smolvla/modeling_smolvla.py).

feat(policies): backport smolvla grpo api hooks

490442c

kufupa wants to merge 1 commit into huggingface:main from kufupa:pr/port-smolvla-grpo-env-patch
