feat(rollout): decouple policy deployment from data recording with new `lerobot-rollout` CLI by imstevenpmwork · Pull Request #3413 · huggingface/lerobot

imstevenpmwork · 2026-04-19T22:48:26Z

Summary / Motivation

Introduces a clean architectural split between data collection (lerobot-record) and policy deployment (lerobot-rollout). Previously, lerobot-record handled teleoperation, policy inference, RTC async inference, and DAgger-style human-in-the-loop corrections in a single script with growing conditional complexity. This PR extracts all policy-related execution into a new lerobot.rollout module with a dedicated CLI, leaving lerobot-record as a focused teleoperation-only tool.
The new module follows a Strategy + Factory + Dependency Injection architecture: strategies are pluggable (base, sentry, highlight, dagger), inference backends are swappable (sync, rtc), and all shared state is assembled once into a RolloutContext bundle. This makes each deployment mode independently testable and extensible without touching the others.

Related issues

Related: N/A

What changed

New src/lerobot/rollout/ module — Complete policy deployment engine with:
- 4 strategies: BaseStrategy (autonomous, no recording), SentryStrategy (continuous recording + auto-upload), HighlightStrategy (ring buffer + keystroke save), DAggerStrategy (human-in-the-loop RaC with keyboard/pedal input)
- 2 inference engines: SyncInferenceEngine (inline) and RTCInferenceEngine (async background thread with action queue)
- RolloutContext dependency injection bundle with sub-contexts (runtime, hardware, policy, processor, dataset)
- ThreadSafeRobot wrapper for lock-protected concurrent hardware access
- RolloutRingBuffer memory-bounded circular buffer for highlight capture
- Polymorphic CLI dispatch via draccus ChoiceRegistry (--strategy.type=..., --inference.type=...)
New lerobot-rollout CLI (src/lerobot/scripts/lerobot_rollout.py) registered in pyproject.toml
Simplified lerobot-record — Stripped all policy/inference code; teleop is now required.
Extracted shared utilities:
- DatasetRecordConfig → src/lerobot/configs/dataset.py (shared between record and rollout)
- ActionInterpolator → src/lerobot/utils/action_interpolator.py (decoupled from RTC policy module)
- start_pedal_listener → src/lerobot/utils/pedal.py (evdev foot pedal support with graceful degradation)
Thread safety fix in policies/rtc/action_queue.py — added missing lock guards on qsize(), empty(), get_action_index()
Removed examples/hil/hil_data_collection.py (1184 lines), examples/hil/hil_utils.py, examples/rtc/eval_with_real_robot.py (673 lines) — replaced by lerobot-rollout
New examples — rollout.py and updated evaluate.py for lekiwi, phone-to-so100, and so100-EE setups
New docs — docs/source/inference.mdx covering all strategies and examples; updated hil_data_collection.mdx, il_robots.mdx, rtc.mdx

How was this tested (or how to run locally)

Full Tests CI run: https://github.com/huggingface/lerobot/actions/runs/25022252240
Minimal Tests added: tests/test_rollout.py — covers config validation, ring buffer, thread-safe robot, strategy/inference factory dispatch, DAgger state machine transitions, and context dataclass fields. Run with pytest -q tests/test_rollout.py -svv
Basic Manual HW checks, 2 cameras & OMX robot (mostly for testing the rollout strategies):
- lerobot-rollout --strategy.type=base
- lerobot-rollout --strategy.type=sentry
- lerobot-rollout --strategy.type=dagger with keyboard
- lerobot-rollout --strategy.type=highlight with keyboard
- lerobot-record pure teleoperation — no regressions
Deep E2E tests (same setup):
- I tested the behavior with a currently working workflow in main and in this branch. Results were successful as the behavior was the same between these two.
- Model: https://huggingface.co/imstevenpmwork/super_poulain_smolvla
- Dataset: https://huggingface.co/datasets/imstevenpmwork/super_poulain_draft
2nd Deep E3E test:
- Using the t-shirt folding setup and already working model, we verified there's no regression in the performance when using now lerobot-rollout. The robot folded the t-shirt successfully, proving the compatibility with old working models:
- Setup: https://huggingface.co/spaces/lerobot/robot-folding
- Model: https://huggingface.co/lerobot/folding_latest

Checklist (required before merge)

Linting/formatting run (pre-commit run -a)
All tests pass locally (pytest)
Documentation updated
CI is green
Community Review: I have reviewed another contributor's open PR and linked it here: #

Reviewer notes

RTC implementation has a lot of room for improvement. I intentionally tried to touch policies/rtc as least as possible. We can go back to improving it in the near-future.
Tests are minimal, but it is because we will refactor the tests/ strategy repository-wide in the future.
Anyone in the community is free to review the PR.
More development details in Feat/decouple record script #3380

HuggingFaceDocBuilderDev · 2026-04-19T22:50:16Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

imstevenpmwork · 2026-04-20T00:03:37Z

@claude Review this PR

claude · 2026-04-20T00:03:56Z

Claude finished @imstevenpmwork's task in 4m 1s —— View job

PR Review: `lerobot-rollout` CLI

I read all 54 changed files in detail. This is a well-structured PR that cleanly separates policy deployment from data recording. The Strategy + Factory + DI architecture is solid and the code is well-documented. Below is my full technical review.

Overall Assessment

Architecture: The decomposition into strategies (base/sentry/highlight/dagger), inference backends (sync/rtc), and context injection is clean and extensible. Each strategy is self-contained and the InferenceEngine ABC allows backend-agnostic action dispatch via send_next_action. Strong separation of concerns.

Code quality: Consistently documented, good logging, clean module boundaries. The codebase follows existing LeRobot patterns (draccus ChoiceRegistry, lazy imports, __all__ exports).

Bugs & Correctness Issues

1. DatasetRecordConfig.__post_init__ always appends timestamp — breaks --resume
src/lerobot/configs/dataset.py:74-77

def __post_init__(self) -> None:
    if self.repo_id:
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        self.repo_id = f"{self.repo_id}_{timestamp}"

When --resume=true, build_rollout_context calls LeRobotDataset.resume(cfg.dataset.repo_id) — but repo_id already has a fresh timestamp appended, so it will attempt to resume a dataset that doesn't exist. The timestamp should be conditional on not resuming, or this logic should be handled at the call site rather than in __post_init__.

Fix this →

2. DAgger strategy missing dataset validation in RolloutConfig.__post_init__
src/lerobot/rollout/configs.py:191-193

needs_dataset = isinstance(self.strategy, (SentryStrategyConfig, HighlightStrategyConfig))

DAggerStrategyConfig is not included in needs_dataset, but both _run_continuous and _run_corrections_only call dataset.add_frame() and dataset.save_episode() unconditionally. A missing --dataset.repo_id with DAgger will crash at runtime instead of failing fast at config validation.

Fix this →

3. DAgger correction phase incorrectly applies record_stride to human frames
src/lerobot/rollout/strategies/dagger.py:449 and 615

record_stride = max(1, cfg.interpolation_multiplier)
# ...
if record_tick % record_stride == 0:  # in CORRECTING phase
    dataset.add_frame(frame)
record_tick += 1

When interpolation_multiplier > 1, this skips every Nth human correction frame. But during corrections, the teleoperator provides frames at base FPS — there's no interpolation. This would discard human demonstration data. record_stride should only apply during the autonomous phase.

Fix this →

4. send_next_action silently produces partial action dicts
src/lerobot/rollout/strategies/core.py:269

action_dict = {k: interp[i].item() for i, k in enumerate(ordered_keys) if i < len(interp)}

If the interpolated tensor has fewer elements than ordered_action_keys, some keys are silently dropped. Sending a partial action dict to the robot could cause unpredictable hardware behavior. This should be a hard error, or at minimum a warning.

5. Highlight strategy has no episode lock on background push
src/lerobot/rollout/strategies/highlight.py:257-277 — _background_push calls safe_push_to_hub without _episode_lock, unlike Sentry (sentry.py:211-222) and DAgger (dagger.py:719-730) which both hold the lock. If the main loop calls dataset.save_episode() concurrently with a push, there could be a data race on the dataset's on-disk state.

Design Concerns

6. Global default constants changed — affects all users
src/lerobot/datasets/utils.py:74-75

DEFAULT_DATA_FILE_SIZE_IN_MB = 50   # was 100
DEFAULT_VIDEO_FILE_SIZE_IN_MB = 100  # was 200

These constants are used throughout the dataset system, not just by rollout. Halving the chunk sizes changes episode rotation behavior for existing pipelines, training scripts, and any code that depends on these defaults. This should be called out in the PR description and potentially discussed separately.

7. PEFT-via-record test removed with no rollout equivalent
tests/test_cli_peft.py — 82 lines deleted (test_peft_record_loads_policy). This verified that PEFT adapters could be loaded through the record CLI with policy inference. Since lerobot-record no longer loads policies, the test was removed — but there's no equivalent test for PEFT loading through lerobot-rollout. This is a test coverage gap for a supported workflow.

8. No integration tests for strategy control loops
The tests in test_rollout.py cover configs, ring buffer, state machine, and factories — but none exercise a strategy's run() method, even with mocked hardware. The complex control loops in sentry, highlight, and dagger (episode rotation, push timing, state transitions during recording) are only tested manually per the PR description.

Minor Issues

9. Dead code: _teleop_smooth_move_to
src/lerobot/rollout/strategies/dagger.py:173-194 — This helper is defined but never called. All 5 call sites are commented out with TODO notes. Should either be removed or the TODO resolved.

10. Duplicate frame in Highlight ring buffer after episode save
src/lerobot/rollout/strategies/highlight.py:170-187 — When the user presses save during live recording, the frame at line 170 is added to the dataset and the episode is saved. Then at line 184, _recording_live is now False so the same frame is appended to the ring buffer at line 187. This means the first frame of the next highlight will duplicate the last frame of the previous episode.

11. RolloutRingBuffer single-threaded assumption has no runtime guard
src/lerobot/rollout/ring_buffer.py:41-44 — The docstring notes the buffer is single-threaded, but there's no assertion or lock to catch accidental misuse. Since _current_bytes accounting would break under concurrent access, a lightweight debug-mode check could prevent subtle bugs.

12. context.py:build_rollout_context is ~200 lines
The function performs 8 sequential phases (policy loading, processors, hardware, features, dataset, policy processors, inference, assembly). While the ordering matters for fail-fast behavior, extracting some phases into named helpers would improve readability.

What Looks Good

Thread-safe robot wrapper (robot_wrapper.py) — Clean lock-protected I/O with read-only property proxying. Correct design for RTC concurrent access.
DAgger state machine (dagger.py:88-164) — Transition validation table with thread-safe consume/request pattern is robust and well-tested.
Ring buffer memory accounting (ring_buffer.py) — Proper byte estimation for torch tensors, numpy arrays, and primitives. Eviction loop is correct.
RTC error handling (rtc.py) — Consecutive error counting with threshold, graceful global shutdown propagation, and compile warmup management are production-quality.
Action interpolator extraction — Clean decoupling from RTC module to utils/ with backward-compat re-export.
Signal handler consolidation — ProcessSignalHandler moved from rl/process.py to utils/process.py for shared use across record and rollout.

Summary

This is a significant, well-designed architectural improvement. The separation of lerobot-record (teleop-only) from lerobot-rollout (policy deployment) is a clean split that will reduce maintenance burden and enable independent strategy evolution.

Must-fix before merge: Issues 1-3 (resume broken by timestamp, DAgger dataset validation, correction frame loss).
Should-fix: Issues 4-5 (partial action dict, highlight push lock).
Worth discussing: Issue 6 (global default changes) and test coverage (7-8).

…move rename_map config

imstevenpmwork · 2026-04-22T13:32:32Z

@claude Review this PR

claude · 2026-04-22T13:32:50Z

Claude finished @imstevenpmwork's task in 3m 57s —— View job

PR Review: `lerobot-rollout` CLI

I read all changed files in detail — the new rollout/ module, strategies, inference engines, config/context, record script changes, tests, docs, and examples. This review is against the latest commit d70c3ba.

Overall Assessment

This is a well-designed architectural improvement. The Strategy + Factory + DI decomposition is clean, the InferenceEngine ABC allows backend-agnostic dispatch, and the RolloutContext bundle keeps shared state organized. The code is consistently documented, follows existing LeRobot patterns (draccus ChoiceRegistry, lazy imports, __all__ exports), and the separation of lerobot-record (teleop-only) from lerobot-rollout (policy deployment) is a clean split.

Several issues from the earlier Claude review have been addressed (timestamp handling, DAgger dataset validation, partial action dict check). The issues below are either still present or newly identified.

Bugs & Correctness Issues

1. DAgger correction phase incorrectly applies record_stride to human frames
src/lerobot/rollout/strategies/dagger.py:449 and 624

record_stride = max(1, cfg.interpolation_multiplier)
# ...
if record_tick % record_stride == 0:  # in CORRECTING phase
    dataset.add_frame(frame)
record_tick += 1

When interpolation_multiplier > 1, this skips every Nth human correction frame. But during corrections, the teleoperator provides frames at base FPS — there's no interpolation happening. This discards human demonstration data. record_stride should only apply during the AUTONOMOUS phase; all CORRECTING frames should always be recorded (stride=1). Both _run_continuous (line 449) and _run_corrections_only (line 624) are affected.

Fix this →

2. Highlight strategy: duplicate frame in ring buffer after episode save
src/lerobot/rollout/strategies/highlight.py:170-189
When the user presses save while live recording:

Line 171: dataset.add_frame(frame) — frame is added as last frame of episode
Line 173: dataset.save_episode() — episode is saved
Line 179: self._recording_live.clear() — back to buffering mode
Line 186-189: _recording_live is now False, so the same frame is appended to ring.append(frame) at line 189

The code comment at lines 155-159 claims this doesn't happen ("a frame finalised by save_episode() is never re-added to the next episode") — but that's about re-adding to the current episode. The duplicate is between episodes: the last frame of episode N becomes the first frame buffered for episode N+1. Fix by adding continue or a guard after the save branch.

Fix this →

3. Doc example repo_id violates validation constraint
src/lerobot/rollout/context.py:349 enforces:

if not cfg.dataset.repo_id.startswith("rollout_"):
    raise ValueError(...)

But docs/source/inference.mdx:62 shows:

--dataset.repo_id=${HF_USER}/eval_data

If HF_USER=alice, the repo_id is alice/eval_data, which does not start with rollout_ and would fail validation. This appears in the Sentry doc example. Either the docs need updating to use rollout_ prefixed names, or the validation should extract the repo name after the / before checking the prefix.

Fix this →

Design Concerns

4. Global default constants changed — affects all users
src/lerobot/datasets/utils.py:74-75

DEFAULT_DATA_FILE_SIZE_IN_MB = 50   # was 100
DEFAULT_VIDEO_FILE_SIZE_IN_MB = 100  # was 200

These constants are used throughout the dataset system, not just by rollout. Halving the chunk sizes changes episode rotation behavior for existing pipelines, training scripts, and any code that depends on these defaults. This should be mentioned in the PR description or split into a separate change.

5. PEFT-via-record test removed with no rollout equivalent
tests/test_cli_peft.py — 82 lines deleted (test_peft_record_loads_policy). The PEFT loading path exists in context.py:195-203 but has no automated test. This is a test coverage gap for a supported workflow.

6. No integration tests for strategy control loops
Tests in test_rollout.py cover configs, ring buffer, state machine, and factories — but none exercise a strategy's run() method, even with mocked hardware. The complex control loops in sentry, highlight, and dagger (episode rotation, push timing, state transitions during recording) are only tested manually.

7. Dead code: _teleop_smooth_move_to
src/lerobot/rollout/strategies/dagger.py:173-194 — This helper is defined but never called. All call sites are commented out with TODO notes. Should be removed and added back when the TODO is resolved.

Minor Issues

8. logging.warning vs logger.warning inconsistency
The slow-loop warnings in base.py:77, sentry.py:164, highlight.py:196, dagger.py:512, dagger.py:656 all use logging.warning(...) (the module-level function) instead of logger.warning(...) (the per-module logger). This means these warnings bypass the module's log hierarchy and won't be filterable via lerobot.rollout.strategies.* log config. Every other log call in these files correctly uses logger.

9. context.py:282 — v is float identity check is fragile

observation_features_hw = {
    k: v for k, v in all_obs_features.items()
    if isinstance(v, tuple) or (v is float and k.endswith(".pos"))
}

v is float checks identity (is v literally the float type object?). This works because the feature system uses float as a sentinel, but v == float would be clearer and less surprising to future developers.

What Looks Good

Thread-safe robot wrapper (robot_wrapper.py) — Clean lock-protected I/O with read-only property proxying. Correct design for RTC concurrent access.
DAgger state machine (dagger.py:88-164) — Transition validation table with thread-safe consume/request pattern is robust. Well-tested in test_rollout.py.
Ring buffer memory accounting (ring_buffer.py) — Proper byte estimation for torch tensors, numpy arrays, and primitives. Eviction loop is correct.
RTC error handling (rtc.py) — Consecutive error counting with threshold, graceful global shutdown propagation, and compile warmup management are production-quality.
Action interpolator extraction — Clean decoupling from RTC module to utils/ with backward-compat re-export.
DatasetRecordConfig.stamp_repo_id() — Good fix from previous review: explicit method call instead of __post_init__ side effect. Resume path correctly skips stamping.
Config validation (configs.py:218-317) — Comprehensive fail-fast checks: strategy-specific dataset requirements, streaming encoding enforcement, device/task propagation. The DAgger num_episodes resolution from dataset config is a nice UX touch.
Inference engine ABC (base.py) — Minimal contract with sensible no-op defaults for notify_observation, pause, resume. Makes new backends easy to add.
Background push pattern — Consistent across sentry, highlight, and dagger: single-worker ThreadPoolExecutor with _episode_lock serialization. No push is silently dropped.

Summary

The architectural split is sound and the code quality is high. Three issues should be fixed before merge:

Priority	Issue	Location
Must-fix	DAgger correction frames dropped by `record_stride`	`dagger.py:449, 624`
Must-fix	Highlight duplicate frame in ring buffer after save	`highlight.py:170-189`
Should-fix	Doc/validation mismatch on `rollout_` prefix	`inference.mdx:62` / `context.py:349`
Should-discuss	Global default constant changes (50/100 MB)	`datasets/utils.py:74-75`

The remaining items (dead code, logging inconsistency, missing tests) are non-blocking but worth tracking.

imstevenpmwork · 2026-04-22T14:44:40Z

Claude's important review comments were addressed in commit: 4499519
All the rest are either not true or we can live with them

…olicy frequency instead of interpolator rate Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>

Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>

also reducing default length to 5 minutes

…dataset (#3470)

Maximellerbach · 2026-04-27T15:47:07Z

Tested this PR with:

ACT
SmolVLA

Both trained on this dataset: omx_multicubes_merged

Checked:

base + act (sync)
base + smolvla (sync)
sentry + act (sync) -> example output dataset
highlight + act (sync) -> example output dataset
dagger + act (sync) -> example output dataset
base + smolva (rtc)

imstevenpmwork · 2026-04-27T21:17:03Z

@claude Review this PR

claude · 2026-04-27T21:17:20Z

Claude finished @imstevenpmwork's task in 3m 53s —— View job

PR Review: `lerobot-rollout` CLI

Read all new rollout module files
Read modified files (record script, configs, datasets, policies)
Read tests and examples
Read docs changes
Cross-reference with previous review findings
Post comprehensive review

I read all 63 changed files in detail. This is a third-round review against commit 063645c, cross-referenced with the two prior Claude reviews and the fixes applied since.

Overall Assessment

The architecture is solid and well-executed. The Strategy + Factory + DI decomposition is clean, the InferenceEngine ABC provides backend-agnostic dispatch, and the RolloutContext bundle keeps shared state organized. The code consistently follows existing LeRobot patterns (draccus ChoiceRegistry, lazy imports, __all__ exports).

Many issues from previous reviews have been addressed. This review focuses on remaining and newly identified issues.

Previous Issues — Status

Issue	Status
`DatasetRecordConfig.__post_init__` timestamp breaks resume	Fixed (`stamp_repo_id()` method)
DAgger dataset validation missing	Fixed (added to `needs_dataset`)
Partial action dict silently drops keys	Fixed (now raises `ValueError` at `core.py:299-300`)
Highlight duplicate frame in ring buffer	Fixed (`continue` at `highlight.py:175`)
Doc/validation mismatch on `rollout_` prefix	Fixed (docs now use `rollout_` prefixed names)
`logging.warning` vs `logger.warning` inconsistency	Fixed (all strategy files now use `logger`)
DAgger correction frames dropped by `record_stride`	Acknowledged — author decided to keep current behavior
Global default constants changed (50/100 MB)	Still present — see below
Dead code `_teleop_smooth_move_to`	Still present — minor, tracked by TODO
PEFT test removed with no rollout equivalent	Still missing

Bugs & Correctness Issues

1. estimate_max_episode_seconds includes channel dimension in pixel count
src/lerobot/rollout/strategies/core.py:240-244

# (H, W, C)
# We want to extract the spatial dimensions.
if len(shape) == 3:
    pixels = shape[0] * shape[1] * shape[2]  # includes C

The comment says "extract the spatial dimensions" but the code includes the channel dimension. Bits-per-pixel (conservative_bpp = 0.1) is a per-spatial-pixel metric. Including channels (C=3) inflates the bitrate estimate by 3x, which makes the computed episode duration 3x shorter than intended. Episodes will rotate ~3x more frequently than the target_video_file_size_mb warrants.

Fix: use pixels = shape[0] * shape[1] for spatial-only dimensions.

Fix this →

2. Record script eval_ prefix check doesn't handle namespaced repo IDs
src/lerobot/scripts/lerobot_record.py:392

if cfg.dataset.repo_id.startswith("eval_"):

If repo_id = "user/eval_data", this check passes because the string starts with "user/", not "eval_". The equivalent check in rollout at context.py:353-354 correctly extracts the name part: repo_name = cfg.dataset.repo_id.split("/", 1)[-1]. The record check should be consistent:

repo_name = cfg.dataset.repo_id.split("/", 1)[-1]
if repo_name.startswith("eval_"):

Fix this →

Design Concerns

3. Global default constants halved — affects all existing users
src/lerobot/datasets/utils.py:74-75

DEFAULT_DATA_FILE_SIZE_IN_MB = 50   # was 100
DEFAULT_VIDEO_FILE_SIZE_IN_MB = 100  # was 200

These constants are used throughout the dataset system (not just rollout). Halving them changes episode rotation and file chunking behavior for existing pipelines, training scripts, and any code that depends on these defaults. Combined with issue 1 (3x shorter episodes from channel inclusion), sentry/dagger continuous recording may produce very small episodes. This change should be discussed separately or at minimum called out in the PR description.

4. v is float identity check in feature filtering
src/lerobot/rollout/context.py:286

if isinstance(v, tuple) or (v is float and k.endswith(".pos"))

v is float checks identity — whether v is literally the float type object. This works because the feature system uses float as a sentinel, but it's fragile and surprising. v == float is semantically clearer and less likely to confuse future developers.

5. DAgger record_stride on correction frames (known)
src/lerobot/rollout/strategies/dagger.py:462, 641 — When interpolation_multiplier > 1, human correction frames are subsampled. The author acknowledged this and chose to keep it. Just noting it's still present for future reference — a comment in the code explaining this intentional behavior would help.

Minor Issues

6. Dead code: _teleop_smooth_move_to
src/lerobot/rollout/strategies/dagger.py:182-203 — Defined but never called; all call sites are commented out with TODOs. Low priority, tracked by the TODO.

7. estimate_max_episode_seconds test uses (C, H, W) shape
tests/test_rollout.py:263

features = {"cam": {"dtype": "video", "shape": (3, 480, 640)}}

The function's comment says shapes are (H, W, C), but the test uses (3, 480, 640) which is (C, H, W). Since the function currently multiplies all three dimensions regardless of order, this doesn't affect the result — but it indicates either the comment or the test has the wrong shape convention.

What Looks Good

Clean architectural split — lerobot-record is now teleop-only, lerobot-rollout handles all policy deployment. The boundary is well-defined.
Thread-safe robot wrapper (robot_wrapper.py) — Lock-protected I/O with read-only property proxying. Correct for RTC concurrent access.
DAgger state machine (dagger.py:88-164) — Transition validation table with thread-safe consume/request is robust and well-tested.
Inference engine ABC (base.py) — Minimal contract with sensible no-op defaults for notify_observation, pause, resume. Makes new backends easy to add.
stamp_repo_id() fix — Clean separation: explicit method call at create time, skipped on resume. Good fix from prior review.
DatasetRecordConfig extraction — Shared between record and rollout, eliminates config duplication.
Action interpolator extraction — Clean decoupling from RTC module to utils/ with backward-compat re-export.
RTC error handling (rtc.py) — Consecutive error counting, graceful global shutdown propagation, compile warmup management. Production-quality.
Background push pattern — Consistent across sentry, highlight, dagger: single-worker ThreadPoolExecutor with _episode_lock serialization.
Observation processing throttling (core.py:72-93) — _process_observation_and_notify only runs the expensive pipeline on policy ticks, caching results for interpolated ticks.
Config validation (configs.py:224-319) — Comprehensive fail-fast checks including streaming enforcement per strategy, DAgger num_episodes resolution, device propagation.
Examples — so100_to_so100_EE/rollout.py and phone_to_so100/rollout.py are clean demonstrations of the programmatic API with custom kinematics processors.
Docs — inference.mdx is thorough with clear examples for every strategy and backend combination.

Summary

This is a well-designed, significant architectural improvement. The separation of concerns is clean, the code is consistently documented, and hardware testing confirms no regressions.

Priority	Issue	Location
Should-fix	Episode duration 3x shorter due to channel dim in pixel count	`core.py:243`
Should-fix	`eval_` prefix check bypassed for namespaced repo IDs	`lerobot_record.py:392`
Worth discussing	Global default constants halved (50/100 MB)	`datasets/utils.py:74-75`
Minor	`v is float` identity check fragile	`context.py:286`
Minor	Test shape convention mismatch	`test_rollout.py:263`

The remaining items are non-blocking. Overall this is in good shape for merge after addressing issues 1-2.

Resolve conflicts: - groot_n1.py: take upstream's plain class annotations (the @DataClass decorator was removed upstream, so field() defaults no longer apply). - lerobot_record.py: keep HEAD's full file. Upstream's rollout-decoupling refactor (huggingface#3413) removes policy support from lerobot_record entirely, but HEAD has multiple features built on policy-in-record (action interpolation, torque visualizer with policy mode, friction comp, record_effort). The new lerobot-rollout system from upstream is still added as a parallel path; existing lerobot_record features are preserved. - uv.lock: regenerated.

feat(scripts): lerobot-rollout

bc06cb4

imstevenpmwork self-assigned this Apr 19, 2026

imstevenpmwork marked this pull request as ready for review April 19, 2026 23:01

imstevenpmwork requested review from pkooij and s1lent4gnt April 19, 2026 23:03

huggingface deleted a comment from claude Bot Apr 19, 2026

This comment was marked as resolved.

Sign in to view

imstevenpmwork added 7 commits April 20, 2026 15:34

fix(rollout) require dataset in dagger + use duration too

14c7a25

fix(docs): dagger num_episodes

02d8a34

test(rollout): fix expectations

b49e401

fix(rollout): features check

195b777

fix(rollout): device and task propagation + feature pos + warn fps + …

8d1401a

…move rename_map config

docs(rollout): edit rename_map instructions

bff2d50

chore(rollout): multiple minor improvements

d70c3ba

imstevenpmwork and others added 2 commits April 22, 2026 16:37

chore(rollout): address coments + minor improvements

4499519

Merge branch 'main' into feat/lerobot-rollout

6ee093d

Merge branch 'main' into feat/lerobot-rollout

682d5f2

imstevenpmwork and others added 4 commits April 22, 2026 18:47

fix(rollout): enable default

f29ce39

fix(tests): default value RTCConfig

eb9519e

fix(rollout): robot_observation_processor and notify_observation at p…

8c4d4c8

…olicy frequency instead of interpolator rate Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>

fix(rollout): prevent relativeactions with sync inference engine

96471b5

Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>

imstevenpmwork force-pushed the feat/lerobot-rollout branch from c59c737 to 96471b5 Compare April 27, 2026 11:15

fix(rollout): rtc reanchor to non normalized state

c505a6b

Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>

github-actions Bot added the processor Issue related to processor label Apr 27, 2026

Maximellerbach and others added 4 commits April 27, 2026 15:42

fix(rollout): fixing the episode length to use hwc (#3469)

bef6796

also reducing default length to 5 minutes

feat(rollout): go back to initial position is now a config

7451813

fix(rollout): properly propagating video_files_size_in_mb to lerobot_…

bf5148c

…dataset (#3470)

chore(rollout): note about dagger correction stage

d662113

imstevenpmwork and others added 3 commits April 27, 2026 18:58

chore(docs): update comments and docstring

629781e

fix(test): move rtc relative out of rollout module

f7434d8

Merge branch 'main' into feat/lerobot-rollout

063645c

fix(rollout): address the review comments

c9601aa

imstevenpmwork merged commit ca87ccd into main Apr 27, 2026
21 checks passed

imstevenpmwork deleted the feat/lerobot-rollout branch April 27, 2026 22:57

samet-akcay mentioned this pull request Apr 29, 2026

feat(export): add modular policy export with ONNX and OpenVINO runtime samet-akcay/lerobot#5

Closed

samet-akcay mentioned this pull request May 1, 2026

feat(export): add modular policy export with ONNX and OpenVINO runtime #3493

Open

imstevenpmwork mentioned this pull request May 7, 2026

Release 0.6.0 #3134

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(rollout): decouple policy deployment from data recording with new `lerobot-rollout` CLI#3413

feat(rollout): decouple policy deployment from data recording with new `lerobot-rollout` CLI#3413
imstevenpmwork merged 24 commits intomainfrom
feat/lerobot-rollout

imstevenpmwork commented Apr 19, 2026 •

edited

Loading

Uh oh!

HuggingFaceDocBuilderDev commented Apr 19, 2026

Uh oh!

imstevenpmwork commented Apr 20, 2026

Uh oh!

claude Bot commented Apr 20, 2026 •

edited

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

imstevenpmwork commented Apr 22, 2026

Uh oh!

claude Bot commented Apr 22, 2026 •

edited

Loading

Uh oh!

imstevenpmwork commented Apr 22, 2026

Uh oh!

Maximellerbach commented Apr 27, 2026

Uh oh!

imstevenpmwork commented Apr 27, 2026

Uh oh!

claude Bot commented Apr 27, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

imstevenpmwork commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary / Motivation

Related issues

What changed

How was this tested (or how to run locally)

Checklist (required before merge)

Reviewer notes

Uh oh!

HuggingFaceDocBuilderDev commented Apr 19, 2026

Uh oh!

imstevenpmwork commented Apr 20, 2026

Uh oh!

claude Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: lerobot-rollout CLI

Overall Assessment

Bugs & Correctness Issues

Design Concerns

Minor Issues

What Looks Good

Summary

Uh oh!

This comment was marked as resolved.

Uh oh!

imstevenpmwork commented Apr 22, 2026

Uh oh!

claude Bot commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: lerobot-rollout CLI

Overall Assessment

Bugs & Correctness Issues

Design Concerns

Minor Issues

What Looks Good

Summary

Uh oh!

imstevenpmwork commented Apr 22, 2026

Uh oh!

Maximellerbach commented Apr 27, 2026

Uh oh!

imstevenpmwork commented Apr 27, 2026

Uh oh!

claude Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: lerobot-rollout CLI

Overall Assessment

Previous Issues — Status

Bugs & Correctness Issues

Design Concerns

Minor Issues

What Looks Good

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

imstevenpmwork commented Apr 19, 2026 •

edited

Loading

claude Bot commented Apr 20, 2026 •

edited

Loading

PR Review: `lerobot-rollout` CLI

claude Bot commented Apr 22, 2026 •

edited

Loading

PR Review: `lerobot-rollout` CLI

claude Bot commented Apr 27, 2026 •

edited

Loading

PR Review: `lerobot-rollout` CLI