Skip to content

feat(rollout): decouple policy deployment from data recording with new lerobot-rollout CLI#3413

Merged
imstevenpmwork merged 24 commits intomainfrom
feat/lerobot-rollout
Apr 27, 2026
Merged

feat(rollout): decouple policy deployment from data recording with new lerobot-rollout CLI#3413
imstevenpmwork merged 24 commits intomainfrom
feat/lerobot-rollout

Conversation

@imstevenpmwork
Copy link
Copy Markdown
Collaborator

@imstevenpmwork imstevenpmwork commented Apr 19, 2026

Summary / Motivation

  • Introduces a clean architectural split between data collection (lerobot-record) and policy deployment (lerobot-rollout). Previously, lerobot-record handled teleoperation, policy inference, RTC async inference, and DAgger-style human-in-the-loop corrections in a single script with growing conditional complexity. This PR extracts all policy-related execution into a new lerobot.rollout module with a dedicated CLI, leaving lerobot-record as a focused teleoperation-only tool.
  • The new module follows a Strategy + Factory + Dependency Injection architecture: strategies are pluggable (base, sentry, highlight, dagger), inference backends are swappable (sync, rtc), and all shared state is assembled once into a RolloutContext bundle. This makes each deployment mode independently testable and extensible without touching the others.

Related issues

  • Related: N/A

What changed

  • New src/lerobot/rollout/ module — Complete policy deployment engine with:
    • 4 strategies: BaseStrategy (autonomous, no recording), SentryStrategy (continuous recording + auto-upload), HighlightStrategy (ring buffer + keystroke save), DAggerStrategy (human-in-the-loop RaC with keyboard/pedal input)
    • 2 inference engines: SyncInferenceEngine (inline) and RTCInferenceEngine (async background thread with action queue)
    • RolloutContext dependency injection bundle with sub-contexts (runtime, hardware, policy, processor, dataset)
    • ThreadSafeRobot wrapper for lock-protected concurrent hardware access
    • RolloutRingBuffer memory-bounded circular buffer for highlight capture
    • Polymorphic CLI dispatch via draccus ChoiceRegistry (--strategy.type=..., --inference.type=...)
  • New lerobot-rollout CLI (src/lerobot/scripts/lerobot_rollout.py) registered in pyproject.toml
  • Simplified lerobot-record — Stripped all policy/inference code; teleop is now required.
  • Extracted shared utilities:
    • DatasetRecordConfigsrc/lerobot/configs/dataset.py (shared between record and rollout)
    • ActionInterpolatorsrc/lerobot/utils/action_interpolator.py (decoupled from RTC policy module)
    • start_pedal_listenersrc/lerobot/utils/pedal.py (evdev foot pedal support with graceful degradation)
  • Thread safety fix in policies/rtc/action_queue.py — added missing lock guards on qsize(), empty(), get_action_index()
  • Removed examples/hil/hil_data_collection.py (1184 lines), examples/hil/hil_utils.py, examples/rtc/eval_with_real_robot.py (673 lines) — replaced by lerobot-rollout
  • New examplesrollout.py and updated evaluate.py for lekiwi, phone-to-so100, and so100-EE setups
  • New docsdocs/source/inference.mdx covering all strategies and examples; updated hil_data_collection.mdx, il_robots.mdx, rtc.mdx

How was this tested (or how to run locally)

Checklist (required before merge)

  • Linting/formatting run (pre-commit run -a)
  • All tests pass locally (pytest)
  • Documentation updated
  • CI is green
  • Community Review: I have reviewed another contributor's open PR and linked it here: #

Reviewer notes

  • RTC implementation has a lot of room for improvement. I intentionally tried to touch policies/rtc as least as possible. We can go back to improving it in the near-future.
  • Tests are minimal, but it is because we will refactor the tests/ strategy repository-wide in the future.
  • Anyone in the community is free to review the PR.
  • More development details in Feat/decouple record script #3380

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@imstevenpmwork imstevenpmwork self-assigned this Apr 19, 2026
@imstevenpmwork imstevenpmwork marked this pull request as ready for review April 19, 2026 23:01
@github-actions github-actions Bot added documentation Improvements or fixes to the project’s docs policies Items related to robot policies dataset Issues regarding data inputs, processing, or datasets tests Problems with test coverage, failures, or improvements to testing configuration Problems with configuration files or settings examples Issues related to the examples labels Apr 19, 2026
@huggingface huggingface deleted a comment from claude Bot Apr 19, 2026
@imstevenpmwork
Copy link
Copy Markdown
Collaborator Author

@claude Review this PR

@claude
Copy link
Copy Markdown

claude Bot commented Apr 20, 2026

Claude finished @imstevenpmwork's task in 4m 1s —— View job


PR Review: lerobot-rollout CLI

I read all 54 changed files in detail. This is a well-structured PR that cleanly separates policy deployment from data recording. The Strategy + Factory + DI architecture is solid and the code is well-documented. Below is my full technical review.


Overall Assessment

Architecture: The decomposition into strategies (base/sentry/highlight/dagger), inference backends (sync/rtc), and context injection is clean and extensible. Each strategy is self-contained and the InferenceEngine ABC allows backend-agnostic action dispatch via send_next_action. Strong separation of concerns.

Code quality: Consistently documented, good logging, clean module boundaries. The codebase follows existing LeRobot patterns (draccus ChoiceRegistry, lazy imports, __all__ exports).


Bugs & Correctness Issues

1. DatasetRecordConfig.__post_init__ always appends timestamp — breaks --resume
src/lerobot/configs/dataset.py:74-77

def __post_init__(self) -> None:
    if self.repo_id:
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        self.repo_id = f"{self.repo_id}_{timestamp}"

When --resume=true, build_rollout_context calls LeRobotDataset.resume(cfg.dataset.repo_id) — but repo_id already has a fresh timestamp appended, so it will attempt to resume a dataset that doesn't exist. The timestamp should be conditional on not resuming, or this logic should be handled at the call site rather than in __post_init__.

Fix this →

2. DAgger strategy missing dataset validation in RolloutConfig.__post_init__
src/lerobot/rollout/configs.py:191-193

needs_dataset = isinstance(self.strategy, (SentryStrategyConfig, HighlightStrategyConfig))

DAggerStrategyConfig is not included in needs_dataset, but both _run_continuous and _run_corrections_only call dataset.add_frame() and dataset.save_episode() unconditionally. A missing --dataset.repo_id with DAgger will crash at runtime instead of failing fast at config validation.

Fix this →

3. DAgger correction phase incorrectly applies record_stride to human frames
src/lerobot/rollout/strategies/dagger.py:449 and 615

record_stride = max(1, cfg.interpolation_multiplier)
# ...
if record_tick % record_stride == 0:  # in CORRECTING phase
    dataset.add_frame(frame)
record_tick += 1

When interpolation_multiplier > 1, this skips every Nth human correction frame. But during corrections, the teleoperator provides frames at base FPS — there's no interpolation. This would discard human demonstration data. record_stride should only apply during the autonomous phase.

Fix this →

4. send_next_action silently produces partial action dicts
src/lerobot/rollout/strategies/core.py:269

action_dict = {k: interp[i].item() for i, k in enumerate(ordered_keys) if i < len(interp)}

If the interpolated tensor has fewer elements than ordered_action_keys, some keys are silently dropped. Sending a partial action dict to the robot could cause unpredictable hardware behavior. This should be a hard error, or at minimum a warning.

5. Highlight strategy has no episode lock on background push
src/lerobot/rollout/strategies/highlight.py:257-277_background_push calls safe_push_to_hub without _episode_lock, unlike Sentry (sentry.py:211-222) and DAgger (dagger.py:719-730) which both hold the lock. If the main loop calls dataset.save_episode() concurrently with a push, there could be a data race on the dataset's on-disk state.


Design Concerns

6. Global default constants changed — affects all users
src/lerobot/datasets/utils.py:74-75

DEFAULT_DATA_FILE_SIZE_IN_MB = 50   # was 100
DEFAULT_VIDEO_FILE_SIZE_IN_MB = 100  # was 200

These constants are used throughout the dataset system, not just by rollout. Halving the chunk sizes changes episode rotation behavior for existing pipelines, training scripts, and any code that depends on these defaults. This should be called out in the PR description and potentially discussed separately.

7. PEFT-via-record test removed with no rollout equivalent
tests/test_cli_peft.py — 82 lines deleted (test_peft_record_loads_policy). This verified that PEFT adapters could be loaded through the record CLI with policy inference. Since lerobot-record no longer loads policies, the test was removed — but there's no equivalent test for PEFT loading through lerobot-rollout. This is a test coverage gap for a supported workflow.

8. No integration tests for strategy control loops
The tests in test_rollout.py cover configs, ring buffer, state machine, and factories — but none exercise a strategy's run() method, even with mocked hardware. The complex control loops in sentry, highlight, and dagger (episode rotation, push timing, state transitions during recording) are only tested manually per the PR description.


Minor Issues

9. Dead code: _teleop_smooth_move_to
src/lerobot/rollout/strategies/dagger.py:173-194 — This helper is defined but never called. All 5 call sites are commented out with TODO notes. Should either be removed or the TODO resolved.

10. Duplicate frame in Highlight ring buffer after episode save
src/lerobot/rollout/strategies/highlight.py:170-187 — When the user presses save during live recording, the frame at line 170 is added to the dataset and the episode is saved. Then at line 184, _recording_live is now False so the same frame is appended to the ring buffer at line 187. This means the first frame of the next highlight will duplicate the last frame of the previous episode.

11. RolloutRingBuffer single-threaded assumption has no runtime guard
src/lerobot/rollout/ring_buffer.py:41-44 — The docstring notes the buffer is single-threaded, but there's no assertion or lock to catch accidental misuse. Since _current_bytes accounting would break under concurrent access, a lightweight debug-mode check could prevent subtle bugs.

12. context.py:build_rollout_context is ~200 lines
The function performs 8 sequential phases (policy loading, processors, hardware, features, dataset, policy processors, inference, assembly). While the ordering matters for fail-fast behavior, extracting some phases into named helpers would improve readability.


What Looks Good

  • Thread-safe robot wrapper (robot_wrapper.py) — Clean lock-protected I/O with read-only property proxying. Correct design for RTC concurrent access.
  • DAgger state machine (dagger.py:88-164) — Transition validation table with thread-safe consume/request pattern is robust and well-tested.
  • Ring buffer memory accounting (ring_buffer.py) — Proper byte estimation for torch tensors, numpy arrays, and primitives. Eviction loop is correct.
  • RTC error handling (rtc.py) — Consecutive error counting with threshold, graceful global shutdown propagation, and compile warmup management are production-quality.
  • Action interpolator extraction — Clean decoupling from RTC module to utils/ with backward-compat re-export.
  • Signal handler consolidationProcessSignalHandler moved from rl/process.py to utils/process.py for shared use across record and rollout.

Summary

This is a significant, well-designed architectural improvement. The separation of lerobot-record (teleop-only) from lerobot-rollout (policy deployment) is a clean split that will reduce maintenance burden and enable independent strategy evolution.

Must-fix before merge: Issues 1-3 (resume broken by timestamp, DAgger dataset validation, correction frame loss).
Should-fix: Issues 4-5 (partial action dict, highlight push lock).
Worth discussing: Issue 6 (global default changes) and test coverage (7-8).

whats2000

This comment was marked as resolved.

@imstevenpmwork
Copy link
Copy Markdown
Collaborator Author

@claude Review this PR

@claude
Copy link
Copy Markdown

claude Bot commented Apr 22, 2026

Claude finished @imstevenpmwork's task in 3m 57s —— View job


PR Review: lerobot-rollout CLI

I read all changed files in detail — the new rollout/ module, strategies, inference engines, config/context, record script changes, tests, docs, and examples. This review is against the latest commit d70c3ba.


Overall Assessment

This is a well-designed architectural improvement. The Strategy + Factory + DI decomposition is clean, the InferenceEngine ABC allows backend-agnostic dispatch, and the RolloutContext bundle keeps shared state organized. The code is consistently documented, follows existing LeRobot patterns (draccus ChoiceRegistry, lazy imports, __all__ exports), and the separation of lerobot-record (teleop-only) from lerobot-rollout (policy deployment) is a clean split.

Several issues from the earlier Claude review have been addressed (timestamp handling, DAgger dataset validation, partial action dict check). The issues below are either still present or newly identified.


Bugs & Correctness Issues

1. DAgger correction phase incorrectly applies record_stride to human frames
src/lerobot/rollout/strategies/dagger.py:449 and 624

record_stride = max(1, cfg.interpolation_multiplier)
# ...
if record_tick % record_stride == 0:  # in CORRECTING phase
    dataset.add_frame(frame)
record_tick += 1

When interpolation_multiplier > 1, this skips every Nth human correction frame. But during corrections, the teleoperator provides frames at base FPS — there's no interpolation happening. This discards human demonstration data. record_stride should only apply during the AUTONOMOUS phase; all CORRECTING frames should always be recorded (stride=1). Both _run_continuous (line 449) and _run_corrections_only (line 624) are affected.

Fix this →

2. Highlight strategy: duplicate frame in ring buffer after episode save
src/lerobot/rollout/strategies/highlight.py:170-189
When the user presses save while live recording:

  • Line 171: dataset.add_frame(frame) — frame is added as last frame of episode
  • Line 173: dataset.save_episode() — episode is saved
  • Line 179: self._recording_live.clear() — back to buffering mode
  • Line 186-189: _recording_live is now False, so the same frame is appended to ring.append(frame) at line 189

The code comment at lines 155-159 claims this doesn't happen ("a frame finalised by save_episode() is never re-added to the next episode") — but that's about re-adding to the current episode. The duplicate is between episodes: the last frame of episode N becomes the first frame buffered for episode N+1. Fix by adding continue or a guard after the save branch.

Fix this →

3. Doc example repo_id violates validation constraint
src/lerobot/rollout/context.py:349 enforces:

if not cfg.dataset.repo_id.startswith("rollout_"):
    raise ValueError(...)

But docs/source/inference.mdx:62 shows:

--dataset.repo_id=${HF_USER}/eval_data

If HF_USER=alice, the repo_id is alice/eval_data, which does not start with rollout_ and would fail validation. This appears in the Sentry doc example. Either the docs need updating to use rollout_ prefixed names, or the validation should extract the repo name after the / before checking the prefix.

Fix this →


Design Concerns

4. Global default constants changed — affects all users
src/lerobot/datasets/utils.py:74-75

DEFAULT_DATA_FILE_SIZE_IN_MB = 50   # was 100
DEFAULT_VIDEO_FILE_SIZE_IN_MB = 100  # was 200

These constants are used throughout the dataset system, not just by rollout. Halving the chunk sizes changes episode rotation behavior for existing pipelines, training scripts, and any code that depends on these defaults. This should be mentioned in the PR description or split into a separate change.

5. PEFT-via-record test removed with no rollout equivalent
tests/test_cli_peft.py — 82 lines deleted (test_peft_record_loads_policy). The PEFT loading path exists in context.py:195-203 but has no automated test. This is a test coverage gap for a supported workflow.

6. No integration tests for strategy control loops
Tests in test_rollout.py cover configs, ring buffer, state machine, and factories — but none exercise a strategy's run() method, even with mocked hardware. The complex control loops in sentry, highlight, and dagger (episode rotation, push timing, state transitions during recording) are only tested manually.

7. Dead code: _teleop_smooth_move_to
src/lerobot/rollout/strategies/dagger.py:173-194 — This helper is defined but never called. All call sites are commented out with TODO notes. Should be removed and added back when the TODO is resolved.


Minor Issues

8. logging.warning vs logger.warning inconsistency
The slow-loop warnings in base.py:77, sentry.py:164, highlight.py:196, dagger.py:512, dagger.py:656 all use logging.warning(...) (the module-level function) instead of logger.warning(...) (the per-module logger). This means these warnings bypass the module's log hierarchy and won't be filterable via lerobot.rollout.strategies.* log config. Every other log call in these files correctly uses logger.

9. context.py:282v is float identity check is fragile

observation_features_hw = {
    k: v for k, v in all_obs_features.items()
    if isinstance(v, tuple) or (v is float and k.endswith(".pos"))
}

v is float checks identity (is v literally the float type object?). This works because the feature system uses float as a sentinel, but v == float would be clearer and less surprising to future developers.


What Looks Good

  • Thread-safe robot wrapper (robot_wrapper.py) — Clean lock-protected I/O with read-only property proxying. Correct design for RTC concurrent access.
  • DAgger state machine (dagger.py:88-164) — Transition validation table with thread-safe consume/request pattern is robust. Well-tested in test_rollout.py.
  • Ring buffer memory accounting (ring_buffer.py) — Proper byte estimation for torch tensors, numpy arrays, and primitives. Eviction loop is correct.
  • RTC error handling (rtc.py) — Consecutive error counting with threshold, graceful global shutdown propagation, and compile warmup management are production-quality.
  • Action interpolator extraction — Clean decoupling from RTC module to utils/ with backward-compat re-export.
  • DatasetRecordConfig.stamp_repo_id() — Good fix from previous review: explicit method call instead of __post_init__ side effect. Resume path correctly skips stamping.
  • Config validation (configs.py:218-317) — Comprehensive fail-fast checks: strategy-specific dataset requirements, streaming encoding enforcement, device/task propagation. The DAgger num_episodes resolution from dataset config is a nice UX touch.
  • Inference engine ABC (base.py) — Minimal contract with sensible no-op defaults for notify_observation, pause, resume. Makes new backends easy to add.
  • Background push pattern — Consistent across sentry, highlight, and dagger: single-worker ThreadPoolExecutor with _episode_lock serialization. No push is silently dropped.

Summary

The architectural split is sound and the code quality is high. Three issues should be fixed before merge:

Priority Issue Location
Must-fix DAgger correction frames dropped by record_stride dagger.py:449, 624
Must-fix Highlight duplicate frame in ring buffer after save highlight.py:170-189
Should-fix Doc/validation mismatch on rollout_ prefix inference.mdx:62 / context.py:349
Should-discuss Global default constant changes (50/100 MB) datasets/utils.py:74-75

The remaining items (dead code, logging inconsistency, missing tests) are non-blocking but worth tracking.


@imstevenpmwork
Copy link
Copy Markdown
Collaborator Author

Claude's important review comments were addressed in commit: 4499519
All the rest are either not true or we can live with them

imstevenpmwork and others added 4 commits April 22, 2026 18:47
…olicy frequency instead of interpolator rate

Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
Co-authored-by: Pepijn <138571049+pkooij@users.noreply.github.com>
@github-actions github-actions Bot added the processor Issue related to processor label Apr 27, 2026
@Maximellerbach
Copy link
Copy Markdown
Member

Tested this PR with:

Both trained on this dataset: omx_multicubes_merged

Checked:

@imstevenpmwork
Copy link
Copy Markdown
Collaborator Author

@claude Review this PR

@claude
Copy link
Copy Markdown

claude Bot commented Apr 27, 2026

Claude finished @imstevenpmwork's task in 3m 53s —— View job


PR Review: lerobot-rollout CLI

  • Read all new rollout module files
  • Read modified files (record script, configs, datasets, policies)
  • Read tests and examples
  • Read docs changes
  • Cross-reference with previous review findings
  • Post comprehensive review

I read all 63 changed files in detail. This is a third-round review against commit 063645c, cross-referenced with the two prior Claude reviews and the fixes applied since.


Overall Assessment

The architecture is solid and well-executed. The Strategy + Factory + DI decomposition is clean, the InferenceEngine ABC provides backend-agnostic dispatch, and the RolloutContext bundle keeps shared state organized. The code consistently follows existing LeRobot patterns (draccus ChoiceRegistry, lazy imports, __all__ exports).

Many issues from previous reviews have been addressed. This review focuses on remaining and newly identified issues.


Previous Issues — Status

Issue Status
DatasetRecordConfig.__post_init__ timestamp breaks resume Fixed (stamp_repo_id() method)
DAgger dataset validation missing Fixed (added to needs_dataset)
Partial action dict silently drops keys Fixed (now raises ValueError at core.py:299-300)
Highlight duplicate frame in ring buffer Fixed (continue at highlight.py:175)
Doc/validation mismatch on rollout_ prefix Fixed (docs now use rollout_ prefixed names)
logging.warning vs logger.warning inconsistency Fixed (all strategy files now use logger)
DAgger correction frames dropped by record_stride Acknowledged — author decided to keep current behavior
Global default constants changed (50/100 MB) Still present — see below
Dead code _teleop_smooth_move_to Still present — minor, tracked by TODO
PEFT test removed with no rollout equivalent Still missing

Bugs & Correctness Issues

1. estimate_max_episode_seconds includes channel dimension in pixel count
src/lerobot/rollout/strategies/core.py:240-244

# (H, W, C)
# We want to extract the spatial dimensions.
if len(shape) == 3:
    pixels = shape[0] * shape[1] * shape[2]  # includes C

The comment says "extract the spatial dimensions" but the code includes the channel dimension. Bits-per-pixel (conservative_bpp = 0.1) is a per-spatial-pixel metric. Including channels (C=3) inflates the bitrate estimate by 3x, which makes the computed episode duration 3x shorter than intended. Episodes will rotate ~3x more frequently than the target_video_file_size_mb warrants.

Fix: use pixels = shape[0] * shape[1] for spatial-only dimensions.

Fix this →

2. Record script eval_ prefix check doesn't handle namespaced repo IDs
src/lerobot/scripts/lerobot_record.py:392

if cfg.dataset.repo_id.startswith("eval_"):

If repo_id = "user/eval_data", this check passes because the string starts with "user/", not "eval_". The equivalent check in rollout at context.py:353-354 correctly extracts the name part: repo_name = cfg.dataset.repo_id.split("/", 1)[-1]. The record check should be consistent:

repo_name = cfg.dataset.repo_id.split("/", 1)[-1]
if repo_name.startswith("eval_"):

Fix this →


Design Concerns

3. Global default constants halved — affects all existing users
src/lerobot/datasets/utils.py:74-75

DEFAULT_DATA_FILE_SIZE_IN_MB = 50   # was 100
DEFAULT_VIDEO_FILE_SIZE_IN_MB = 100  # was 200

These constants are used throughout the dataset system (not just rollout). Halving them changes episode rotation and file chunking behavior for existing pipelines, training scripts, and any code that depends on these defaults. Combined with issue 1 (3x shorter episodes from channel inclusion), sentry/dagger continuous recording may produce very small episodes. This change should be discussed separately or at minimum called out in the PR description.

4. v is float identity check in feature filtering
src/lerobot/rollout/context.py:286

if isinstance(v, tuple) or (v is float and k.endswith(".pos"))

v is float checks identity — whether v is literally the float type object. This works because the feature system uses float as a sentinel, but it's fragile and surprising. v == float is semantically clearer and less likely to confuse future developers.

5. DAgger record_stride on correction frames (known)
src/lerobot/rollout/strategies/dagger.py:462, 641 — When interpolation_multiplier > 1, human correction frames are subsampled. The author acknowledged this and chose to keep it. Just noting it's still present for future reference — a comment in the code explaining this intentional behavior would help.


Minor Issues

6. Dead code: _teleop_smooth_move_to
src/lerobot/rollout/strategies/dagger.py:182-203 — Defined but never called; all call sites are commented out with TODOs. Low priority, tracked by the TODO.

7. estimate_max_episode_seconds test uses (C, H, W) shape
tests/test_rollout.py:263

features = {"cam": {"dtype": "video", "shape": (3, 480, 640)}}

The function's comment says shapes are (H, W, C), but the test uses (3, 480, 640) which is (C, H, W). Since the function currently multiplies all three dimensions regardless of order, this doesn't affect the result — but it indicates either the comment or the test has the wrong shape convention.


What Looks Good

  • Clean architectural splitlerobot-record is now teleop-only, lerobot-rollout handles all policy deployment. The boundary is well-defined.
  • Thread-safe robot wrapper (robot_wrapper.py) — Lock-protected I/O with read-only property proxying. Correct for RTC concurrent access.
  • DAgger state machine (dagger.py:88-164) — Transition validation table with thread-safe consume/request is robust and well-tested.
  • Inference engine ABC (base.py) — Minimal contract with sensible no-op defaults for notify_observation, pause, resume. Makes new backends easy to add.
  • stamp_repo_id() fix — Clean separation: explicit method call at create time, skipped on resume. Good fix from prior review.
  • DatasetRecordConfig extraction — Shared between record and rollout, eliminates config duplication.
  • Action interpolator extraction — Clean decoupling from RTC module to utils/ with backward-compat re-export.
  • RTC error handling (rtc.py) — Consecutive error counting, graceful global shutdown propagation, compile warmup management. Production-quality.
  • Background push pattern — Consistent across sentry, highlight, dagger: single-worker ThreadPoolExecutor with _episode_lock serialization.
  • Observation processing throttling (core.py:72-93) — _process_observation_and_notify only runs the expensive pipeline on policy ticks, caching results for interpolated ticks.
  • Config validation (configs.py:224-319) — Comprehensive fail-fast checks including streaming enforcement per strategy, DAgger num_episodes resolution, device propagation.
  • Examplesso100_to_so100_EE/rollout.py and phone_to_so100/rollout.py are clean demonstrations of the programmatic API with custom kinematics processors.
  • Docsinference.mdx is thorough with clear examples for every strategy and backend combination.

Summary

This is a well-designed, significant architectural improvement. The separation of concerns is clean, the code is consistently documented, and hardware testing confirms no regressions.

Priority Issue Location
Should-fix Episode duration 3x shorter due to channel dim in pixel count core.py:243
Should-fix eval_ prefix check bypassed for namespaced repo IDs lerobot_record.py:392
Worth discussing Global default constants halved (50/100 MB) datasets/utils.py:74-75
Minor v is float identity check fragile context.py:286
Minor Test shape convention mismatch test_rollout.py:263

The remaining items are non-blocking. Overall this is in good shape for merge after addressing issues 1-2.


@imstevenpmwork imstevenpmwork merged commit ca87ccd into main Apr 27, 2026
21 checks passed
@imstevenpmwork imstevenpmwork deleted the feat/lerobot-rollout branch April 27, 2026 22:57
hcisbmm added a commit to hcisbmm/lerobot that referenced this pull request Apr 30, 2026
Resolve conflicts:
- groot_n1.py: take upstream's plain class annotations (the @DataClass
  decorator was removed upstream, so field() defaults no longer apply).
- lerobot_record.py: keep HEAD's full file. Upstream's rollout-decoupling
  refactor (huggingface#3413) removes policy support from lerobot_record entirely,
  but HEAD has multiple features built on policy-in-record (action
  interpolation, torque visualizer with policy mode, friction comp,
  record_effort). The new lerobot-rollout system from upstream is still
  added as a parallel path; existing lerobot_record features are preserved.
- uv.lock: regenerated.
@imstevenpmwork imstevenpmwork mentioned this pull request May 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

configuration Problems with configuration files or settings dataset Issues regarding data inputs, processing, or datasets documentation Improvements or fixes to the project’s docs examples Issues related to the examples policies Items related to robot policies processor Issue related to processor tests Problems with test coverage, failures, or improvements to testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants