Added poetry setup by aliberts · Pull Request #2 · huggingface/lerobot

aliberts · 2024-02-28T18:14:26Z

Added poetry setup

Merge from main

[WIP] Added MLPBC policy

Added poetry setup

Initial supports for ar teleop

…int (task huggingface#1)

Introduce packaging wiring for the upcoming web dashboard:

- New `dashboard` optional extra (FastAPI, uvicorn, aiortc, av, websockets, pydantic, pyserial, aiofiles, python-multipart, huggingface-hub) that also pulls in lerobot[core_scripts] for CLI tooling reuse.
- Include `dashboard` in the aggregate `all` extra.
- Register `lerobot-dashboard` console script -> lerobot.dashboard.cli:main.
- Create empty package skeleton (src/lerobot/dashboard/__init__.py and a placeholder cli.py) so the console script resolves. Full FastAPI app + uvicorn runner land in task huggingface#2.
- Regenerate uv.lock.

Build the dashboard server scaffolding:

- `lerobot.dashboard.app:create_app` FastAPI factory with lifespan, permissive localhost CORS, /api router mount, /ws router mount, and optional StaticFiles mount at `/` for the built frontend.
- `core.config.DashboardConfig` dataclass resolved from CLI args and round-tripped through env vars so uvicorn reload workers rehydrate it.
- `core.state.AppState` placeholder for registry / live device handles attached to `app.state.dashboard` (task huggingface#5 fills in the registry).
- `storage.paths` with $XDG/$HOME/env-override resolution.
- `api.health` returns status, version, python, uptime_seconds.
- `ws.router` reserves /ws/events (heartbeat loop) and /ws/teleop (stub; streaming-engineer owns the real protocol in task huggingface#11).
- `cli.main` argparse wrapper around `uvicorn.run(..., factory=True)` with --host/--port/--reload/--storage-dir/--static-dir/--cors-origin/--log-level flags.
- tests/dashboard/test_app.py: /api/health 200, CORS preflight, and /ws/events heartbeat smoke tests via FastAPI TestClient.

Team-coordination add-ons on top of tasks huggingface#1/huggingface#2:

- Add `ur-rtde>=1.5.7` to the dashboard extra (robotics-integrator Task huggingface#6 needs it for UR7e RTDE). Regenerate uv.lock.
- DashboardConfig gains `fake_devices` / `fake_policy` booleans driven by `LEROBOT_DASHBOARD_FAKE_DEVICES` / `LEROBOT_DASHBOARD_FAKE_POLICY` env vars (QA harness + future fake adapters). Values round-trip through export_to_env / from_env and are surfaced in /api/health.flags for the Playwright probe.
- services/registry_models.py: draft Pydantic v2 schema for RobotEntry / CameraEntry / TeleopEntry / RegistryDocument shared with robotics-integrator and frontend before Task huggingface#5 implementation lands. Tagged-union Connection (serial | network), open-ended robot_type str with KnownRobotType Literal for UI dropdown reuse.
- Test: verify fake flags propagate through create_app.

Pre-work scaffold for Task huggingface#9 while its blockers (Tasks huggingface#2 FastAPI, huggingface#5 registry, huggingface#6 adapter) land.

Validates:

- BGR ndarray -> PyAV VideoFrame conversion with monotonic pts
- LeRobotCameraTrack fallback to a black frame when the source fails
- RTCPeerConnection offer/answer handshake end-to-end
- Sender codec preference narrowing and getStats() reporting

Finding: aiortc 1.14.0 does not ship VP9. The effective codec order is H264 > VP8 (both H264 baseline profile-level-ids advertised). VP9 stays in CODEC_PREFERENCE so a future aiortc release is picked up automatically; today's negotiation falls back to H264.

Tests live under tests/dashboard/ and skip when aiortc is absent.
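The fallback in the finding above can be modelled without aiortc: order whatever the peer stack offers by the preference list, so a codec that is preferred but not offered simply drops out. This is a minimal sketch; `order_codecs` and the hard-coded `offered` list are illustrative assumptions, not aiortc API, and the VP9-first order of `CODEC_PREFERENCE` is inferred from the message rather than quoted from the code.

```python
# Plain-Python model of preference-ordered codec selection (illustrative only).
CODEC_PREFERENCE = ["VP9", "H264", "VP8"]  # assumed order, VP9 listed first


def order_codecs(preference, offered):
    """Return the offered codecs sorted by preference; unknown codecs go last."""
    rank = {name: i for i, name in enumerate(preference)}
    return sorted(offered, key=lambda name: rank.get(name, len(preference)))


# Assumed capability set for a build that does not ship VP9.
offered = ["VP8", "H264"]
ordered = order_codecs(CODEC_PREFERENCE, offered)
```

Because VP9 stays in the preference list, a build that starts offering it would sort it to the front with no code change.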

…uggingface#2)

The CLI was instantiating ``DashboardConfig()`` as the fallback source and then overwriting ``static_dir`` with ``None`` whenever ``--static-dir`` was absent. ``export_to_env`` compounded the issue by ``pop``-ing the env variable so external callers (systemd, docker compose) couldn't rely on env-driven configuration at all.

Fixes:

- ``_config_from_args`` now bases itself on ``DashboardConfig.from_env()`` so env-set values survive when the matching CLI flag is omitted; CLI values still take precedence when present. ``fake_devices`` / ``fake_policy`` are also propagated from the env base.
- ``export_to_env`` no longer pops ``LEROBOT_DASHBOARD_STATIC_DIR`` when ``static_dir`` is None. Only populated values are written.

Adds ``tests/dashboard/test_cli.py`` covering env-only, CLI-overrides-env, export-preserves-existing-env, and fake-flag propagation.

Reported by qa-engineer during E2E bring-up.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
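The precedence rule in this fix (environment as the base layer, CLI flags overriding only when given, export never clearing an existing variable) can be sketched as follows. `SketchConfig` and `config_from_args` are hypothetical single-field stand-ins for `DashboardConfig` and `_config_from_args`; only the env variable name comes from the message.

```python
import argparse
import os
from dataclasses import dataclass
from typing import Optional

ENV_STATIC_DIR = "LEROBOT_DASHBOARD_STATIC_DIR"  # name taken from the commit message


@dataclass
class SketchConfig:
    """Hypothetical stand-in for DashboardConfig with a single field."""

    static_dir: Optional[str] = None

    @classmethod
    def from_env(cls, env=os.environ):
        # An unset or empty variable means "not configured".
        return cls(static_dir=env.get(ENV_STATIC_DIR) or None)

    def export_to_env(self, env=os.environ):
        # Write only populated values; never pop an existing variable.
        if self.static_dir is not None:
            env[ENV_STATIC_DIR] = self.static_dir


def config_from_args(argv, env=os.environ):
    parser = argparse.ArgumentParser()
    parser.add_argument("--static-dir", default=None)
    args = parser.parse_args(argv)
    cfg = SketchConfig.from_env(env)  # the environment is the base layer
    if args.static_dir is not None:   # the CLI wins only when the flag is given
        cfg.static_dir = args.static_dir
    return cfg
```

Passing `env` explicitly keeps the sketch testable without mutating the real process environment.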

chaofiber

Review

Positives

Huge DX improvement — The old setup required cloning 3 repos, running setup.py develop on each, manually pip install-ing ~12 packages, and a separate Hydra fix. Poetry reduces this to poetry install. Meaningful reduction in onboarding friction.
Reproducible builds — The lockfile pins exact versions, which the previous workflow did not.
Clean README rewrite — Concise instructions with a useful troubleshooting tip for disk space issues.

Issues & Concerns

1. Dual `diffusion_policy` installation paths

The README instructs users to:

git clone https://github.com/real-stanford/diffusion_policy
cp -r diffusion_policy/diffusion_policy $(poetry env info -p)/lib/python3.10/site-packages/

But pyproject.toml also declares diffusion-policy as a git dependency. Two conflicting mechanisms for the same package — confusing and fragile. The cp hack is unversioned, invisible to the package manager, and hardcodes python3.10. Pick one approach and remove the other.

2. Dependency issues in `pyproject.toml`

tensordict pins to HEAD of main (will drift silently), while torchrl pins a specific commit (13bef426). Both should pin commits or track tagged releases for reproducibility.
mujoco-py = "^2.1.2.14" + mujoco = "^3.1.2": mujoco-py is the deprecated OpenAI binding, which builds against a separately installed MuJoCo 2.1 binary; mujoco is the new official package. Depending on both is contradictory and can pull in two MuJoCo runtimes with conflicting native-library requirements.
gym = "^0.26.2": By 2024, gymnasium had replaced gym as the maintained fork. gym is a source of deprecation warnings and API incompatibilities with modern RL tooling.
No optional/extra groups: All deps (MuJoCo, pygame, GPU-specific torch) are mandatory. Poetry supports dependency groups — users who only need dataset loading shouldn't be forced to install heavy simulation deps.

3. Python classifier is wrong

classifiers=[
    "Programming Language :: Python :: 3.8",
]

But python = "^3.10". The classifier claims 3.8 support that doesn't exist.
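A mismatch like this is easy to catch mechanically. The sketch below is a simplified check, not part of the PR: `caret_allows` and `stale_classifiers` are hypothetical helpers, and the caret handling covers only `MAJOR.MINOR` constraints with a non-zero major version.

```python
def caret_allows(constraint: str, version: str) -> bool:
    """Check a Poetry-style caret constraint such as "^3.10" against "3.8".

    Simplified: for a non-zero major, ^X.Y means >=X.Y and <(X+1).0.
    """
    if not constraint.startswith("^"):
        raise ValueError("only caret constraints are handled in this sketch")
    low = tuple(int(p) for p in constraint[1:].split(".")[:2])
    ver = tuple(int(p) for p in version.split(".")[:2])
    return low <= ver < (low[0] + 1, 0)


def stale_classifiers(constraint, classifiers):
    """Return the Python versions advertised by classifiers but not allowed."""
    prefix = "Programming Language :: Python :: "
    versions = [c[len(prefix):] for c in classifiers if c.startswith(prefix)]
    # Skip bare majors such as "3", which carry no minor version to compare.
    return [v for v in versions if "." in v and not caret_allows(constraint, v)]
```

Tuple comparison is used so that 3.10 sorts above 3.8, which a string or float comparison would get wrong.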

4. Minor: 3 trailing blank lines before `[build-system]` in `pyproject.toml`

Overall the core decision to adopt a package manager was clearly correct and this was a good stepping stone for the project.

…gingface#2)

The `XelaTactileConfig.use_calibrated` flag captures XCAL forces into the sensor's internal `_latest_cal` buffer but v1 never plumbs them through `async_read()` or emits an `observation.tactile.<name>.cal` sibling key. Setting `use_calibrated=True` therefore had no observable effect on the recorded dataset, but the README, spec, and config docstring all implied that it did.

Honest fix (option 3 from the review):

- Keep the field for forward compatibility with a planned v2.
- Emit a one-shot WARNING at `XelaTactileSensor.__init__` when set.
- Update the config docstring, spec ("v2 only - not yet shipped"), and bi_yam README to be explicit that v1 records raw uint16 only and ignores the flag.

Two new tests pin both branches: warning fires on True, silent on False. 48/48 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the v1 no-op warning approach with the actual implementation the README has been promising. PR #1 review item huggingface#2, option 1.

API additions on TactileSensor:

- `provides_calibrated: bool` property (default False); backends opt in.
- `async_read_calibrated() -> NDArray[float32]`: returns calibrated vector of the same shape as `async_read()`. Default raises NotImplementedError.

XelaTactileSensor:

- `provides_calibrated` returns `config.use_calibrated`.
- `async_read_calibrated()` mirrors async_read()'s wait-for-first-frame semantics, returns `_latest_cal.copy()` when present, or zeros plus a one-shot ERROR log when the server frame's `calibrated` field is null (XCAL files not installed). Keeps dataset schema consistent across the episode regardless of XCAL state.

MockTactileSensor:

- New `MockTactileConfig.provides_calibrated: bool = False` config field.
- When set, `async_read_calibrated()` emits a cosine waveform scaled to ~0.001, distinguishable from the sine raw signal so consumers can verify the calibrated path is wired correctly without hardware.

BiYamFollower:

- `_tactile_ft` adds `observation.tactile.<name>.cal` for sensors with `provides_calibrated=True` (same shape as the raw vector).
- `get_observation()` calls `sensor.async_read_calibrated()` and emits the .cal column when the sensor advertises it.

Tests (8 new):

- xela: provides_calibrated default False, returns XCAL floats when server delivers them, zero-fills + one-shot ERROR when XCAL missing, raises NotImplementedError when use_calibrated=False.
- mock: provides_calibrated default False, returns cosine when enabled.
- bi_yam: observation_features includes .cal when provided, get_observation emits .cal column, no .cal when not provided.

Docs (config docstring + bi_yam README + spec) re-written to describe the implemented behavior, including the XCAL-missing fallback semantics. 54/54 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
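The opt-in contract described above can be sketched with the standard library alone. The class and function names below are hypothetical reductions, plain lists stand in for `NDArray[float32]`, and the waveform details are assumptions chosen only to make the raw and calibrated paths distinguishable.

```python
import math
from abc import ABC, abstractmethod


class TactileSensorSketch(ABC):
    """Simplified stand-in for the TactileSensor ABC; lists replace NDArray."""

    @property
    def provides_calibrated(self) -> bool:
        return False  # backends opt in by overriding this property

    @abstractmethod
    def async_read(self) -> list:
        ...

    def async_read_calibrated(self) -> list:
        raise NotImplementedError("backend does not provide calibrated data")


class MockSensorSketch(TactileSensorSketch):
    def __init__(self, n_channels=48, provides_calibrated=False):
        self._n = n_channels
        self._cal = provides_calibrated

    @property
    def provides_calibrated(self) -> bool:
        return self._cal

    def async_read(self) -> list:
        return [math.sin(i) for i in range(self._n)]  # sine raw signal

    def async_read_calibrated(self) -> list:
        if not self._cal:
            raise NotImplementedError("calibrated output is disabled")
        return [0.001 * math.cos(i) for i in range(self._n)]  # small cosine


def build_observation(sensors):
    """Emit the raw key always and the .cal sibling only when advertised."""
    obs = {}
    for name, sensor in sensors.items():
        obs[f"observation.tactile.{name}"] = sensor.async_read()
        if sensor.provides_calibrated:
            obs[f"observation.tactile.{name}.cal"] = sensor.async_read_calibrated()
    return obs
```

Gating on `provides_calibrated` rather than catching `NotImplementedError` keeps the dataset schema decidable before any frame is read.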

…uggingface#2 follow-up)

The other reviewer observed that `_cal_missing_logged` is a per-sensor-lifetime guard, which silences the second outage when XCAL flaps: present → missing (logs once) → present → missing (silent). Reset the guard in `_on_message` whenever a non-null `calibrated` field arrives, so each missing stretch logs exactly one ERROR. Two-line fix.

Regression: tests/tactile/test_xela_sensor.py::test_async_read_calibrated_warning_rearms_after_xcal_recovers drives the present → missing → present → missing sequence and asserts exactly two ERROR records. 55/55 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
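The re-arming guard can be shown in isolation. This is a hypothetical reduction of the logic described above, not the real `XelaTactileSensor`: `CalGuardSketch`, its method names, and the log text are assumptions; only `_latest_cal` and `_cal_missing_logged` mirror names from the message.

```python
import logging

logger = logging.getLogger("xela_sketch")


class CalGuardSketch:
    """One ERROR per missing stretch, re-armed when calibrated data returns."""

    def __init__(self, n_channels=48):
        self._n = n_channels
        self._latest_cal = None
        self._cal_missing_logged = False

    def on_message(self, calibrated):
        self._latest_cal = calibrated
        if calibrated is not None:
            # Recovery re-arms the guard so the next outage is logged again.
            self._cal_missing_logged = False

    def read_calibrated(self):
        if self._latest_cal is not None:
            return list(self._latest_cal)
        if not self._cal_missing_logged:
            logger.error("calibrated data missing; returning zeros")
            self._cal_missing_logged = True
        return [0.0] * self._n
```

Without the reset in `on_message`, the flag would stay set for the sensor's lifetime and the second outage would be silent, which is the behavior the reviewer flagged.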

* docs: add XELA tactile integration design spec Brainstormed design for plumbing a single XR1944 tactile pad on the right follower's parallel gripper into BiYamFollower observations via a new src/lerobot/tactile/ subsystem (parallel to cameras/), with a WebSocket client backend for XELA Server v1.7.6. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: add XELA tactile integration implementation plan Sequenced TDD plan derived from the design spec. 13 tasks covering deps + import flag, the new src/lerobot/tactile/ subsystem (configs, ABC, factory), MockTactileSensor for CI, XelaTactileSensor with a WebSocket reader thread + last-good-frame failure mode, BiYamFollower integration, the run_xela_server.py operator wrapper, and docs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(tactile): add websocket-client dep and import-availability flag Add `_websocket_client_available` to import_utils for guarded XELA imports and append `websocket-client` to the `[yam]` optional-dependency extra. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tactile): add TactileSensorConfig and TactileSensor ABC New top-level subsystem `src/lerobot/tactile/` mirroring `cameras/`. Provides a draccus.ChoiceRegistry-based config base and an abstract `TactileSensor` class with the lifecycle (connect/disconnect), shape/dtype properties, and async_read() contract that backends must implement. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tactile): add make_tactile_sensors_from_configs factory Lazy-imports backends so optional deps (e.g., websocket-client for XELA) only fail at the point of use, not at module import. Dispatches on `cfg.type` for 'mock' and 'xela'. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tactile): add MockTactileSensor backend for CI Deterministic-per-seed sinusoidal frame generator with no I/O. 
Lets BiYamFollower integration tests run in CI without XELA hardware. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tactile): add XelaTactileConfig with model->shape registry Defaults to XR1944 (16 taxels x 3 axes = 48 channels) and exposes expected_shape inferred from the XELA model name. Known-model registry covers XR1944, XR1946, XR2244, XR1922. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tactile): add pure XELA WebSocket frame parser Pure functions parse_hex_csv, is_welcome, and parse_frame for the XELA Server v1.7.x JSON wire format (manual p. 37). Returns a frozen ParseResult with seq, timestamp, raw float32 and optional calibrated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tactile): add XelaTactileSensor WebSocket client backend Daemon thread holds a WebSocketApp open to xela_server, parses each frame into a (48,) float32 vector, and stores it in a single-slot buffer. async_read() returns the latest frame; on transport failures it returns the last-good-frame and reconnects with capped exponential backoff. Optional XCAL-calibrated forces and connect-time tare. Includes a `python -m lerobot.tactile.xela.xela_tactile` smoke-test CLI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(bi_yam): wire tactile sensors into BiYamFollower Add `tactile_sensors: dict[str, TactileSensorConfig]` to BiYamFollowerConfig (defaults to empty dict, opt-in via the same draccus dispatch as cameras). BiYamFollower constructs sensors via make_tactile_sensors_from_configs, connects/disconnects them alongside arms and cameras, exposes their shapes in observation_features, and emits `observation.tactile.<name>` keys in get_observation(). The wiring is non-intrusive: with no tactile_sensors configured (the default) all checks short-circuit and behavior is identical to before. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(bi_yam): add run_xela_server.py operator wrapper Operator launcher that wraps `/etc/xela/xela_server` with sensible defaults (--ip 127.0.0.1, --port 5000, --noros). Mirrors the lifecycle pattern of run_bimanual_yam_server.py: external process, leave running in its own terminal, LeRobot connects as a client. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(tactile): add XELA backend README and bi_yam tactile section XELA backend README documents supported models, the WebSocket wire protocol (manual p. 37), bring-up checklist, and failure-mode behavior. bi_yam_follower README gains a Tactile Sensor section with the per-boot slcan setup, per-session three-terminal layout, the recorded keys schema, and the <arm>_finger_<side> naming convention for adding more sensors later. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tactile): use long --ip flag for xela_server (short -i is ignored) Hardware smoke test surfaced that xela_server v1.7.6's AppImage CLI silently ignores the short `-i` form and falls back to its default of binding to the first NIC IP (e.g., 192.168.x.x). Switching to the long `--ip 127.0.0.1` form per the manual (p. 20) makes the daemon honor the local-only bind we want. Updates run_xela_server.py and the README/spec/plan invocations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tactile): default XELA host to 'auto' (resolves to LAN IP) Smoke test on real hardware revealed that xela_server v1.7.6 build 158509 silently ignores the --ip flag and always binds to the host's primary NIC IP (e.g., 192.168.x.x). Defaulting host to '127.0.0.1' was therefore incorrect — the connection would fail unless the user knew to pass the LAN IP explicitly. 
Fix: default `XelaTactileConfig.host` to "auto" and resolve at connect time using the same heuristic xela_server uses (UDP socket trick to discover the route to a public IP). Existing IPs/hostnames pass through unchanged. Also drop the misleading --ip pass-through from run_xela_server.py. Adds a thread-safety poll helper to the test suite to avoid a race between the test thread and the daemon reader thread now that _resolve_host runs in-thread. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tactile): log graceful WS close at INFO, not WARNING Hardware smoke test surfaced a noisy `WARNING: XELA WS closed (status=None, msg=None)` printed every time disconnect() runs cleanly. Demote that path to INFO so warnings stay reserved for unexpected transport drops (the reconnect-worthy case). Decision is keyed off `self._stop.is_set()`: if disconnect() flipped the stop flag before the close, it's our own teardown. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(tactile): add inspect_tactile_dataset.py utility Five modes for examining a recorded dataset: - summary — schema + per-axis (X, Y, Z) min/max/mean/std (no GUI, CI-friendly) - timeseries — line plot of all 16 taxels split by axis over the episode - heatmap — animated 4×4 force-magnitude grid (|Δ from baseline|) - frame — single-frame side-by-side: gripper camera + tactile heatmap - all — run all four in sequence Auto-discovers tactile keys from the dataset (`observation.tactile.*`) so it works for any sensor name and forward-compatibly with multi-sensor setups. Companion README documents the CLI and the dependency on matplotlib. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(tactile): sync all READMEs and spec to post-implementation reality Aligns documentation with three corrections discovered during hardware smoke testing: 1. 
`XelaTactileConfig.host` default is "auto" (not "127.0.0.1") — the xela_server AppImage v1.7.6 silently ignores --ip and always binds to the LAN IP, so the client must resolve to the same address. 2. `lerobot-record` and bring-up examples drop the misleading `host: "127.0.0.1"` field and `--ip` flag, with explanatory comments replacing each removed flag so future readers understand why. 3. Adds explicit "how to terminate xela_server" sections to both READMEs (kill $! / pkill -f / pkill -9) since `Ctrl+C` doesn't work for backgrounded processes. Plus cross-references to `examples/tactile/inspect_tactile_dataset.py` from both READMEs, and a "Post-implementation amendments" appendix at the end of the plan that captures the three corrections + the inspection utility addition with their commit shas, so the historical task code blocks remain truthful records of plan-time intent rather than being quietly rewritten. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(tactile): regenerate uv.lock with websocket-client Reconciles uv.lock with the websocket-client>=1.7,<2.0 dep added to the [yam] extra in 55ef5b1. `uv lock --check` now passes; future `uv sync --locked --extra yam` runs from a clean checkout will resolve to the same 355-package graph that was tested on this branch. Note: the lockfile grew from 1.9 MB → 87 MB because uv expands resolution-markers combinatorially across (Python version × platform × sys_platform × extra) for projects with many extras (lerobot has 14+). This is normal for this project — the size scales with extras count, not with the websocket-client addition. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tactile): pre-import backend configs for ChoiceRegistry registration XelaTactileConfig and MockTactileConfig only register with TactileSensorConfig's draccus ChoiceRegistry when their modules are imported, but the only previous import path was lazy (inside make_tactile_sensors_from_configs, which runs after CLI parsing). As a result, --robot.tactile_sensors='{...: {type: xela, ...}}' failed at parse time with "Couldn't find a choice class for 'xela'". Eagerly import both backend configs in lerobot.tactile.__init__ so the @register_subclass decorators run at package load. Sensor classes (which may pull in optional deps such as websocket-client) stay lazy via make_tactile_sensors_from_configs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(features): emit 1D non-image hardware features as vector observations hw_to_dataset_features classified ANY tuple-shaped feature as a camera and emitted it under f"{prefix}.images.{key}" with dtype "video". 1D vector observations (e.g., tactile arrays of shape (48,)) therefore got a mangled key and were routed to the image/video validator, which crashed at frame time trying to unpack the 1D shape into (c, h, w). - hw_to_dataset_features: split tuple-shaped features by rank — len(shape)==3 stays the camera/video path; len(shape)==1 becomes a vector feature emitted unchanged under its original key with dtype "float32". - build_dataset_frame: for 1D float32 features, prefer values[key] when the full key is present (pre-packed array, e.g., tactile) and fall back to scalar-gather-by-name otherwise (existing observation.state behavior). Verified end-to-end: joint state, camera, and tactile features all produce correct dataset specs and pass validate_frame on a fake observation dict. 
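The rank-based split described in the `fix(features)` message above can be sketched as follows. `classify_features` is a hypothetical reduction, not the real `hw_to_dataset_features`; the key layout and dtypes follow the message, and everything else is assumed.

```python
def classify_features(hw_features, prefix="observation"):
    """Route tuple-shaped features by rank: 3-D to video, 1-D to float32 vectors."""
    out = {}
    for key, shape in hw_features.items():
        if not isinstance(shape, tuple):
            continue  # scalar features take a different path, not modelled here
        if len(shape) == 3:
            # Camera path: re-keyed under the images namespace.
            out[f"{prefix}.images.{key}"] = {"dtype": "video", "shape": shape}
        elif len(shape) == 1:
            # Vector path: emitted unchanged under its original key.
            out[key] = {"dtype": "float32", "shape": shape}
        # Other ranks are left out of this sketch.
    return out
```

Keeping the original key for 1-D features is what lets a pre-packed array such as `observation.tactile.<name>` pass through without being mistaken for a camera.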
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(bi_yam): use --dataset.fps in lerobot-record examples lerobot-record exposes the FPS option as --dataset.fps; passing top-level --fps fails argparse with "unrecognized arguments". The README's lerobot-record and lerobot-record-with-depth examples used the broken --fps=30. Five blocks fixed (palm, RealSense, torque, depth, tactile). The four lerobot-teleoperate / lerobot-teleoperate-with-depth blocks (and the text recommendation around them) are left as --fps=15/30 — those scripts do expose --fps as a top-level flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(bi_yam): add tactile recording subsection to Step 3 Step 3 already documented camera, torque, and depth recording variants but omitted tactile, even though the dedicated Tactile Sensor section further down covered the standalone case. Adds a "With Tactile Sensor (XELA, optional)" subsection alongside the others, with: - A prerequisite blockquote pointing to the existing Tactile Sensor section for one-time setup, multi-sensor naming, calibrated forces, and how to stop xela_server. - A tactile-only command (sanity-check / touch-only policies). - A tactile + RealSense command (typical contact-rich manipulation), verified end-to-end against hcisbmm/bimanual-yam-tactile-vision-demo on the Hub. - A note on the recorded key shape/dtype with a cross-link back to the Tactile Sensor section for inspection tooling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(tactile): expand bi_yam + XELA READMEs with full tactile coverage Two rounds of tactile-doc work folded into one commit. bi_yam_follower README — tactile context integrated into every relevant section: - Overview lists the optional XELA pad alongside arms + CAN. - Hardware Setup gains an "Optional: XELA Tactile Sensor" subsection that flags the VScom dual-USB requirement and links to the new XELA backend "First-time setup". 
- Step 1 (server bring-up) gains an "Optional — start xela_server" callout. - Step 2 gains a new "Step 2.4: Test Tactile Sensor (optional)" (Login step bumped to 2.5). - Configuration Parameters documents robot.tactile_sensors. - Architecture diagram + Server Process Details add the XELA Server lane. - Troubleshooting gains a "Tactile Sensor Issues" subsection plus a warmup-FPS note in "Slow Control Loop". - References point to the XELA backend README, the inspect script, and the vendor manual. - "One-time setup (per boot)" renamed to "Per-boot setup" (the original heading was misleading — the steps run every boot) with a pointer to the backend README's truly-once install steps. XELA backend README (src/lerobot/tactile/xela/README.md) — vendor canon imported from the XELA Software Manual v1.7.6 and the vendor Notion setup guide: - New "First-time setup (one-time per machine)" with hardware notes (VScom + dual USB), can-utils install, /etc/xela directory creation with 777 perms, unpacking appimage.zip, PATH setup, and the interactive xela_conf y/Enter prompt. - "Bring-up checklist" renamed to "Bring-up checklist (per boot)" with a new "What a healthy server looks like" subsection covering the Server: ~95Hz / Users: 0->1 status output as a pre-flight signal. - New "Optional vendor tools" section documenting xela_viz (live GUI visualization) and xela_log (offline data logger), the two appimage.zip binaries previously undocumented. - Failure modes table split into LeRobot client side + Vendor binary side, with new rows from manual §"Common errors" (DSUB-9 loose, slCAN net not up, /etc/xela perms, CAN-USB compat checks, Ctrl+C unresponsive). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(tactile): add subsystem README + rectify cross-doc consistency Three issues found in a holistic review of the fork's tactile READMEs: 1. 
The src/lerobot/tactile/ directory had no parent README — anyone landing in that subdir from a code search hit only Python files. Adds an orienting README that documents the subsystem layout, the data contract (1-D float32 under observation.tactile.<name>, in contrast to images and bundled state), recorded-key naming conventions, and how to add a new sensor backend (config + sensor class + eager config import for ChoiceRegistry registration). 2. The bi_yam dedicated "Tactile Sensor" section's recording example used the multi-line YAML form of --robot.tactile_sensors, while Step 2.4 and Step 3 (in the same README) use the compact form that was end-to-end verified on hcisbmm/bimanual-yam-tactile-demo and hcisbmm/bimanual-yam-tactile-vision-demo. Harmonized to the compact form, and dropped the inline --ip-behavior comment in favor of a pointer to the XELA backend README which already owns that detail. 3. The bi_yam dedicated "Stopping xela_server" subsection duplicated the more comprehensive table in the XELA backend README (which covers kill $!, pkill -f, pkill -9, and fg-then-Ctrl+C). Replaced the duplicate with a one-line link to keep a single source of truth. 4. examples/tactile/README.md now cross-links to the XELA backend README's first-time setup (the install-time prerequisite that the inspect script's recording side depends on), and the dependencies note now names the matplotlib-dep extra explicitly. Upstream READMEs (root README.md, AGENT_GUIDE.md, docs/, docker/, benchmarks/, all 11 policy READMEs) were reviewed and found to have no references to fork-specific content — left untouched to avoid creating upstream merge friction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(tactile): document the (N,) → (rows, cols, 3) reshape contract The flat (N,) storage shape is a pipeline choice (feature_utils dispatches 3-tuples to the video-encoding bucket, which would fail-route a (4,4,3) tactile tensor through h264/hevc). 
The data still preserves the sensor's 2D taxel grid and per-taxel (X, Y, Z) force-axis structure — consumers recover it via flat.reshape(rows, cols, 3), with row-major order matching the XELA wire format ("top-left towards right, line-by-line", manual p. 36). Adds a "Recovering the 2D spatial layout" subsection to tactile/README.md covering: the exact reshape recipe, which downstream policy types benefit from it (CNN / ViT encoders) and which don't (state-MLP encoders), and why storage stays flat until LeRobot grows a multi-dim numeric feature bucket. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tactile): address PR #1 review — seq-reset, ordering, defensive logs Four fixes from the parallel-session PR review of PR #1: 1. **Reset _last_seq per connection in _reader_loop** (must-fix). The sequence-monotonicity guard accumulated across reconnects, so an operator-restarted xela_server (whose seq counter starts back at 0) would have all its frames silently dropped as "out-of-order", freezing async_read() at the last pre-disconnect frame indefinitely. Resetting per connection lets the new server's stream resume cleanly. New regression: tests/tactile/test_xela_sensor.py::test_seq_reset_on_reconnect. 2. **Move self._connected = True to AFTER thread.start()** in XelaTactileSensor.connect() (defensive). If thread spawn raised, the sensor would have lied to BiYamFollower's is_connected check while no reader was actually running. 3. **Warn on _resolve_host("auto") fallback to 127.0.0.1** (UX). The previous silent fallback meant an air-gapped operator would see a confusing TimeoutError("No XELA frame received…") 1 s later and blame xela_server. The new WARNING points at the real cause and suggests setting host explicitly. 4. **Warn on unsupported tuple shapes in hw_to_dataset_features** (defensive). The function silently drops tuple shapes that aren't length 1 or 3 — a future multi-dim numeric tactile feature (e.g. 
(rows, cols, 3)) would vanish from the dataset with no error. Now emits a clear logger.warning identifying the dropped keys and pointing at the right extension hook. Plus a new tests/tactile/test_xela_sensor.py::test_stale_frame_warning_after_idle that covers the existing >1s staleness warning path. 46/46 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(tactile): mark use_calibrated as v1 no-op (PR #1 review item huggingface#2) The `XelaTactileConfig.use_calibrated` flag captures XCAL forces into the sensor's internal `_latest_cal` buffer but v1 never plumbs them through `async_read()` or emits an `observation.tactile.<name>.cal` sibling key. Setting `use_calibrated=True` therefore had no observable effect on the recorded dataset — but the README, spec, and config docstring all implied that it did. Honest fix (option 3 from the review): - Keep the field for forward compatibility with a planned v2. - Emit a one-shot WARNING at `XelaTactileSensor.__init__` when set. - Update the config docstring, spec ("v2 only — not yet shipped"), and bi_yam README to be explicit that v1 records raw uint16 only and ignores the flag. Two new tests pin both branches: warning fires on True, silent on False. 48/48 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tactile): implement calibrated reading path (.cal sibling column) Replaces the v1 no-op warning approach with the actual implementation the README has been promising. PR #1 review item huggingface#2, option 1. API additions on TactileSensor: - `provides_calibrated: bool` property (default False) — backends opt in. - `async_read_calibrated() -> NDArray[float32]` — returns calibrated vector of the same shape as `async_read()`. Default raises NotImplementedError. XelaTactileSensor: - `provides_calibrated` returns `config.use_calibrated`. 
- `async_read_calibrated()` mirrors async_read()'s wait-for-first-frame semantics, returns `_latest_cal.copy()` when present, or zeros plus a one-shot ERROR log when the server frame's `calibrated` field is null (XCAL files not installed). Keeps dataset schema consistent across the episode regardless of XCAL state. MockTactileSensor: - New `MockTactileConfig.provides_calibrated: bool = False` config field. - When set, `async_read_calibrated()` emits a cosine waveform scaled to ~0.001 — distinguishable from the sine raw signal so consumers can verify the calibrated path is wired correctly without hardware. BiYamFollower: - `_tactile_ft` adds `observation.tactile.<name>.cal` for sensors with `provides_calibrated=True` (same shape as the raw vector). - `get_observation()` calls `sensor.async_read_calibrated()` and emits the .cal column when the sensor advertises it. Tests (8 new): - xela: provides_calibrated default False, returns XCAL floats when server delivers them, zero-fills + one-shot ERROR when XCAL missing, raises NotImplementedError when use_calibrated=False. - mock: provides_calibrated default False, returns cosine when enabled. - bi_yam: observation_features includes .cal when provided, get_observation emits .cal column, no .cal when not provided. Docs (config docstring + bi_yam README + spec) re-written to describe the implemented behavior, including the XCAL-missing fallback semantics. 54/54 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tactile): re-arm XCAL-missing warning on recovery (PR #1 review huggingface#2 follow-up) The other reviewer observed that `_cal_missing_logged` is a per-sensor- lifetime guard, which silences the second outage when XCAL flaps: present → missing (logs once) → present → missing (silent). Reset the guard in `_on_message` whenever a non-null `calibrated` field arrives, so each missing stretch logs exactly one ERROR. Two-line fix. 
Regression: tests/tactile/test_xela_sensor.py::test_async_read_calibrated_warning_rearms_after_xcal_recovers drives the present → missing → present → missing sequence and asserts exactly two ERROR records. 55/55 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style(tactile): satisfy ruff lint + format (PR #1 CI pre-commit)

PR #1's pre-commit CI flagged 19 ruff lint errors and 11 format mismatches across files this PR added. None affect behavior; all are style-only.

Lint fixes (in files this PR added):
- I001 (import sort) src/lerobot/tactile/__init__.py
- E402 (import after pytest.importorskip) tests/tactile/test_xela_sensor.py resolved with explicit `# noqa: E402` + a comment explaining the intentional ordering.
- E702 (semicolon-separated statements) split across lines in tests/tactile/test_mock.py and examples/tactile/inspect_tactile_dataset.py
- N806 (uppercase var) renamed `T` → `n_frames` in examples/tactile/inspect_tactile_dataset.py (and downstream uses)
- F541 (f-string without placeholders) fixed in same file
- SIM102 (nested ifs) collapsed in src/lerobot/tactile/xela/xela_tactile.py
- SIM105 (try/except/pass) replaced with `contextlib.suppress` in src/lerobot/tactile/xela/xela_tactile.py
- SIM108 (if/else assignment) collapsed to ternary in src/lerobot/tactile/xela/parser.py

Format fixes: ran `ruff format` on this PR's files. Pre-existing non-PR-touched files (e.g. run_bimanual_yam_server.py) left alone.

Verified: ruff check clean, ruff format --check clean, 55/55 tests pass. Pre-commit's other failure flagged in PR CI (Fast Tests: i2rt path-dep build failure) is a fork-infrastructure issue (CI doesn't init git submodules), pre-existing on main with 3 prior identical failures, and not related to this PR's code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style(tactile): pass full pre-commit suite (typos config + prettier reflow)

The previous lint commit fixed only `ruff check` issues with my local ruff 0.15.12. CI runs the full pinned pre-commit suite (ruff v0.14.1 plus typos, prettier, trailing-whitespace, end-of-file-fixer, bandit, mypy), and exposed three more failure paths in this PR's files:

1. **typos** flagged `NDArray` (numpy.typing) as a misspelling of `AND` because the parser sees `ND` as a standalone token. Adding `NDArray` to `[tool.typos] default.extend-ignore-identifiers-re` is the project-wide fix; it also helps any future PR that uses NDArray.
2. **typos** also flagged three pre-existing-but-touched typos this branch had to reformat: `mis-classified` → `misclassified`, `Recommand` → `Recommended`, `moniter` → `monitor`. Two were pre-existing in `bi_yam_follower/README.md`; my prior tactile section addition triggered prettier to reflow the whole file, which re-exposed them. Trivial spelling fixes.
3. **prettier (markdown)** + **trailing-whitespace** + **end-of-file** auto-reflowed the four READMEs (tactile/, tactile/xela/, examples/tactile/, bi_yam_follower/) plus the two long superpowers docs (spec + plan) and the bi_yam_tactile test file. Pure formatting, no behavioral change.

Pre-commit hooks now passing cleanly on this PR's files: trailing-whitespace, end-of-files, ruff-format, ruff (legacy alias), typos, prettier, pyupgrade, secret-detection, zizmor (14 hooks green).
Three CI hooks remain failing on PR #1, but their violations are exclusively in PRE-EXISTING files not touched by this PR (verified via `git diff --name-only origin/main..HEAD`):
- ruff: visualization_utils.py, mujoco_scene_builder.py, test_mujoco_torque_visualizer.py, lerobot_record_with_depth.py
- bandit: visualization_utils.py, mujoco_scene_builder.py
- mypy: camera_opencv.py

These are pre-existing issues on `main` itself (verified via `gh run list --branch main --workflow="Quality"` returning prior failures with the same signatures). Not in scope for this PR; appropriate to fix in a separate hygiene PR.

55/55 tactile + bi_yam tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(quality): scope pre-commit to PR diff (avoids inheriting main's debt)

The Quality workflow was running `pre-commit run --all-files` on every PR, which surfaced pre-existing lint/typing/security debt elsewhere in the repo (mujoco_scene_builder.py, visualization_utils.py, camera_opencv.py, etc.) on each PR's check rollup. This made the CI signal noisy: PRs that introduced no new debt were marked red because of issues they didn't touch.

Switch to `--from-ref ${pr.base.sha} --to-ref ${pr.head.sha}` on pull_request events so the hooks lint only the files changed in the PR. Direct pushes to main keep `--all-files` so main's quality bar stays unchanged and any inherited debt is still surfaced when someone commits directly to main.

Also bump checkout `fetch-depth: 0` so the full history is available locally for pre-commit's diff computation (default fetch-depth=1 only includes HEAD, which would make --from-ref unable to walk back to the base sha).

PR #1's previous CI red was entirely from this issue: every failing file was verified pre-existing on origin/main via `git log origin/main..HEAD -- <file>` returning empty.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(bi_yam): document resume mode (--resume + --dataset.root) in Step 3

Adds a "Resuming an existing dataset" subsection at the end of Step 3, covering the two non-obvious requirements that bit during a real session:
- --resume=true requires an explicit --dataset.root because LeRobotDataset.resume() refuses to write into the revision-safe Hub snapshot cache (which would corrupt the shared cache).
- --dataset.num_episodes in resume mode is the count of episodes added in THIS invocation, not the new total. lerobot_record.py initialises recorded_episodes=0 every run regardless of resume state.

The dedicated "Tactile Sensor (XELA, optional)" Per-session subsection now cross-links to this new resume subsection alongside the existing pointer to "With Tactile Sensor".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(tactile): document --root vs Hub cache for inspect after resume

A real session ran into "Instruction 'train' corresponds to no data!" when inspecting episodes that were freshly appended via resume: the inspect script defaults to the revision-safe Hub snapshot cache, which doesn't update until the user explicitly re-fetches. The local --dataset.root from the resume call holds the freshly-written episodes and is what should be inspected during/after the session.

bi_yam_follower README: "Resuming an existing dataset" subsection now includes an "Inspecting after resume - --root matters" callout with both the recommended (--root) command and the cache-refresh recipe (rm -rf ~/.cache/huggingface/lerobot/<repo>).

examples/tactile/README.md: adds a "When to use --root vs the default cache" section explaining why the cache can lag, what the symptom looks like ("Instruction 'train' corresponds to no data!"), and how to refresh it. Cross-links back to the bi_yam resume subsection for the full recording workflow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(tactile): revert --resume + --root recipe; recommend fresh repo_id

Real sessions surfaced two bugs in the native --resume=true + push pipeline that make the previously-recommended --root recipe unsafe in practice:

1. Video timestamp drift in newly-encoded chunks. When resume opens a new chunk file (e.g. videos/.../file-001.mp4) the encoder's PTS clock can be skewed by ~1 frame at the chunk boundary. LeRobotDataset's default tolerance_s=1e-4 is far tighter than that drift, so cold random-access reads (training data loaders, inspect_tactile_dataset.py's frame mode) raise FrameTimestampError on resumed-chunk frames.
2. Original episodes' data files can go missing on Hub. We observed the resume push upload only the new chunk's parquet+MP4s without preserving the previous chunk's data files. The metadata still claims those episodes exist, but their rows are unreadable: silent data loss.

Both bugs live in the resume + push pipeline upstream, not in this README. Until they're fixed, the safe recommendation is to record into a fresh --dataset.repo_id (e.g. -v2 suffix) and merge offline if a single combined dataset is needed.

Reverts in this commit:
- bi_yam_follower/README.md: replaces "Resuming an existing dataset" subsection (with --root recipe + cache-refresh + inspecting-after-resume callout) with a concise "Adding more episodes to an existing dataset" subsection that recommends a fresh --dataset.repo_id and documents the two known bugs as the rationale.
- bi_yam_follower/README.md: updates the cross-link in the dedicated Tactile "Per session" subsection to the new anchor.
- examples/tactile/README.md: strips the resume-specific framing from --root flag documentation.
Keeps --root in the Flags list as a generic "read from local root" option (still useful for non-resume cases like pre-publish datasets and local snapshots) but removes the "When to use --root vs the default cache" section that pitched it as the resume solution.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
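
The re-armed XCAL-missing warning described in the `fix(tactile)` commit above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the class name `CalibratedReader`, the plain-list payloads (the commit describes `NDArray[float32]`), and the log message are assumptions; only the `_latest_cal` buffer, the `_cal_missing_logged` guard, and the zero-fill plus one-ERROR-per-outage behavior come from the commit message.

```python
from __future__ import annotations

import logging

logger = logging.getLogger("tactile_sketch")  # hypothetical logger name


class CalibratedReader:
    """Hypothetical stand-in for the calibrated path of a tactile sensor."""

    def __init__(self, n_values: int) -> None:
        self._n_values = n_values
        self._latest_cal: list[float] | None = None
        self._cal_missing_logged = False  # one-shot guard for the ERROR log

    def on_message(self, calibrated: list[float] | None) -> None:
        """Store the newest frame; a non-null payload re-arms the guard."""
        self._latest_cal = calibrated
        if calibrated is not None:
            self._cal_missing_logged = False

    def read_calibrated(self) -> list[float]:
        """Return a copy of the calibrated values, or zeros when they are missing."""
        if self._latest_cal is not None:
            return list(self._latest_cal)
        if not self._cal_missing_logged:
            # Logged once per missing stretch, not once per sensor lifetime.
            logger.error("calibrated field is null; returning zeros")
            self._cal_missing_logged = True
        return [0.0] * self._n_values
```

Zero-filling rather than raising keeps the output shape constant, which matches the commit's stated goal of a consistent dataset schema across the episode.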

…resh

Three small frontend bugs surfaced during user testing:

#1 Inspector cards include action/observation.* even though those are read-only, which is pure clutter in the selection-edit surface. Filter to isEditable() only and show an empty-state hint when a selection would have no editable cards.

#2 Single-frame selections (k = 1) read "frames 269…269 (1 frame)", which is awkward and redundant. Switch to "frame 269" for k = 1, keep "frames N…M (K frames)" for k > 1.

#4 After a successful Save the row line plots kept rendering the pre-edit values because seriesCache held them. Watch pendingEdits for the > 0 → 0 transition (Save or Discard) and drop the cached series for the current dataset, then re-fetch + re-render for the current episode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
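
The labelling rule from bug #2 is small enough to state as code. The sketch below is in Python purely for illustration (the actual fix is in the frontend, whose code is not shown here), and the function name `selection_label` and its inclusive `first`/`last` parameters are hypothetical.

```python
def selection_label(first: int, last: int) -> str:
    """Format an inclusive frame selection the way the commit describes."""
    if last < first:
        raise ValueError(f"empty selection: first={first}, last={last}")
    count = last - first + 1
    if count == 1:
        # Single-frame selections drop the redundant range and count.
        return f"frame {first}"
    return f"frames {first}…{last} ({count} frames)"
```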

Two bugs combined to make the brand-new ``_tool3`` dataset unloadable:

1. ``lerobot_annotate.py:_push_to_hub`` uploads the annotated dataset folder but never creates a codebase-version tag, so ``api/datasets/<repo>/refs`` returns ``"tags": []``. Then ``LeRobotDatasetMetadata`` → ``get_safe_version`` → ``get_repo_versions`` returns empty and the loader raises ``RevisionNotFoundError``.

2. ``RevisionNotFoundError`` itself was unconstructible: its ``HfHubHTTPError.__init__`` indexes ``response.headers`` unconditionally on current ``huggingface_hub`` versions, so constructing it without a real ``Response`` blew up with ``AttributeError: 'NoneType' object has no attribute 'headers'``, masking the real "no tag" message.

Fix #1: after upload, read ``meta/info.json["codebase_version"]`` and call ``HfApi.create_tag(..., tag=<v3.x>, repo_type='dataset', exist_ok=True)`` so the dataset is loadable straight from the Hub on the next ``LeRobotDataset(repo_id)`` call. Falls back to the in-tree ``CODEBASE_VERSION`` if info.json is missing/malformed; on tag creation failure, prints the manual one-liner the user needs.

Fix #2: stop trying to instantiate ``RevisionNotFoundError`` (which inherits HfHubHTTPError) for what is really a config issue, not an HTTP failure. Raise plain ``RuntimeError`` with the same message, so the caller actually sees what's wrong instead of an upstream attribute error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
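
A rough sketch of Fix #1, assuming the dataset has already been uploaded. The helper names, the `FALLBACK_CODEBASE_VERSION` value, and the duck-typed `api` parameter are hypothetical; the only real API relied on is `huggingface_hub.HfApi.create_tag` with its `tag`, `repo_type`, and `exist_ok` keyword arguments. The printed manual one-liner on failure is left out.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any

# Hypothetical stand-in for the in-tree CODEBASE_VERSION constant.
FALLBACK_CODEBASE_VERSION = "v3.0"


def read_codebase_version(dataset_root: Path, fallback: str = FALLBACK_CODEBASE_VERSION) -> str:
    """Read meta/info.json["codebase_version"], falling back if missing or malformed."""
    info_path = dataset_root / "meta" / "info.json"
    try:
        version = json.loads(info_path.read_text())["codebase_version"]
    except (OSError, ValueError, KeyError, TypeError):
        return fallback
    return version if isinstance(version, str) and version else fallback


def tag_dataset(api: Any, repo_id: str, dataset_root: Path) -> str:
    """Create the codebase-version tag so the Hub loader can resolve a revision."""
    version = read_codebase_version(dataset_root)
    # `api` is expected to behave like huggingface_hub.HfApi.
    api.create_tag(repo_id, tag=version, repo_type="dataset", exist_ok=True)
    return version
```

Passing `exist_ok=True` makes the call safe to repeat when the same dataset is pushed more than once.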

…ests

* **#2: dedupe `_PLACEHOLDER_RE`.** The same regex was compiled in `recipe.py` and `language_render.py`. Promote to module-level `PLACEHOLDER_RE` in `recipe.py` (its primary owner, since it declares template syntax) and import from `language_render.py`.
* **#3: centralize language column names.** `io_utils.py` had hardcoded `{"language_persistent", "language_events"}` literals at two sites. Replace with `LANGUAGE_COLUMNS` import so a future column rename can't silently desync.
* **#4: defensive collate preserved-keys.** `lerobot_collate_fn` silently filtered language fields from samples that didn't have them, which would hand downstream consumers a preserved list shorter than the tensor batch. Now: if any sample carries a key, every sample in the batch must carry it; otherwise raise a `ValueError` so the upstream rendering bug surfaces at the boundary.
* **#5: `_scalar` rejects non-singleton lists.** Previously a zero- or multi-element list fell through and triggered confusing `float([])` errors downstream. Now raises `ValueError` with the actual length.
* **#6: refactor `_extract_complementary_data`.** Replace 11 lines of `key = {... if ... else {}}` plus an 11-line splat dict with a single `_COMPLEMENTARY_KEYS` tuple iterated once.
* **#7: document `EXTENDED_STYLES`.** Was an empty `set()` with no comment. Add a docstring explaining it's an intentional extension point: downstream modules append project-local styles before `column_for_style` is called.
* **#9: `tools.mdx` notes the runtime layer is future work.** The page referenced `src/lerobot/tools/`, `registry.py`, and `get_tools(meta)`, none of which exist in this PR. Added a callout at the start of "How to add your own tool" plus a note on the implementations paragraph.
* **#10: tests for YAML round-trip, malformed rows, blend validation.** `test_recipe.py` grew from 1 case to 12 covering: blend-or-messages exclusivity, target-turn requirement, blend emptiness, weight presence/positivity, nested-blend rejection, `from_dict` with nested blends, `from_yaml` / `load_recipe` agreement, top-level non-mapping rejection. Added a malformed-row test for `_normalize_rows` that asserts non-dict entries raise `TypeError`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
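
Items #4 and #5 in the list above both turn silent fall-through into an explicit `ValueError`. A minimal sketch of the two checks, assuming samples are plain dicts: the function names `collect_preserved` and `scalar`, the error wording, and the `float` return type are illustrative assumptions, not the PR's `lerobot_collate_fn` or `_scalar`.

```python
from __future__ import annotations

from typing import Any, Iterable


def collect_preserved(samples: list[dict[str, Any]], keys: Iterable[str]) -> dict[str, list[Any]]:
    """Gather non-tensor fields, requiring all-or-nothing presence per key."""
    preserved: dict[str, list[Any]] = {}
    for key in keys:
        present = [key in sample for sample in samples]
        if not any(present):
            continue  # no sample carries this key, so there is nothing to preserve
        if not all(present):
            missing = [i for i, has_key in enumerate(present) if not has_key]
            raise ValueError(f"key {key!r} missing from samples at batch indices {missing}")
        preserved[key] = [sample[key] for sample in samples]
    return preserved


def scalar(value: Any) -> float:
    """Unwrap a single-element list; reject empty or multi-element lists."""
    if isinstance(value, (list, tuple)):
        if len(value) != 1:
            raise ValueError(f"expected exactly one element, got {len(value)}")
        value = value[0]
    return float(value)
```

Raising at the batch boundary keeps each preserved list the same length as the batch, so a consumer indexing it by sample position cannot silently read the wrong row.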

Added poetry setup

d5bfead

aliberts requested a review from Cadene February 28, 2024 18:14

Cadene merged commit afb6b86 into main Feb 28, 2024

aliberts linked an issue Feb 29, 2024 that may be closed by this pull request

Missing access to git repository for dependency #1

Closed

aliberts deleted the install branch March 3, 2024 11:53

alexander-soare pushed a commit that referenced this pull request May 27, 2024

Merge pull request #2 from huggingface/main

23fbc19

Merge from main

RussTedrake pushed a commit to RussTedrake/lerobot that referenced this pull request Jul 8, 2024

Merge pull request huggingface#2 from notmahi/mahi/mlpbc

45c06b7

[WIP] Added MLPBC policy

menhguin pushed a commit to menhguin/lerobot that referenced this pull request Feb 9, 2025

Merge pull request huggingface#2 from Cadene/install

a7881d2

Added poetry setup

catebros added a commit to catebros/lerobot that referenced this pull request Oct 9, 2025

isolating problem huggingface#2

0ca67a3

sl628 pushed a commit to vincewu51/lerobot that referenced this pull request Oct 21, 2025

Merge pull request huggingface#2 from bingogome/ar_controller_teleop

91422b3

Initial supports for ar teleop

LaFeuilleMorte mentioned this pull request Apr 8, 2026

Inference with ACT with relative action not working well #3312

Open

3 tasks

Dev-Jahn added a commit to IISLAB-VLA/lerobot that referenced this pull request Apr 14, 2026

Merge dash-backend: CLI env-override fix (task huggingface#2 followup)

ec00d86

This was referenced Apr 14, 2026

Feat/decouple record script #3380

Closed

refactor(imports): enforce guard pattern #3382

Merged

chaofiber reviewed Apr 19, 2026

claude Bot mentioned this pull request Apr 20, 2026

Reward models refactor #3142

Merged

4 tasks

claude Bot mentioned this pull request May 6, 2026

Add extensive language support #3467

Open

5 tasks

Added poetry setup #2

Cadene merged 1 commit into main from install

aliberts commented Feb 28, 2024

chaofiber left a comment

3 participants

Review

Positives

Issues & Concerns

1. Dual `diffusion_policy` installation paths

2. Dependency issues in `pyproject.toml`

3. Python classifier is wrong

4. Minor: 3 trailing blank lines before `[build-system]` in `pyproject.toml`
