
feat(export): add modular policy export with ONNX and OpenVINO runtime#3493

Open
samet-akcay wants to merge 16 commits into huggingface:main from samet-akcay:feat/policy-export


samet-akcay commented May 1, 2026

TL;DR

This PR adds a modular export path for LeRobot policies. I limited the scope to two policy families (ACT and PI05) to demonstrate the concept; the design is intended to extend to the remaining policies and to additional backends.

ACT  -> ONNX package -> ONNX Runtime / OpenVINO runtime
PI05 -> ONNX package -> ONNX Runtime / OpenVINO runtime

The main point is the shape:

raw observation
-> exported runtime preprocessors
-> runner
-> backend runtime session(s)
-> exported runtime postprocessors
-> action
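
To make the shape above concrete, here is a minimal, dependency-free sketch of the stage ordering. It is not the code in this PR: `run_pipeline` and the toy stages are hypothetical names, and the real runner drives backend sessions rather than a plain function.

```python
from typing import Any, Callable, Dict, List

Observation = Dict[str, Any]
Processor = Callable[[Observation], Observation]


def run_pipeline(
    raw_observation: Observation,
    preprocessors: List[Processor],
    runner: Callable[[Observation], Observation],
    postprocessors: List[Processor],
) -> Observation:
    """Apply preprocessors, then the runner, then postprocessors, in order."""
    data = dict(raw_observation)  # shallow copy so the caller's dict is not mutated
    for pre in preprocessors:
        data = pre(data)
    outputs = runner(data)  # stand-in for "runner -> backend runtime session(s)"
    for post in postprocessors:
        outputs = post(outputs)
    return outputs


# Toy stages (hypothetical): scale the state down, "infer" by doubling, scale back up.
def scale_down(obs: Observation) -> Observation:
    return {**obs, "observation.state": [v / 10.0 for v in obs["observation.state"]]}


def toy_runner(obs: Observation) -> Observation:
    return {"action": [2.0 * v for v in obs["observation.state"]]}


def scale_up(out: Observation) -> Observation:
    return {"action": [v * 10.0 for v in out["action"]]}


result = run_pipeline({"observation.state": [1.0, 2.0]}, [scale_down], toy_runner, [scale_up])
```

The point of the sketch is only the ordering: everything before and after the runner is a plain callable, which is what lets the exported path avoid a torch dependency.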

User Examples

from lerobot.policies.act.modeling_act import ACTPolicy
from lerobot.export import load_exported_policy

policy = ACTPolicy.from_pretrained("lerobot/act_aloha_sim_transfer_cube_human")
policy.config.stats = load_or_attach_checkpoint_stats(policy)  # required for normalized export

policy.export("./act_export", backend="onnx", example_batch=example_batch)

exported = load_exported_policy("./act_export", backend="onnx", device="cpu")
action = exported.select_action(raw_observation)
chunk = exported.predict_action_chunk(raw_observation)

OpenVINO uses the same exported package contract:

exported = load_exported_policy("./act_export", backend="openvino", device="CPU")
action = exported.select_action(raw_observation)

PI05 uses the same runtime API, but the policy family is different:

from lerobot.policies.pi05.modeling_pi05 import PI05Policy
from lerobot.export import export_policy, load_exported_policy

policy = PI05Policy.from_pretrained("lerobot/pi05_base", strict=False)

export_policy(
    policy,
    "./pi05_export",
    backend="onnx",
    example_batch=example_batch_with_language_tokens,
    include_normalization=False,
)

exported = load_exported_policy("./pi05_export", backend="onnx")
action_chunk = exported.predict_action_chunk(
    {
        "observation.images.base_0_rgb": image,
        "observation.images.left_wrist_0_rgb": left_wrist_image,
        "observation.images.right_wrist_0_rgb": right_wrist_image,
        "observation.state": state,
        "task": "pick up the red block",
    },
    num_steps=3,
)

PI05 example contract:

export-time example_batch:
  includes language tokens/masks because tracing needs tensor inputs

runtime observation:
  may include raw task string
  tokenize processor emits language tokens/masks inside exported runtime

Package Shape

export_dir/
  manifest.json
  artifacts/
    model.onnx                 # ACT
    encoder.onnx, denoise.onnx # PI05
  stats.safetensors            # when normalization is included
  tokenizer/                   # when tokenizer runtime processor is included

Example executable manifest processors:

{"type": "normalize", "features": ["observation.state"], "artifact": "stats.safetensors"}
{"type": "tokenize", "artifact": "tokenizer"}
{"type": "denormalize", "features": ["action"], "artifact": "stats.safetensors"}
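
As a rough illustration of what "executable" means for these specs, the sketch below turns JSON spec lines into callables through a type-keyed registry. It assumes mean/std normalization and an in-memory `STATS` dict in place of `stats.safetensors`; `BUILDERS` and `build_pipeline` are hypothetical names, not the PR's API.

```python
import json
from typing import Any, Callable, Dict, List

Data = Dict[str, Any]

# Hypothetical stats; in the PR these would be read from stats.safetensors.
STATS = {
    "observation.state": {"mean": 1.0, "std": 2.0},
    "action": {"mean": 0.5, "std": 4.0},
}


def make_normalize(spec: Dict[str, Any]) -> Callable[[Data], Data]:
    def apply(data: Data) -> Data:
        out = dict(data)
        for name in spec["features"]:
            s = STATS[name]
            out[name] = [(v - s["mean"]) / s["std"] for v in data[name]]
        return out
    return apply


def make_denormalize(spec: Dict[str, Any]) -> Callable[[Data], Data]:
    def apply(data: Data) -> Data:
        out = dict(data)
        for name in spec["features"]:
            s = STATS[name]
            out[name] = [v * s["std"] + s["mean"] for v in data[name]]
        return out
    return apply


# Registry keyed by the spec's "type" field.
BUILDERS = {"normalize": make_normalize, "denormalize": make_denormalize}


def build_pipeline(spec_lines: List[str]) -> List[Callable[[Data], Data]]:
    """Parse JSON specs and fail fast on unknown processor types."""
    pipeline = []
    for line in spec_lines:
        spec = json.loads(line)
        if spec["type"] not in BUILDERS:
            raise ValueError(f"unknown processor type: {spec['type']!r}")
        pipeline.append(BUILDERS[spec["type"]](spec))
    return pipeline


pre = build_pipeline(['{"type": "normalize", "features": ["observation.state"]}'])
post = build_pipeline(['{"type": "denormalize", "features": ["action"]}'])
normalized = pre[0]({"observation.state": [3.0, 5.0]})
denormalized = post[0]({"action": [0.25]})
```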

Internal Shape

Backend.open(...) -> _RuntimeSession
Runner.run(...)   -> orchestrates runtime stages
processors        -> torch-free exported pre/post path
manifest          -> package contract

policy decides:
  runner type
  export modules/stages
  processor specs
  assets/stats

backend decides:
  artifact serialization
  runtime session loading
  named stage execution

No extra adapter stack is needed between LeRobot and runtime backends.

Adding A Backend

Pseudo-code for TensorRT:

class _TensorRTRuntimeSession(_RuntimeSession):
    def __init__(self, artifacts_dir, manifest, device):
        # load_engines is a placeholder: one loaded engine per named stage.
        self.engines = load_engines(artifacts_dir, manifest, device)

    def run(self, stage_name, inputs):
        # Execute the engine registered under this stage name.
        return self.engines[stage_name].execute(inputs)


class TensorRTBackend(Backend):
    name = "tensorrt"

    def export(self, module, output_path):
        # Go through ONNX first, then build the engine from it.
        onnx_path = export_stage_to_onnx(module)
        engine_path = build_tensorrt_engine(onnx_path, output_path)
        return engine_path.name  # return the artifact's file name

    def open(self, artifacts_dir, manifest, device):
        return _TensorRTRuntimeSession(artifacts_dir, manifest, device)


register_backend("tensorrt", TensorRTBackend())

Expected extension rule:

new backend = serialize stages + open runtime session + run named stage
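
The rule above can be exercised without any real runtime. The following is a toy sketch under stated assumptions: `ToyBackend`, `ToySession`, and this `register_backend`/`get_backend` pair are illustrative stand-ins with simplified signatures, and "serialization" is just storing a Python callable in memory.

```python
from typing import Any, Callable, Dict

Stage = Callable[[Dict[str, Any]], Dict[str, Any]]

_BACKENDS: Dict[str, "ToyBackend"] = {}


def register_backend(name: str, backend: "ToyBackend") -> None:
    """Register a backend instance under a unique name."""
    if name in _BACKENDS:
        raise ValueError(f"backend {name!r} is already registered")
    _BACKENDS[name] = backend


def get_backend(name: str) -> "ToyBackend":
    return _BACKENDS[name]


class ToySession:
    """Runtime session: runs one named stage at a time."""

    def __init__(self, stages: Dict[str, Stage]) -> None:
        self.stages = stages

    def run(self, stage_name: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
        return self.stages[stage_name](inputs)


class ToyBackend:
    name = "toy"

    def __init__(self) -> None:
        self._store: Dict[str, Stage] = {}

    def export(self, stage_name: str, stage: Stage) -> str:
        # "Serialize" the stage by keeping it in memory; return an artifact name.
        self._store[stage_name] = stage
        return f"{stage_name}.toy"

    def open(self) -> ToySession:
        # Open a session over a snapshot of everything exported so far.
        return ToySession(dict(self._store))


register_backend("toy", ToyBackend())
artifact = get_backend("toy").export("model", lambda inputs: {"action": inputs["x"] * 2})
session = get_backend("toy").open()
output = session.run("model", {"x": 21})
```

Runners only ever call `session.run(stage_name, inputs)`, so a backend that satisfies those three responsibilities should not need runner-side changes.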

This should also fit future backends such as:

ExecuTorch
TensorRT
CoreML
TVM
other runtime targets

Adding A Policy

If the policy fits an existing runner:

class MyPolicy(PreTrainedPolicy):
    def get_inference_type(self):
        return "single_pass"

    def get_export_modules(self):
        return {"model": MyInferenceWrapper(self)}

    def prepare_inputs(self, example_batch):
        return {
            "model": ExportInputs(
                tensors=(example_batch["observation.state"],),
                input_names=["observation.state"],
                output_names=["action"],
            )
        }

    def export_processor_specs(self, include_normalization, stats_artifact, assets=None):
        return [normalize_spec(...), tokenize_spec(...)], [denormalize_spec(...)]

If it needs a new runtime pattern:

class MyRunner:
    type = "my_runner"

    def run(self, session, observation, **kwargs):
        encoded = session.run("encoder", observation)
        decoded = session.run("decoder", encoded)
        return decoded["action"]

Expected extension rule:

new policy = export modules + choose/add runner + emit executable processors + bundle assets
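
One way to picture the "choose/add runner" step is a registry keyed by the string that `get_inference_type()` returns. The sketch below is an assumption about the mechanism, not the PR's implementation: `RUNNERS`, `register_runner`, `select_runner`, and the toy classes are hypothetical.

```python
from typing import Any, Dict

RUNNERS: Dict[str, type] = {}


def register_runner(cls: type) -> type:
    """Class decorator: index the runner by its `type` attribute."""
    RUNNERS[cls.type] = cls
    return cls


@register_runner
class ToySinglePassRunner:
    type = "single_pass"

    def run(self, session: Any, observation: Dict[str, Any], **kwargs: Any) -> Any:
        return session.run("model", observation)["action"]


@register_runner
class ToyTwoStageRunner:
    type = "my_runner"

    def run(self, session: Any, observation: Dict[str, Any], **kwargs: Any) -> Any:
        encoded = session.run("encoder", observation)
        decoded = session.run("decoder", encoded)
        return decoded["action"]


def select_runner(policy: Any) -> Any:
    """Look up the runner for a policy and fail fast if none is registered."""
    kind = policy.get_inference_type()
    if kind not in RUNNERS:
        raise ValueError(f"no runner registered for inference type {kind!r}")
    return RUNNERS[kind]()


class ToyPolicy:
    def get_inference_type(self) -> str:
        return "my_runner"


class ToySession:
    """Fake session with two arithmetic stages."""

    def run(self, stage_name: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
        if stage_name == "encoder":
            return {"latent": inputs["observation.state"] + 1.0}
        if stage_name == "decoder":
            return {"action": inputs["latent"] * 2.0}
        raise KeyError(stage_name)


runner = select_runner(ToyPolicy())
action = runner.run(ToySession(), {"observation.state": 1.0})
```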

Why ACT + PI05

ACT proves the feedforward/action-chunk path:

ACT -> single_pass -> normalize/denormalize -> action chunk

PI05 proves the VLA/KV-cache/tokenized path:

PI05 -> kv_cache_flow -> tokenize/task handling -> encoder stage -> denoise stage -> action chunk
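
For readers unfamiliar with the denoise stage, the sketch below shows a plain Euler integration loop of the kind a flow-matching runner executes. It assumes time runs from 1 (noise) to 0 (action) with a fixed step of `-1 / num_steps`; `euler_denoise` and `toy_velocity` are hypothetical, and the real denoise stage is a model call that also consumes the encoder's cached keys and values.

```python
from typing import Callable, List

Vector = List[float]


def euler_denoise(
    noise: Vector,
    velocity_fn: Callable[[Vector, float], Vector],
    num_steps: int,
) -> Vector:
    """Integrate x from t=1 to t=0 with fixed-size Euler steps."""
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    dt = -1.0 / num_steps
    x = list(noise)
    t = 1.0
    for _ in range(num_steps):
        v_t = velocity_fn(x, t)  # stand-in for session.run("denoise", ...)
        x = [xi + dt * vi for xi, vi in zip(x, v_t)]
        t += dt
    return x


# Toy velocity field for a straight-line path from noise to target:
# constant velocity (noise - target), so Euler integration is exact.
target = [0.5, -1.0]
noise = [2.0, 3.0]


def toy_velocity(x: Vector, t: float) -> Vector:
    return [n - a for n, a in zip(noise, target)]


chunk = euler_denoise(noise, toy_velocity, num_steps=3)
```

With a learned velocity field the result depends on `num_steps`, and small per-step numerical differences between runtimes can compound across the loop.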

Runtime processor types implemented in this PR:

normalize
denormalize
relative_actions
absolute_actions
pi05_prepare_state
tokenize

Validation

Commands used locally:

uv run pytest tests/export/test_runtime_processors.py -q
uv run pytest tests/export/test_export_act.py -q
uv run pytest tests/export/test_export_pi05.py -q
uv run pytest tests/export/test_processor_specs.py tests/export/test_runtime_processors.py tests/export/test_manifest.py tests/export/test_failfast_errors.py tests/export/test_runner_registry.py -q
uv run pre-commit run --all-files

Additional local ignored validation script:

uv run python examples/policy_export/_export_demo/validate_policy_export.py
uv run python examples/policy_export/_export_demo/validate_policy_export.py --run-pi05 --skip-openvino

Observed ACT real-checkpoint validation:

ACT PyTorch vs ONNX Runtime:
  max_abs ~= 2.24e-4
  mean_abs ~= 1.17e-5

ACT ONNX Runtime vs OpenVINO:
  max_abs ~= 5.36e-7
  mean_abs ~= 7.35e-8

Observed real PI05 base validation:

real checkpoint: lerobot/pi05_base
runtime input: raw task string -> exported tokenize processor
stage-wise ONNX parity:
  encoder past_key/value max_abs ~= 1e-5 to 1e-4
  denoise v_t max_abs ~= 3e-6

e2e denoise-loop diagnostic:
  larger accumulated drift is observed on the real 10-step-size checkpoint path
  this is reported as a diagnostic, not hidden as strict parity

The committed tests still use synthetic compact PI05 configs so that parity checks stay portable in CI. The local script can validate the real checkpoint path, but it is intentionally ignored because the checkpoint is multi-GB and the run is slow.
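
For reference, the `max_abs` and `mean_abs` figures reported above are element-wise absolute-difference statistics. A minimal sketch of how such metrics can be computed on flattened outputs follows; `parity_metrics` is a hypothetical helper, not the function used in the test suite.

```python
from typing import Sequence, Tuple


def parity_metrics(reference: Sequence[float], candidate: Sequence[float]) -> Tuple[float, float]:
    """Return (max_abs, mean_abs) of the element-wise difference."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    if not reference:
        raise ValueError("outputs must not be empty")
    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    return max(diffs), sum(diffs) / len(diffs)


# Toy values, not measurements from this PR.
max_abs, mean_abs = parity_metrics([0.0, 1.0, 2.0], [0.0, 1.5, 1.75])
```

Reporting both numbers is useful because `max_abs` exposes a single outlier element while `mean_abs` shows whether the drift is spread across the whole output.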

Scope And Limits

Included:

ONNX export
OpenVINO runtime loading
ACT support
PI05 support
executable exported runtime processors
manifest/runtime validation

Not included:

native OpenVINO IR serialization
TensorRT backend
ExecuTorch backend
full policy-zoo coverage
every eager LeRobot processor

Known constraints:

export-time still uses PyTorch
exported runtime inference is the torch-free path
OpenVINO currently consumes ONNX artifacts
real PI05 e2e ONNX drift is larger than stage-wise drift due to accumulated denoise-loop differences

Relationship to lerobot-rollout

The new lerobot-rollout CLI (#3413) standardizes the deployment loop and currently loads policies as PreTrainedPolicy (torch nn.Module) instances. This PR is orthogonal: it adds a torch-free runtime path so that policies can be deployed on edge targets that cannot ship PyTorch.

The two features compose well, but plugging an ExportedPolicy into lerobot-rollout requires a small integration. I'd like maintainer input on the preferred shape before implementing it.

A few options would be:

  1. PolicyLike protocol in lerobot.rollout instead of PreTrainedPolicy.
  2. Wrapper-only: subclass PreTrainedPolicy in a torch-using bridge module.
  3. Dedicated InferenceEngine for exported policies, selectable via --inference.type=exported_*.
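
To help weigh option 1, here is a rough sketch of what a structural `PolicyLike` protocol could look like. The method set is an assumption based on the runtime API shown in the user examples; `FakeExportedPolicy` and `rollout_step` are hypothetical, and nothing here reflects the actual lerobot-rollout code.

```python
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class PolicyLike(Protocol):
    """Anything the rollout loop can ask for actions, torch-based or not."""

    def select_action(self, observation: Dict[str, Any]) -> Any: ...

    def predict_action_chunk(self, observation: Dict[str, Any], **kwargs: Any) -> Any: ...


class FakeExportedPolicy:
    """Stand-in for an exported policy; does not inherit from any torch class."""

    def select_action(self, observation: Dict[str, Any]) -> Any:
        return [0.0]

    def predict_action_chunk(self, observation: Dict[str, Any], **kwargs: Any) -> Any:
        return [[0.0]]


def rollout_step(policy: PolicyLike, observation: Dict[str, Any]) -> Any:
    # The loop depends only on the protocol, not on a concrete base class.
    return policy.select_action(observation)


policy = FakeExportedPolicy()
conforms = isinstance(policy, PolicyLike)  # checks method presence only
action = rollout_step(policy, {"observation.state": [0.0]})
```

Note that `runtime_checkable` only verifies that the methods exist, not their signatures, so a static type checker would still carry most of the guarantee.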

Happy to land it as a separate small refactor PR first if that's preferred.
This is out of scope for this PR either way, so I'm flagging now so we can align on direction.

samet-akcay and others added 14 commits April 30, 2026 16:07
…ends

Adds the lerobot.export package with the Exporter orchestrator, ONNX and OpenVINO backends, action_chunking and kv_cache runners, processor specs, and manifest schema. Includes fail-fast validation for missing normalization stats, tokenizer assets, and unknown OpenVINO devices.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Wires ACT through the action_chunking runner. Numerical parity verified at rtol=1e-5 / atol=1e-5 against PyTorch eager on both ONNX Runtime and OpenVINO.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
…NX, 0.08 OpenVINO)

Wires PI05 through the kv_cache runner. Includes a fp64→fp32 fix to create_sinusoidal_pos_embedding to eliminate Erf(double) drift in the chained Euler denoise loop. Numerical parity: rtol=0.02 / atol=0.006 on ONNX Runtime; rtol=0.08 / atol=0.08 on OpenVINO (cross-runtime IR optimization gap).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Adds docs/design/policy-export.md (design RFC) and examples/policy_export/walkthrough.ipynb (end-to-end reproducible walkthrough for ACT and PI05 across ONNX Runtime and OpenVINO).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Disambiguates the runner from generic KV-cache attention by encoding both
load-bearing properties of PI05-style policies it serves: a prefix encoder
with cached KV attention plus flow-matching Euler integration in the
denoise loop. Leaves room for a future sibling runner (e.g. fused
flow-matching for GR00T-style policies with no prefix encoder) without
overloading the existing name.

- src/lerobot/export/runners/kv_cache.py -> kv_cache_flow.py
- KVCacheRunner -> KVCacheFlowRunner
- KVCacheExportConfig -> KVCacheFlowExportConfig
- manifest runner type 'kv_cache' -> 'kv_cache_flow'
- PI05 get_inference_type() returns 'kv_cache_flow'
The processor-spec refactor moved export_assets, export_stats, and
export_processor_specs onto PreTrainedPolicy, which the toy backend
test's ToyPolicy(nn.Module) does not inherit. Add inline no-op stubs
so the 'register a runner+backend without core edits' test exercises
the public extension surface again.
Copilot AI review requested due to automatic review settings May 1, 2026 11:13
github-actions bot added the policies (Items related to robot policies) and tests (Problems with test coverage, failures, or improvements to testing) labels May 1, 2026

Copilot AI left a comment


Pull request overview

Adds a new modular “policy_package” export + torch-free runtime execution path for LeRobot policies, with ONNX serialization and OpenVINO runtime loading, and validates the contract via a comprehensive export test suite.

Changes:

  • Introduces an export subsystem (manifest schema, backends, runners, policy runtime wrapper) and integrates it into PreTrainedPolicy.
  • Implements executable runtime processors (normalize/denormalize, relative/absolute actions, PI05 state prep, tokenize) and runner patterns (single_pass, kv_cache_flow) for ACT and PI05.
  • Adds extensive tests for manifest stability/schema, runner/backend registries, processors, and numerical parity (where deps are available).

Reviewed changes

Copilot reviewed 36 out of 38 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `tests/export/test_runtime_processors.py` | Tests that exported runtime processor specs execute (normalize/denormalize, relative/absolute, tokenize). |
| `tests/export/test_runner_registry.py` | Validates runner selection/registry behavior and runner export invariants. |
| `tests/export/test_processor_specs.py` | Checks processor spec construction + JSON roundtrips (incl. PI05 + normalization). |
| `tests/export/test_manifest_stability.py` | Ensures manifest output is stable across re-exports (excluding created_at). |
| `tests/export/test_manifest.py` | Adds schema + Normalizer behavior tests and fixture roundtrip validation. |
| `tests/export/test_failfast_errors.py` | Verifies fail-fast error messages for missing stats and manifest parse errors. |
| `tests/export/test_export_pi05.py` | End-to-end PI05 export/runtime parity and stage-wise accuracy tests (optional deps). |
| `tests/export/test_export_act.py` | End-to-end ACT export/runtime parity tests for ONNX/OpenVINO (optional deps). |
| `tests/export/test_backend_registry.py` | Verifies backend/runner registries, plugin guarantees, and toy backend/runner integration. |
| `tests/export/fixtures/manifest_act_converged.json` | Adds a converged ACT manifest fixture for stability/roundtrip checks. |
| `tests/export/conftest.py` | Shared fixtures/utilities for export tests (policy factories, parity helpers, tokenizer cache loader). |
| `tests/export/__init__.py` | Marks tests.export as a package for shared utilities. |
| `src/lerobot/policies/pretrained.py` | Adds export/to_onnx/to_openvino/from_exported APIs plus default stats/assets/processor-spec hooks. |
| `src/lerobot/policies/pi05/modeling_pi05.py` | Adds PI05 exportable modules, export protocol implementation, and tokenizer asset bundling. |
| `src/lerobot/policies/act/modeling_act.py` | Adds ACT export wrapper module and export protocol implementation for single-pass runner. |
| `src/lerobot/export/runners/single_pass.py` | New runner for feedforward chunk-emitting policies (ACT-style). |
| `src/lerobot/export/runners/kv_cache_flow.py` | New runner for KV-cache encode + iterative denoise flow (PI05-style). |
| `src/lerobot/export/runners/base.py` | Defines runner protocol, registry, ExportModule, and shared helpers. |
| `src/lerobot/export/runners/__init__.py` | Implements auto-discovery import of runner modules for registration. |
| `src/lerobot/export/protocols.py` | Defines Exportable protocol and ExportInputs contract used by policies/runners. |
| `src/lerobot/export/processors/runtime.py` | Implements torch-free runtime execution for exported processor specs. |
| `src/lerobot/export/processors/pi05.py` | Builds PI05-specific processor specs (relative/prepare_state/tokenize + absolute). |
| `src/lerobot/export/processors/normalize.py` | Builds normalize/denormalize processor specs from grouped modes/features. |
| `src/lerobot/export/processors/__init__.py` | Exposes processor-spec builders and runtime pipeline builder. |
| `src/lerobot/export/policy.py` | Adds ExportedPolicy runtime wrapper (pre/post processing + runner orchestration). |
| `src/lerobot/export/normalize.py` | Implements stats IO and normalization/denormalization for multiple modes. |
| `src/lerobot/export/manifest.py` | Introduces manifest dataclasses + (de)serialization and ProcessorSpec flattening. |
| `src/lerobot/export/interfaces.py` | Defines backend and runtime session protocols to decouple runners/backends. |
| `src/lerobot/export/exporter.py` | Implements export_policy pipeline: runner selection, backend serialization, manifest emission. |
| `src/lerobot/export/configs.py` | Adds export config dataclasses for runner families (single-pass, KV-cache flow). |
| `src/lerobot/export/backends/openvino.py` | Runtime-only OpenVINO backend that loads ONNX artifacts and runs compiled requests. |
| `src/lerobot/export/backends/onnx.py` | ONNX backend for serialization (torch.onnx.export) and runtime (onnxruntime). |
| `src/lerobot/export/backends/base.py` | Backend registry and artifact path resolution helper. |
| `src/lerobot/export/backends/__init__.py` | Implements auto-discovery import of backend modules for registration. |
| `src/lerobot/export/_package_utils.py` | Package utilities: example batch generation, hardware config, stats extraction, config snapshot. |
| `src/lerobot/export/__init__.py` | Public API surface for export/load of policy packages. |
| `pyproject.toml` | Adds export extra dependencies and updates spellchecker ignore patterns. |


Comment thread src/lerobot/policies/pretrained.py Outdated
Comment thread src/lerobot/export/processors/runtime.py
Comment thread tests/export/test_runtime_processors.py Outdated
Comment thread src/lerobot/policies/pi05/modeling_pi05.py Outdated
Comment thread src/lerobot/export/policy.py
- Drop unreachable manifest-backend branch in _detect_backend_name();
  ModelConfig has no backend field and Manifest.to_dict() drops unknown
  JSON, so the branch could never trigger. Backend selection is now
  documented as suffix-inferred or caller-passed.
- Rewrite to_openvino() docstring to reflect actual behavior: manifest
  does not record the backend; users must pass backend='openvino' to
  load_exported_policy() (or rely on .onnx-suffix inference into ONNX).
- Strip 'task' key from _TokenizeProcessor output so downstream runners
  (e.g. SinglePassRunner) that astype(np.float32) every value do not
  crash on the residual raw string.
- Improve PI05 export_assets() error: identify the real failure
  (AutoTokenizer.from_pretrained(local_files_only=True) cache miss) and
  include the underlying exception instead of pointing at a directory
  that was just created.
build_processor_pipeline() requires Path; the relative_actions and
unknown-processor tests previously passed package_path=None with
# type: ignore[arg-type]. Use the existing tmp_path fixture so the
type signature is honored without suppression.