feat(export): add modular policy export with ONNX and OpenVINO runtime by samet-akcay · Pull Request #3493 · huggingface/lerobot

samet-akcay · 2026-05-01T11:13:21Z

TL;DR

This PR adds a modular export path for LeRobot policies. I limited the scope to two policy families (ACT and PI05) to demonstrate the concept, but the design is meant to scale to the remaining policies and to additional backends.

ACT  -> ONNX package -> ONNX Runtime / OpenVINO runtime
PI05 -> ONNX package -> ONNX Runtime / OpenVINO runtime

The main point is the shape:

raw observation
-> exported runtime preprocessors
-> runner
-> backend runtime session(s)
-> exported runtime postprocessors
-> action
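
As a rough illustration of the shape above, here is a minimal torch-free sketch. All names in it (`run_pipeline`, `FakeSession`, `single_pass_runner`, the stage name `"model"`) are illustrative assumptions and not the API added in this PR.

```python
from typing import Callable, Dict, List

Observation = Dict[str, List[float]]
Processor = Callable[[Observation], Observation]


class FakeSession:
    """Stand-in for a backend runtime session: runs one named stage."""

    def run(self, stage_name: str, inputs: Observation) -> Observation:
        # Pretend the "model" stage doubles the state to produce an action.
        assert stage_name == "model"
        return {"action": [2.0 * x for x in inputs["observation.state"]]}


def single_pass_runner(session: FakeSession, obs: Observation) -> Observation:
    # The runner decides which stages to call and in what order.
    return session.run("model", obs)


def run_pipeline(raw_obs, preprocessors, runner, session, postprocessors):
    obs = dict(raw_obs)
    for pre in preprocessors:  # raw observation -> model-ready inputs
        obs = pre(obs)
    out = runner(session, obs)  # runner -> backend session(s)
    for post in postprocessors:  # model outputs -> usable action
        out = post(out)
    return out["action"]


# Toy processors: shift the state before the model, shift the action after it.
def normalize(o: Observation) -> Observation:
    return {**o, "observation.state": [x - 1.0 for x in o["observation.state"]]}


def denormalize(o: Observation) -> Observation:
    return {**o, "action": [a + 1.0 for a in o["action"]]}


action = run_pipeline(
    {"observation.state": [1.0, 2.0, 3.0]},
    [normalize],
    single_pass_runner,
    FakeSession(),
    [denormalize],
)
print(action)
```

The caller only ever sees raw observations and actions; everything between them is driven by the package contents.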

User Examples

from lerobot.policies.act.modeling_act import ACTPolicy
from lerobot.export import load_exported_policy

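policy = ACTPolicy.from_pretrained("lerobot/act_aloha_sim_transfer_cube_human")
# load_or_attach_checkpoint_stats is a helper that is not shown here;
# stats are required for normalized export
policy.config.stats = load_or_attach_checkpoint_stats(policy)

# example_batch is a sample input batch prepared beforehand (not shown here)
policy.export("./act_export", backend="onnx", example_batch=example_batch)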

exported = load_exported_policy("./act_export", backend="onnx", device="cpu")
action = exported.select_action(raw_observation)
chunk = exported.predict_action_chunk(raw_observation)

OpenVINO uses the same exported package contract:

exported = load_exported_policy("./act_export", backend="openvino", device="CPU")
action = exported.select_action(raw_observation)

PI05 uses the same runtime API, but the policy family is different:

from lerobot.policies.pi05.modeling_pi05 import PI05Policy
from lerobot.export import export_policy, load_exported_policy

policy = PI05Policy.from_pretrained("lerobot/pi05_base", strict=False)

export_policy(
    policy,
    "./pi05_export",
    backend="onnx",
    example_batch=example_batch_with_language_tokens,
    include_normalization=False,
)

exported = load_exported_policy("./pi05_export", backend="onnx")
action_chunk = exported.predict_action_chunk(
    {
        "observation.images.base_0_rgb": image,
        "observation.images.left_wrist_0_rgb": left_wrist_image,
        "observation.images.right_wrist_0_rgb": right_wrist_image,
        "observation.state": state,
        "task": "pick up the red block",
    },
    num_steps=3,
)

PI05 example contract:

export-time example_batch:
  includes language tokens/masks because tracing needs tensor inputs

runtime observation:
  may include raw task string
  tokenize processor emits language tokens/masks inside exported runtime
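
To make the export-time versus runtime split concrete, the sketch below shows a toy tokenize processor that turns a raw task string into fixed-length token IDs and a mask. The whitespace vocabulary, the output keys `lang_tokens` and `lang_mask`, and `max_len` are assumptions for illustration; the PR bundles a real tokenizer instead.

```python
from typing import Any, Dict

PAD_ID = 0
UNK_ID = 1


class ToyTokenizeProcessor:
    """Replaces a raw "task" string with padded token IDs and a mask."""

    def __init__(self, vocab: Dict[str, int], max_len: int = 8):
        self.vocab = vocab
        self.max_len = max_len

    def __call__(self, obs: Dict[str, Any]) -> Dict[str, Any]:
        out = dict(obs)
        # Remove the raw string so that only numeric values continue downstream.
        task = out.pop("task", None)
        if task is None:
            return out  # nothing to tokenize (e.g. inputs are already tokenized)
        ids = [self.vocab.get(w, UNK_ID) for w in task.lower().split()]
        ids = ids[: self.max_len]
        pad = self.max_len - len(ids)
        out["lang_tokens"] = ids + [PAD_ID] * pad
        out["lang_mask"] = [1] * len(ids) + [0] * pad
        return out


vocab = {"pick": 2, "up": 3, "the": 4, "red": 5, "block": 6}
proc = ToyTokenizeProcessor(vocab, max_len=8)
result = proc({"observation.state": [0.1, 0.2], "task": "pick up the red block"})
print(result)
```

At export time the same keys would be supplied directly as tensors in the example batch, since tracing cannot consume a Python string.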

Package Shape

export_dir/
  manifest.json
  artifacts/
    model.onnx                 # ACT
    encoder.onnx, denoise.onnx # PI05
  stats.safetensors            # when normalization is included
  tokenizer/                   # when tokenizer runtime processor is included
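
A loader can fail fast when a package is incomplete. The sketch below checks a directory against the layout above. The manifest keys `artifacts` and `processors` are assumed here for illustration only; the actual manifest schema lives in the PR's code and is not reproduced in this description.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def find_missing_files(export_dir: Path) -> List[str]:
    """Return relative paths that the manifest references but the package lacks."""
    manifest_path = export_dir / "manifest.json"
    if not manifest_path.is_file():
        return ["manifest.json"]
    manifest = json.loads(manifest_path.read_text())
    missing = []
    # Assumed key: model files stored under artifacts/.
    for name in manifest.get("artifacts", []):
        if not (export_dir / "artifacts" / name).is_file():
            missing.append(f"artifacts/{name}")
    # Assumed key: processor specs that may point at stats or tokenizer assets.
    for spec in manifest.get("processors", []):
        artifact = spec.get("artifact")
        if artifact and not (export_dir / artifact).exists():
            missing.append(artifact)
    return sorted(set(missing))


with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp)
    (pkg / "artifacts").mkdir()
    (pkg / "artifacts" / "model.onnx").write_bytes(b"")
    (pkg / "manifest.json").write_text(json.dumps({
        "artifacts": ["model.onnx"],
        "processors": [
            {"type": "normalize", "artifact": "stats.safetensors"},
            {"type": "denormalize", "artifact": "stats.safetensors"},
        ],
    }))
    # stats.safetensors was never written, so it should be reported.
    missing = find_missing_files(pkg)
    print(missing)
```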

Example executable manifest processors:

{"type": "normalize", "features": ["observation.state"], "artifact": "stats.safetensors"}
{"type": "tokenize", "artifact": "tokenizer"}
{"type": "denormalize", "features": ["action"], "artifact": "stats.safetensors"}

Internal Shape

Backend.open(...) -> _RuntimeSession
Runner.run(...)   -> orchestrates runtime stages
processors        -> torch-free exported pre/post path
manifest          -> package contract

policy decides:
  runner type
  export modules/stages
  processor specs
  assets/stats

backend decides:
  artifact serialization
  runtime session loading
  named stage execution

No extra adapter stack is needed between LeRobot and runtime backends.

Adding A Backend

Pseudo-code for TensorRT:

class _TensorRTRuntimeSession(_RuntimeSession):
    def __init__(self, package_dir, manifest, device):
        self.engines = load_engines(package_dir, manifest, device)

    def run(self, stage_name, inputs):
        return self.engines[stage_name].execute(inputs)


class TensorRTBackend(Backend):
    name = "tensorrt"

    def export(self, module, output_path):
        onnx_path = export_stage_to_onnx(module)
        engine_path = build_tensorrt_engine(onnx_path, output_path)
        return engine_path.name

    def open(self, artifacts_dir, manifest, device):
        return _TensorRTRuntimeSession(artifacts_dir, manifest, device)

register_backend("tensorrt", TensorRTBackend())

Expected extension rule:

new backend = serialize stages + open runtime session + run named stage
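
The rule can be exercised without any real runtime. The toy backend below "serializes" a stage as a JSON file holding a scale factor, then opens a session that runs the stage by name. The `register_backend`, `get_backend`, and class shapes shown here are simplified stand-ins and should not be read as the PR's actual interfaces.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

_BACKENDS: Dict[str, "ToyBackend"] = {}


def register_backend(name: str, backend: "ToyBackend") -> None:
    _BACKENDS[name] = backend


def get_backend(name: str) -> "ToyBackend":
    if name not in _BACKENDS:
        raise KeyError(f"unknown backend {name!r}; known: {sorted(_BACKENDS)}")
    return _BACKENDS[name]


class ToySession:
    def __init__(self, artifacts_dir: Path, stages: List[str]):
        # Load every stage eagerly so that missing files fail at open time.
        self.scales = {
            s: json.loads((artifacts_dir / f"{s}.json").read_text())["scale"]
            for s in stages
        }

    def run(self, stage_name: str, inputs: Dict[str, List[float]]):
        scale = self.scales[stage_name]
        return {"action": [scale * x for x in inputs["observation.state"]]}


class ToyBackend:
    name = "toy"

    def export(self, stage_name: str, scale: float, artifacts_dir: Path) -> str:
        # "Serialize" the stage and return the artifact file name.
        path = artifacts_dir / f"{stage_name}.json"
        path.write_text(json.dumps({"scale": scale}))
        return path.name

    def open(self, artifacts_dir: Path, stages: List[str]) -> ToySession:
        return ToySession(artifacts_dir, stages)


register_backend("toy", ToyBackend())

with tempfile.TemporaryDirectory() as tmp:
    backend = get_backend("toy")
    artifact_name = backend.export("model", 3.0, Path(tmp))
    session = backend.open(Path(tmp), ["model"])
    output = session.run("model", {"observation.state": [1.0, 2.0]})
    print(artifact_name, output)
```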

This should also fit future backends such as:

ExecuTorch
TensorRT
CoreML
TVM
other runtime targets

Adding A Policy

If the policy fits an existing runner:

class MyPolicy(PreTrainedPolicy):
    def get_inference_type(self):
        return "single_pass"

    def get_export_modules(self):
        return {"model": MyInferenceWrapper(self)}

    def prepare_inputs(self, example_batch):
        return {
            "model": ExportInputs(
                tensors=(example_batch["observation.state"],),
                input_names=["observation.state"],
                output_names=["action"],
            )
        }

    def export_processor_specs(self, include_normalization, stats_artifact, assets=None):
        return [normalize_spec(...), tokenize_spec(...)], [denormalize_spec(...)]

If it needs a new runtime pattern:

class MyRunner:
    type = "my_runner"

    def run(self, session, observation, **kwargs):
        encoded = session.run("encoder", observation)
        decoded = session.run("decoder", encoded)
        return decoded["action"]

Expected extension rule:

new policy = export modules + choose/add runner + emit executable processors + bundle assets
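
The "choose/add runner" part of the rule amounts to a lookup from the policy's inference type to a registered runner. A minimal sketch follows, assuming a dict-based registry; `register_runner`, `resolve_runner`, the `two_stage` type, and `EchoSession` are hypothetical names used only for this example.

```python
from typing import Callable, Dict, List

RUNNERS: Dict[str, Callable] = {}


def register_runner(type_name: str):
    def deco(fn):
        RUNNERS[type_name] = fn
        return fn
    return deco


@register_runner("single_pass")
def single_pass(session, observation):
    return session.run("model", observation)["action"]


@register_runner("two_stage")  # hypothetical new runtime pattern
def two_stage(session, observation):
    encoded = session.run("encoder", observation)
    return session.run("decoder", encoded)["action"]


class EchoSession:
    """Fake session that records the order in which stages are called."""

    def __init__(self):
        self.calls: List[str] = []

    def run(self, stage_name, inputs):
        self.calls.append(stage_name)
        return {"action": [0.0], **inputs}


class MyPolicy:
    def get_inference_type(self) -> str:
        return "two_stage"


def resolve_runner(policy):
    kind = policy.get_inference_type()
    try:
        return RUNNERS[kind]
    except KeyError:
        raise ValueError(
            f"no runner registered for {kind!r}; known: {sorted(RUNNERS)}"
        ) from None


session = EchoSession()
runner = resolve_runner(MyPolicy())
action = runner(session, {"observation.state": [1.0]})
print(session.calls, action)
```

A policy that fits an existing pattern only returns a known type string; a new pattern registers one more function and leaves the core lookup untouched.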

Why ACT + PI05

ACT proves the feedforward/action-chunk path:

ACT -> single_pass -> normalize/denormalize -> action chunk

PI05 proves the VLA/KV-cache/tokenized path:

PI05 -> kv_cache_flow -> tokenize/task handling -> encoder stage -> denoise stage -> action chunk
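
The `kv_cache_flow` path can be pictured as one encoder call followed by a fixed number of Euler steps over the denoise stage. The sketch below is a toy that uses lists of floats in place of tensors. The time convention (integrating from t=1 down to t=0 with dt = -1/num_steps), the stage input names, and the fake session are assumptions made for illustration, not a description of the PR's runner.

```python
from typing import Dict, List


class FakeFlowSession:
    """Toy session with an 'encoder' stage and a 'denoise' stage."""

    def __init__(self):
        self.calls: List[str] = []

    def run(self, stage_name: str, inputs: Dict) -> Dict:
        self.calls.append(stage_name)
        if stage_name == "encoder":
            # Pretend the cached prefix is just the mean of the state.
            state = inputs["observation.state"]
            return {"past_kv": sum(state) / len(state)}
        if stage_name == "denoise":
            # Velocity of a straight path that reaches the target at t=0.
            target, t = inputs["past_kv"], inputs["t"]
            return {"v_t": [(x - target) / t for x in inputs["x_t"]]}
        raise KeyError(stage_name)


def kv_cache_flow_run(session, observation, noise, num_steps=10):
    # Stage 1: encode the prefix once and reuse its cache in every step.
    cache = session.run("encoder", observation)
    x_t = list(noise)
    dt = -1.0 / num_steps
    for k in range(num_steps):
        t = (num_steps - k) / num_steps  # 1.0 down to 1/num_steps
        out = session.run("denoise", {"x_t": x_t, "t": t, **cache})
        # Euler update: x <- x + dt * v_t
        x_t = [x + dt * v for x, v in zip(x_t, out["v_t"])]
    return x_t


session = FakeFlowSession()
chunk = kv_cache_flow_run(
    session, {"observation.state": [1.0, 3.0]}, noise=[0.5, -1.0], num_steps=3
)
print(session.calls, chunk)
```

Each step consumes the previous step's output, so small per-stage numerical differences between runtimes can compound over the loop.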

Runtime processor types implemented in this PR:

normalize
denormalize
relative_actions
absolute_actions
pi05_prepare_state
tokenize

Validation

Commands used locally:

uv run pytest tests/export/test_runtime_processors.py -q
uv run pytest tests/export/test_export_act.py -q
uv run pytest tests/export/test_export_pi05.py -q
uv run pytest tests/export/test_processor_specs.py tests/export/test_runtime_processors.py tests/export/test_manifest.py tests/export/test_failfast_errors.py tests/export/test_runner_registry.py -q
uv run pre-commit run --all-files

Additional local ignored validation script:

uv run python examples/policy_export/_export_demo/validate_policy_export.py
uv run python examples/policy_export/_export_demo/validate_policy_export.py --run-pi05 --skip-openvino

Observed ACT real-checkpoint validation:

ACT PyTorch vs ONNX Runtime:
  max_abs ~= 2.24e-4
  mean_abs ~= 1.17e-5

ACT ONNX Runtime vs OpenVINO:
  max_abs ~= 5.36e-7
  mean_abs ~= 7.35e-8
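
For reference, `max_abs` and `mean_abs` are the maximum and mean of the elementwise absolute difference between two runtimes' outputs. The helper below is a plain-Python sketch of that computation plus an rtol/atol check in the style of `numpy.allclose`; the function names are illustrative and it is not the validation script used for the numbers above.

```python
from typing import Sequence, Tuple


def parity_metrics(ref: Sequence[float], test: Sequence[float]) -> Tuple[float, float]:
    """Return (max_abs, mean_abs) of the elementwise difference."""
    if len(ref) != len(test) or len(ref) == 0:
        raise ValueError("outputs must be non-empty and of equal length")
    diffs = [abs(r - t) for r, t in zip(ref, test)]
    return max(diffs), sum(diffs) / len(diffs)


def within_tolerance(ref, test, rtol: float = 1e-5, atol: float = 1e-5) -> bool:
    # Elementwise rule used by numpy.allclose: |test - ref| <= atol + rtol * |ref|
    return all(abs(t - r) <= atol + rtol * abs(r) for r, t in zip(ref, test))


reference = [0.0, 1.0, -2.0]
candidate = [0.0, 1.5, -2.0]
max_abs, mean_abs = parity_metrics(reference, candidate)
print(max_abs, mean_abs, within_tolerance(reference, candidate))
```

Reporting both values is useful because a single outlier can dominate `max_abs` while `mean_abs` stays small.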

Observed real PI05 base validation:

real checkpoint: lerobot/pi05_base
runtime input: raw task string -> exported tokenize processor
stage-wise ONNX parity:
  encoder past_key/value max_abs ~= 1e-5 to 1e-4
  denoise v_t max_abs ~= 3e-6

e2e denoise-loop diagnostic:
  larger accumulated drift is observed on the real 10-step-size checkpoint path
  this is reported as a diagnostic, not hidden as strict parity

The committed tests still use synthetic compact PI05 configs for CI-portable parity. The local script can validate the real checkpoint path, but it is intentionally ignored because it is multi-GB and slow.

Scope And Limits

Included:

ONNX export
OpenVINO runtime loading
ACT support
PI05 support
executable exported runtime processors
manifest/runtime validation

Not included:

native OpenVINO IR serialization
TensorRT backend
ExecuTorch backend
full policy-zoo coverage
every eager LeRobot processor

Known constraints:

export-time still uses PyTorch
exported runtime inference is the torch-free path
OpenVINO currently consumes ONNX artifacts
real PI05 e2e ONNX drift is larger than stage-wise drift due to accumulated denoise-loop differences

Relationship to `lerobot-rollout`

The new lerobot-rollout CLI (#3413) standardizes the deployment loop and currently loads policies as PreTrainedPolicy (torch nn.Module) instances. This PR is orthogonal: it adds a torch-free runtime path so that policies can be deployed on edge targets that cannot ship PyTorch.

The two features compose well, but plugging an ExportedPolicy into lerobot-rollout requires a small integration. I'd like maintainer input on the preferred shape before implementing it.

A few options would be:

PolicyLike protocol: have lerobot.rollout accept a PolicyLike protocol instead of PreTrainedPolicy.
Wrapper-only: subclass PreTrainedPolicy in a torch-using bridge module.
Dedicated InferenceEngine: add an engine for exported policies, selectable via --inference.type=exported_*.

Happy to land it as a separate small refactor PR first if that's preferred.
This is out of scope for this PR either way, so I'm flagging now so we can align on direction.
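
To give the first option some shape, here is one possible sketch of a structural `PolicyLike` protocol. It only covers the two methods shown in the user examples above; the protocol itself, `FakeExportedPolicy`, and `rollout_step` are hypothetical and not part of this PR or of lerobot-rollout.

```python
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class PolicyLike(Protocol):
    """Anything the rollout loop can ask for actions, torch-based or not."""

    def select_action(self, observation: Dict[str, Any]) -> Any: ...

    def predict_action_chunk(self, observation: Dict[str, Any]) -> Any: ...


class FakeExportedPolicy:
    """Stand-in for a torch-free exported policy."""

    def predict_action_chunk(self, observation: Dict[str, Any]):
        return [[0.0, 0.0] for _ in range(4)]

    def select_action(self, observation: Dict[str, Any]):
        return self.predict_action_chunk(observation)[0]


def rollout_step(policy: PolicyLike, observation: Dict[str, Any]):
    # runtime_checkable protocols only verify that the methods exist.
    if not isinstance(policy, PolicyLike):
        raise TypeError("policy must provide select_action and predict_action_chunk")
    return policy.select_action(observation)


action = rollout_step(FakeExportedPolicy(), {"observation.state": [0.0]})
print(action)
```

With a structural protocol, neither the eager policy nor the exported one needs to inherit from the other, which keeps the torch import out of the exported path.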

…ends Adds the lerobot.export package with the Exporter orchestrator, ONNX and OpenVINO backends, action_chunking and kv_cache runners, processor specs, and manifest schema. Includes fail-fast validation for missing normalization stats, tokenizer assets, and unknown OpenVINO devices. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Wires ACT through the action_chunking runner. Numerical parity verified at rtol=1e-5 / atol=1e-5 against PyTorch eager on both ONNX Runtime and OpenVINO. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

…NX, 0.08 OpenVINO) Wires PI05 through the kv_cache runner. Includes a fp64→fp32 fix to create_sinusoidal_pos_embedding to eliminate Erf(double) drift in the chained Euler denoise loop. Numerical parity: rtol=0.02 / atol=0.006 on ONNX Runtime; rtol=0.08 / atol=0.08 on OpenVINO (cross-runtime IR optimization gap). Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Adds docs/design/policy-export.md (design RFC) and examples/policy_export/walkthrough.ipynb (end-to-end reproducible walkthrough for ACT and PI05 across ONNX Runtime and OpenVINO). Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

Disambiguates the runner from generic KV-cache attention by encoding both load-bearing properties of PI05-style policies it serves: a prefix encoder with cached KV attention plus flow-matching Euler integration in the denoise loop. Leaves room for a future sibling runner (e.g. fused flow-matching for GR00T-style policies with no prefix encoder) without overloading the existing name.

- src/lerobot/export/runners/kv_cache.py -> kv_cache_flow.py
- KVCacheRunner -> KVCacheFlowRunner
- KVCacheExportConfig -> KVCacheFlowExportConfig
- manifest runner type 'kv_cache' -> 'kv_cache_flow'
- PI05 get_inference_type() returns 'kv_cache_flow'

The processor-spec refactor moved export_assets, export_stats, and export_processor_specs onto PreTrainedPolicy, which the toy backend test's ToyPolicy(nn.Module) does not inherit. Add inline no-op stubs so the 'register a runner+backend without core edits' test exercises the public extension surface again.

Copilot

Pull request overview

Adds a new modular “policy_package” export + torch-free runtime execution path for LeRobot policies, with ONNX serialization and OpenVINO runtime loading, and validates the contract via a comprehensive export test suite.

Changes:

Introduces an export subsystem (manifest schema, backends, runners, policy runtime wrapper) and integrates it into PreTrainedPolicy.
Implements executable runtime processors (normalize/denormalize, relative/absolute actions, PI05 state prep, tokenize) and runner patterns (single_pass, kv_cache_flow) for ACT and PI05.
Adds extensive tests for manifest stability/schema, runner/backend registries, processors, and numerical parity (where deps are available).

Reviewed changes

Copilot reviewed 36 out of 38 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
tests/export/test_runtime_processors.py	Tests that exported runtime processor specs execute (normalize/denormalize, relative/absolute, tokenize).
tests/export/test_runner_registry.py	Validates runner selection/registry behavior and runner export invariants.
tests/export/test_processor_specs.py	Checks processor spec construction + JSON roundtrips (incl. PI05 + normalization).
tests/export/test_manifest_stability.py	Ensures manifest output is stable across re-exports (excluding `created_at`).
tests/export/test_manifest.py	Adds schema + Normalizer behavior tests and fixture roundtrip validation.
tests/export/test_failfast_errors.py	Verifies fail-fast error messages for missing stats and manifest parse errors.
tests/export/test_export_pi05.py	End-to-end PI05 export/runtime parity and stage-wise accuracy tests (optional deps).
tests/export/test_export_act.py	End-to-end ACT export/runtime parity tests for ONNX/OpenVINO (optional deps).
tests/export/test_backend_registry.py	Verifies backend/runner registries, plugin guarantees, and toy backend/runner integration.
tests/export/fixtures/manifest_act_converged.json	Adds a converged ACT manifest fixture for stability/roundtrip checks.
tests/export/conftest.py	Shared fixtures/utilities for export tests (policy factories, parity helpers, tokenizer cache loader).
tests/export/__init__.py	Marks `tests.export` as a package for shared utilities.
src/lerobot/policies/pretrained.py	Adds `export`/`to_onnx`/`to_openvino`/`from_exported` APIs plus default stats/assets/processor-spec hooks.
src/lerobot/policies/pi05/modeling_pi05.py	Adds PI05 exportable modules, export protocol implementation, and tokenizer asset bundling.
src/lerobot/policies/act/modeling_act.py	Adds ACT export wrapper module and export protocol implementation for single-pass runner.
src/lerobot/export/runners/single_pass.py	New runner for feedforward chunk-emitting policies (ACT-style).
src/lerobot/export/runners/kv_cache_flow.py	New runner for KV-cache encode + iterative denoise flow (PI05-style).
src/lerobot/export/runners/base.py	Defines runner protocol, registry, `ExportModule`, and shared helpers.
src/lerobot/export/runners/__init__.py	Implements auto-discovery import of runner modules for registration.
src/lerobot/export/protocols.py	Defines `Exportable` protocol and `ExportInputs` contract used by policies/runners.
src/lerobot/export/processors/runtime.py	Implements torch-free runtime execution for exported processor specs.
src/lerobot/export/processors/pi05.py	Builds PI05-specific processor specs (relative/prepare_state/tokenize + absolute).
src/lerobot/export/processors/normalize.py	Builds normalize/denormalize processor specs from grouped modes/features.
src/lerobot/export/processors/__init__.py	Exposes processor-spec builders and runtime pipeline builder.
src/lerobot/export/policy.py	Adds `ExportedPolicy` runtime wrapper (pre/post processing + runner orchestration).
src/lerobot/export/normalize.py	Implements stats IO and normalization/denormalization for multiple modes.
src/lerobot/export/manifest.py	Introduces manifest dataclasses + (de)serialization and ProcessorSpec flattening.
src/lerobot/export/interfaces.py	Defines backend and runtime session protocols to decouple runners/backends.
src/lerobot/export/exporter.py	Implements `export_policy` pipeline: runner selection, backend serialization, manifest emission.
src/lerobot/export/configs.py	Adds export config dataclasses for runner families (single-pass, KV-cache flow).
src/lerobot/export/backends/openvino.py	Runtime-only OpenVINO backend that loads ONNX artifacts and runs compiled requests.
src/lerobot/export/backends/onnx.py	ONNX backend for serialization (torch.onnx.export) and runtime (onnxruntime).
src/lerobot/export/backends/base.py	Backend registry and artifact path resolution helper.
src/lerobot/export/backends/__init__.py	Implements auto-discovery import of backend modules for registration.
src/lerobot/export/_package_utils.py	Package utilities: example batch generation, hardware config, stats extraction, config snapshot.
src/lerobot/export/__init__.py	Public API surface for export/load of policy packages.
pyproject.toml	Adds `export` extra dependencies and updates spellchecker ignore patterns.


- Drop unreachable manifest-backend branch in _detect_backend_name(); ModelConfig has no backend field and Manifest.to_dict() drops unknown JSON, so the branch could never trigger. Backend selection is now documented as suffix-inferred or caller-passed.
- Rewrite to_openvino() docstring to reflect actual behavior: manifest does not record the backend; users must pass backend='openvino' to load_exported_policy() (or rely on .onnx-suffix inference into ONNX).
- Strip 'task' key from _TokenizeProcessor output so downstream runners (e.g. SinglePassRunner) that astype(np.float32) every value do not crash on the residual raw string.
- Improve PI05 export_assets() error: identify the real failure (AutoTokenizer.from_pretrained(local_files_only=True) cache miss) and include the underlying exception instead of pointing at a directory that was just created.

build_processor_pipeline() requires Path; the relative_actions and unknown-processor tests previously passed package_path=None with # type: ignore[arg-type]. Use the existing tmp_path fixture so the type signature is honored without suppression.

samet-akcay and others added 14 commits April 30, 2026 16:07

docs(export): remove outdated policy export framework documentation

caede99

refactor(export): make runtime session internal

553cd77

refactor(export): rename runtime adapters to runtime sessions

aa10ed2

refactor(export): make runtime sessions private

5b78518

feat(export): execute torch-free runtime processors

978659c

refactor(export): simplify processor spec hooks

17b9161

chore(export): drop walkthrough notebook

2b0819d

fix(export): preserve tokenized runtime inputs

f694788

Copilot AI review requested due to automatic review settings May 1, 2026 11:13

github-actions Bot added policies Items related to robot policies tests Problems with test coverage, failures, or improvements to testing labels May 1, 2026

Copilot started reviewing on behalf of samet-akcay May 1, 2026 11:14 View session

Copilot AI reviewed May 1, 2026


Comment thread src/lerobot/policies/pretrained.py Outdated

Comment thread src/lerobot/export/processors/runtime.py

Comment thread tests/export/test_runtime_processors.py Outdated

Comment thread src/lerobot/policies/pi05/modeling_pi05.py Outdated

Comment thread src/lerobot/export/policy.py

samet-akcay added 2 commits May 1, 2026 14:04

samet-akcay wants to merge 16 commits into huggingface:main from samet-akcay:feat/policy-export
