fix(train): skip EpisodeAwareSampler for streaming datasets#3501

Open
jashshah999 wants to merge 1 commit into huggingface:main from jashshah999:fix/streaming-sampler-crash

@jashshah999
Contributor

Summary

Training with --dataset.streaming=true and a policy that uses drop_n_last_frames (e.g. SARM) crashes with ValueError: DataLoader with IterableDataset: expected unspecified sampler option.
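
The failure can be reproduced outside of lerobot with plain PyTorch. The sketch below is illustrative only: TinyStream is a hypothetical stand-in for a streaming dataset, and SequentialSampler stands in for EpisodeAwareSampler, whose constructor is not shown in this PR.

```python
from torch.utils.data import DataLoader, IterableDataset, SequentialSampler


class TinyStream(IterableDataset):
    """Hypothetical stand-in for a streaming dataset (not lerobot code)."""

    def __iter__(self):
        return iter(range(8))


def try_build_loader_with_sampler() -> str:
    """Return the error message raised when a sampler meets an IterableDataset."""
    dataset = TinyStream()
    # Any non-None sampler triggers the check; SequentialSampler is a stand-in.
    sampler = SequentialSampler(range(8))
    try:
        DataLoader(dataset, batch_size=2, sampler=sampler)
    except ValueError as err:
        return str(err)
    return "no error"


message = try_build_loader_with_sampler()
print(message)
```

The error is raised when the DataLoader is constructed, before any batch is read, so training fails at startup rather than partway through an epoch.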

Fix: detect IterableDataset and skip the EpisodeAwareSampler, with a warning that end-of-episode filtering is not available in streaming mode.
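
A minimal sketch of that guard, assuming a helper named build_dataloader (hypothetical; the actual change lives in the lerobot training script and may be structured differently). SequentialSampler again stands in for EpisodeAwareSampler, and TinyStream for a streaming dataset.

```python
import logging

from torch.utils.data import DataLoader, IterableDataset, SequentialSampler


class TinyStream(IterableDataset):
    """Hypothetical stand-in for a streaming dataset (not lerobot code)."""

    def __iter__(self):
        return iter(range(8))


def build_dataloader(dataset, batch_size, episode_sampler=None):
    """Hypothetical helper: drop the sampler when the dataset is iterable-style."""
    streaming = isinstance(dataset, IterableDataset)
    if streaming and episode_sampler is not None:
        logging.warning(
            "End-of-episode filtering is not available in streaming mode; "
            "skipping the episode-aware sampler."
        )
        episode_sampler = None
    # Shuffle only map-style datasets that have no explicit sampler;
    # DataLoader rejects shuffle=True for an IterableDataset.
    shuffle = episode_sampler is None and not streaming
    return DataLoader(
        dataset, batch_size=batch_size, sampler=episode_sampler, shuffle=shuffle
    )


# Streaming case: the sampler is ignored and every item is yielded in order.
stream_loader = build_dataloader(
    TinyStream(), batch_size=2, episode_sampler=SequentialSampler(range(8))
)
stream_batches = [batch.tolist() for batch in stream_loader]

# Map-style case: the sampler is kept, so only the indices it yields are read.
map_loader = build_dataloader(
    list(range(10)), batch_size=3, episode_sampler=SequentialSampler(range(6))
)
map_batches = [batch.tolist() for batch in map_loader]
```

The trade-off is the one the warning states: in streaming mode the last frames of each episode are no longer filtered out, so a policy relying on drop_n_last_frames sees them during training.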

Test plan

  • lerobot-train --dataset.streaming=true --policy.type=sarm no longer crashes
  • Non-streaming training with SARM still uses EpisodeAwareSampler as before
  • Other policies unaffected

Fixes #3436.

PyTorch DataLoader forbids passing a sampler with IterableDataset.
When training with --dataset.streaming=true and a policy that uses
drop_n_last_frames (e.g. SARM), skip the sampler and log a warning
that end-of-episode filtering is not available in streaming mode.

Fixes huggingface#3436.
@github-actions github-actions Bot added the configuration Problems with configuration files or settings label May 4, 2026
@jashshah999 jashshah999 force-pushed the fix/streaming-sampler-crash branch from 2333ffe to 3302472 Compare May 4, 2026 01:15
@github-actions github-actions Bot removed the configuration Problems with configuration files or settings label May 4, 2026

Development

Successfully merging this pull request may close these issues.

lerobot-train with --dataset.streaming=true crashes for SARM due to EpisodeAwareSampler on IterableDataset