NVIDIA NeMo Speech

Check out our HuggingFace🤗 collection for the latest open-weight checkpoints and demos!

Updates

2026-03: Nemotron-Speech-Streaming v2603 has been updated. It has been trained on a larger and more diverse corpus, resulting in lower WER across all latency modes. Try out the demo and check out the NIM.
2026-03: MagpieTTS v2602 has been released with support for 9 languages (En, Es, De, Fr, Vi, It, Zh, Hi, Ja). Try out the demo and check out the NIM.
2026-01: Nemotron-Speech-Streaming was released: One checkpoint that enables users to pick their optimal point on the latency-accuracy Pareto curve!
2026-01: MagpieTTS was released.
2026: This repo has pivoted to focus on audio, speech, and multimodal LLMs. For the last NeMo release with support for more modalities, see v2.7.0.
2025-08: Parakeet V3 and Canary V2 have been released with speech recognition and translation support for 25 European languages.
2025-06: Canary-Qwen-2.5B has been released with a record-setting 5.63% WER on the English Open ASR Leaderboard.

Introduction

NVIDIA NeMo Speech is built for researchers and PyTorch developers working on speech models, including Automatic Speech Recognition (ASR), Text to Speech (TTS), and Speech LLMs. It is designed to help you efficiently create, customize, and deploy new AI models by leveraging existing code and pre-trained model checkpoints.

For technical documentation, please see the NeMo Framework User Guide.

Requirements

Python 3.12 or above
PyTorch 2.6 or above
NVIDIA GPU (if you intend to do model training)
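
The version requirements above can be checked before installing. The following is a minimal sketch, not part of NeMo: the helper names `parse_major_minor` and `check_requirements` are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. It does not check for a GPU.

```python
import re
import sys
from importlib import metadata

# Minimum versions taken from the requirements list above.
MIN_PYTHON = (3, 12)
MIN_TORCH = (2, 6)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.6.0+cu124'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_requirements() -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        found = ".".join(map(str, sys.version_info[:2]))
        problems.append(f"Python {found} found, 3.12 or above required")
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if parse_major_minor(torch_version) < MIN_TORCH:
            problems.append(f"PyTorch {torch_version} found, 2.6 or above required")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Reading the installed version through `importlib.metadata` avoids importing `torch` itself, so the check stays fast and works even when the installed build is broken.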

As of PyTorch 2.6, torch.load defaults to weights_only=True. Some model checkpoints may require weights_only=False. In that case, you can set the environment variable TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=1 before running code that uses torch.load. Do this only with trusted files: loading a file that contains more than weights from an untrusted source risks arbitrary code execution.
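
One way to limit the override to trusted checkpoints is to set the variable only around the code that loads them. The sketch below is an illustration, not NeMo's own mechanism: `allow_full_unpickling` is a hypothetical helper, and it assumes torch.load reads the variable at call time. Exporting the variable before starting the Python process, as described above, is the safer option if that assumption does not hold for your PyTorch version.

```python
import os
from contextlib import contextmanager

ENV_VAR = "TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD"


@contextmanager
def allow_full_unpickling():
    """Temporarily set TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=1, then restore the old value.

    Use only around code that loads checkpoints from trusted sources.
    """
    previous = os.environ.get(ENV_VAR)
    os.environ[ENV_VAR] = "1"
    try:
        yield
    finally:
        # Restore the environment exactly as it was before entering the block.
        if previous is None:
            os.environ.pop(ENV_VAR, None)
        else:
            os.environ[ENV_VAR] = previous


# Usage (the loading code itself is up to you):
# with allow_full_unpickling():
#     ...  # code that calls torch.load on a trusted checkpoint
```

Scoping the variable this way keeps the default weights_only=True behavior for every other load in the same process.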

Developer Documentation

Version	Status	Description
Latest		Documentation of the latest (i.e. main) branch.
Stable		Documentation of the stable (i.e. most recent release) branch. To be added.

Install NeMo Speech

NeMo Speech is installable via pip: pip install 'nemo-toolkit[all]'

Contribute to NeMo

We welcome community contributions! Please refer to CONTRIBUTING.md for the process.

Licenses

NeMo is licensed under the Apache License 2.0.

