feat(machinery): add Argos offline translation backend #18572
knk125 wants to merge 13 commits into WeblateOrg:main from
Can you please add tests?
Pull request overview
Adds a new offline machine translation backend based on Argos Translate, integrating it into Weblate’s machinery registry and packaging extras so it can be enabled as a service.
Changes:
- Added ArgosTranslation machinery backend (weblate/machinery/argos.py) using argostranslate installed models.
- Registered the backend in WEBLATE_MACHINERY so it's discoverable/configurable.
- Added an argos optional dependency extra and included it in the all extra.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| weblate/machinery/models.py | Registers the new machinery class so it can be loaded by the machinery class loader. |
| weblate/machinery/argos.py | Implements Argos Translate-backed offline translation and language-pair support checks. |
| pyproject.toml | Adds argostranslate as an optional dependency and wires it into the all extra. |
Thanks for your contribution. Besides the previous comments, this has CI failures and lacks documentation. Running make -C docs update-docs will generate the basic backend documentation from the code (this step should itself be documented and we will fix that, see #18724).

    src = source_language.split("-")[0].lower()
    tgt = target_language.split("-")[0].lower()

    installed_languages = argostranslate.translate.get_installed_languages()

How is language installation supposed to work?
PS: What I find super confusing is that we're using Argos CI for screenshot testing, and this brings another thing named Argos just in an absolutely different scope. But this is nothing we can address.
Co-authored-by: Michal Čihař <michal@cihar.com>
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.

    from typing import TYPE_CHECKING

    import argostranslate.translate

    from .base import MachineTranslation

argostranslate is imported at module import time, which will raise ImportError when the optional dependency isn’t installed. Because this backend is listed in WEBLATE_MACHINERY, an ImportError here will prevent the ClassLoader from loading the service cleanly (and can break commands/UI that enumerate machinery). Make the import optional (e.g., try/except) and set is_available based on whether argostranslate is importable, or defer the import into methods.
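
A minimal sketch of the optional-import approach described in this comment. The helper optional_import and the class ArgosBackendSketch are hypothetical names used only for illustration; only the argostranslate.translate module path comes from the PR, and how Weblate consumes is_available is not shown here.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Resolved once at import time; does not raise if the extra is missing.
argos_translate = optional_import("argostranslate.translate")


class ArgosBackendSketch:
    """Hypothetical stand-in for the backend class in the PR."""

    # Availability mirrors whether the optional dependency could be imported.
    is_available = argos_translate is not None
```

With this shape, listing the backend in a registry stays safe on installs without the argos extra, and methods that need the library can check is_available first.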

    src = source_language.split("-")[0].lower()
    tgt = target_language.split("-")[0].lower()

    installed_languages = argostranslate.translate.get_installed_languages()

    src_lang = next(
        (lang for lang in installed_languages if lang.code == src), None
    )
    tgt_lang = next(
        (lang for lang in installed_languages if lang.code == tgt), None
    )

    if src_lang and tgt_lang:
        return src_lang.get_translation(tgt_lang) is not None

is_supported() calls argostranslate.translate.get_installed_languages() on every support check. This method is called repeatedly during language negotiation, so it’s worth caching the installed languages (or a src→targets map) on the instance (e.g., via @cached_property) to avoid repeated scanning/work.
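
One way the caching suggested here could look, as a sketch rather than the PR's code. CachedSupportSketch and the load_pairs callable are hypothetical; load_pairs stands in for whatever scans the installed Argos models and yields (source, target) code pairs.

```python
from functools import cached_property
from typing import Callable, FrozenSet, Iterable, Tuple

Pair = Tuple[str, str]


class CachedSupportSketch:
    """Hypothetical helper: caches (source, target) pairs on the instance."""

    def __init__(self, load_pairs: Callable[[], Iterable[Pair]]) -> None:
        # load_pairs stands in for the scan of installed Argos models.
        self._load_pairs = load_pairs

    @cached_property
    def supported_pairs(self) -> FrozenSet[Pair]:
        # Evaluated on first access only; later accesses reuse the result.
        return frozenset(self._load_pairs())

    def is_supported(self, source_language: str, target_language: str) -> bool:
        # Same normalization as the PR: keep the base code, lowercased.
        src = source_language.split("-")[0].lower()
        tgt = target_language.split("-")[0].lower()
        return (src, tgt) in self.supported_pairs


# Demonstration with a fake loader that counts how often it is called.
calls = []


def fake_loader() -> Iterable[Pair]:
    calls.append(1)
    return [("en", "es")]


checker = CachedSupportSketch(fake_loader)
first = checker.is_supported("en-US", "es")
second = checker.is_supported("es", "en")
```

The trade-off is that models installed after the first check are not seen until the instance is recreated or the cached attribute is cleared.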

    except Exception as e:
        logger.debug(
            "Argos translation failed for %s to %s: %s",
            source_language,
            target_language,
            e,

The exception handler logs only the exception string, which loses traceback context for debugging. Consider logging with exc_info=True (or using logger.exception at debug level) so failures in local model execution are diagnosable.

    -except Exception as e:
    +except Exception:
         logger.debug(
    -        "Argos translation failed for %s to %s: %s",
    +        "Argos translation failed for %s to %s",
             source_language,
             target_language,
    -        e,
    +        exc_info=True,

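
To illustrate what the suggested exc_info=True changes, the sketch below logs a simulated failure at debug level into an in-memory stream. The logger name and the RuntimeError are made up for the demonstration; only the log message format is taken from the suggestion.

```python
import io
import logging


def log_failure_with_traceback() -> str:
    """Log a simulated failure at debug level and return the captured output."""
    stream = io.StringIO()
    logger = logging.getLogger("argos_logging_sketch")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        try:
            raise RuntimeError("model failed to load")  # simulated failure
        except Exception:
            # exc_info=True appends the active exception's traceback.
            logger.debug(
                "Argos translation failed for %s to %s", "en", "es", exc_info=True
            )
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


captured = log_failure_with_traceback()
```

The captured text contains the message followed by the full traceback, so the exception no longer needs to be interpolated into the message itself.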

    class ArgosTranslationTest(BaseMachineTranslationTest):
        MACHINE_CLS = ArgosTranslation
        EXPECTED_LEN = 1

        def mock_empty(self) -> None:
            self.skipTest("Not tested")

        def mock_error(self) -> None:
            self.skipTest("Not tested")

        def mock_response(self) -> None:
            pass

        @patch("weblate.machinery.argos.ArgosTranslation.is_available", True)
        @patch("weblate.machinery.argos.argostranslate")
        def test_translate(self, mock_argostranslate, **kwargs) -> None:
            mock_lang_en = MagicMock()

This test class inherits BaseMachineTranslationTest.test_validate_settings(), which will call machine.validate_settings() without any patching. In CI, workflows install with --all-extras, so argostranslate will likely be present and validate_settings() will call the real Argos runtime (and likely fail unless models are installed). Override test_validate_settings() (or mock_response()) here to patch weblate.machinery.argos.argostranslate similarly to the other tests, or explicitly skip the validation test for this backend.
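
A self-contained sketch of the patching idea, assuming nothing about Weblate's test base class. Here backend_module is a hypothetical stand-in for the weblate.machinery.argos namespace, and translate and make_fake_argos are illustrative helpers; in the real test the patch target would be the string already used in the PR, "weblate.machinery.argos.argostranslate".

```python
from types import SimpleNamespace
from unittest.mock import MagicMock, patch

# Hypothetical stand-in for the weblate.machinery.argos module namespace.
backend_module = SimpleNamespace(argostranslate=None)


def translate(text: str, source: str, target: str) -> str:
    """Look up an installed pair through the (patched) library and translate."""
    languages = backend_module.argostranslate.translate.get_installed_languages()
    src = next(lang for lang in languages if lang.code == source)
    tgt = next(lang for lang in languages if lang.code == target)
    return src.get_translation(tgt).translate(text)


def make_fake_argos(translated: str) -> MagicMock:
    """Build a mock exposing one installed en -> es pair."""
    lang_en = MagicMock(code="en")
    lang_es = MagicMock(code="es")
    lang_en.get_translation.return_value.translate.return_value = translated
    fake = MagicMock()
    fake.translate.get_installed_languages.return_value = [lang_en, lang_es]
    return fake


# The real runtime is never touched while the patch is active.
with patch.object(backend_module, "argostranslate", make_fake_argos("Hola")):
    result = translate("Hello", "en", "es")
```

Applying the same patch inside an overridden test_validate_settings() would keep the inherited validation from reaching locally installed models.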

        "weblate[alibaba,amazon,argos,gerrit,gelf,google,ldap,mercurial,openai,postgres,sphinx,zxcvbn]"
    ]
    amazon = [
        "boto3>=1.38.0,<2.0"
    ]
    argos = [
        "argostranslate"

The new argos extra is unpinned (argostranslate without a version range), while most other optional deps are constrained. Add an upper bound (and ideally a minimum known-good version) to reduce breakage risk from upstream releases. Also note that adding argos to the all extra will pull in very large transitive deps (e.g., torch/spacy/stanza), which can significantly impact install time/CI resource usage (workflows use uv sync --all-extras).

    -    "weblate[alibaba,amazon,argos,gerrit,gelf,google,ldap,mercurial,openai,postgres,sphinx,zxcvbn]"
    +    "weblate[alibaba,amazon,gerrit,gelf,google,ldap,mercurial,openai,postgres,sphinx,zxcvbn]"
     ]
     amazon = [
         "boto3>=1.38.0,<2.0"
     ]
     argos = [
    -    "argostranslate"
    +    "argostranslate>=1.9.0,<2.0.0"

Add Argos Translate Offline Machine Translation Backend (issue: #13106)
Description
This Pull Request introduces a new offline machine translation backend utilizing Argos Translate.
Argos Translate is an open-source offline translation library written in Python. It is based on OpenNMT (Open Neural Machine Translation) and runs inference locally using CTranslate2, so translation does not depend on third-party external APIs and source strings stay on the host.
Implementation Details

Dependency Addition: Added argostranslate as an optional Python dependency under the [all] and [argos] extras in pyproject.toml.
Backend Engine: Created weblate/machinery/argos.py, which implements the ArgosTranslation class extending the base MachineTranslation engine.
The engine leverages argostranslate.translate.get_installed_languages() to dynamically determine which language pairs are locally supported via the is_supported method.
The download_translations method handles querying the local CTranslate2 models synchronously and yielding structured translation suggestions.
Registration: Registered "weblate.machinery.argos.ArgosTranslation" into the WEBLATE_MACHINERY tuple in weblate/machinery/models.py.
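
The lookup-then-translate flow described above can be sketched as follows. This is not the PR's download_translations: the function name suggest, the keys of the yielded dictionary, and the fixed quality value are assumptions for illustration, and the language objects are simple stand-ins exposing the code attribute and get_translation method that the PR relies on.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, Iterator


def suggest(
    installed_languages: Iterable[Any],
    source_language: str,
    target_language: str,
    text: str,
) -> Iterator[Dict[str, Any]]:
    """Yield at most one suggestion for an installed language pair."""
    languages = list(installed_languages)
    src = source_language.split("-")[0].lower()
    tgt = target_language.split("-")[0].lower()
    src_lang = next((lang for lang in languages if lang.code == src), None)
    tgt_lang = next((lang for lang in languages if lang.code == tgt), None)
    if src_lang is None or tgt_lang is None:
        return
    translation = src_lang.get_translation(tgt_lang)
    if translation is None:
        return
    # Dictionary layout is an assumed shape, not confirmed by the PR.
    yield {
        "text": translation.translate(text),
        "quality": 100,
        "service": "argos",
        "source": text,
    }


# Stand-in objects: an en -> es "model" that just upper-cases its input.
fake_translation = SimpleNamespace(translate=lambda s: s.upper())
lang_es = SimpleNamespace(code="es", get_translation=lambda other: None)
lang_en = SimpleNamespace(
    code="en",
    get_translation=lambda other: fake_translation if other.code == "es" else None,
)

supported = list(suggest([lang_en, lang_es], "en-US", "es", "hello"))
unsupported = list(suggest([lang_en, lang_es], "es", "en", "hello"))
```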
Testing methodology
Initialized a component inside a local Weblate instance and explicitly enabled the Argos Translate service within the project's machinery_settings config.
Installed the English-Spanish Argos model via Python bindings.
Passed a mock English source string unit to the API endpoint and received accurate translations on the Weblate UI under the "Automatic suggestions" tab.
No external cloud API keys are required for this engine to function since inference executes entirely on the local device's CPU/GPU.
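
The description does not show how the English-Spanish model was installed. A sketch of one way to do it is below, assuming the package functions documented by Argos Translate (update_package_index, get_available_packages, install_from_path, and each package's from_code, to_code, and download()). The wrapper install_language_pair is a hypothetical helper, and the package module is passed in as a parameter so the example can run offline against a stand-in.

```python
from types import SimpleNamespace
from typing import Any


def install_language_pair(package_api: Any, from_code: str, to_code: str) -> bool:
    """Download and install one model; return False if none is published."""
    package_api.update_package_index()
    for candidate in package_api.get_available_packages():
        if candidate.from_code == from_code and candidate.to_code == to_code:
            package_api.install_from_path(candidate.download())
            return True
    return False


# Offline dry run against a stand-in for the argostranslate.package module.
installed_paths = []
fake_package = SimpleNamespace(
    from_code="en", to_code="es", download=lambda: "en_es.argosmodel"
)
fake_api = SimpleNamespace(
    update_package_index=lambda: None,
    get_available_packages=lambda: [fake_package],
    install_from_path=installed_paths.append,
)

found = install_language_pair(fake_api, "en", "es")
missing = install_language_pair(fake_api, "en", "fr")
```

Against the real library the call would be install_language_pair(argostranslate.package, "en", "es") after importing argostranslate.package. That step needs network access once; translation afterwards runs offline.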