Skip to content

Use the new XetSession API#4116

Open
seanses wants to merge 7 commits intohuggingface:mainfrom
seanses:di/xet-session
Open

Use the new XetSession API#4116
seanses wants to merge 7 commits intohuggingface:mainfrom
seanses:di/xet-session

Conversation

@seanses
Copy link
Copy Markdown
Contributor

@seanses seanses commented Apr 16, 2026

Summary

Migrates all xet upload/download code from the old function-based hf_xet API (upload_files, download_files, global MULTITHREADED_RUNTIME) to the new session-based XetSession API introduced in hf-xet >= 1.5.0.dev0.

Upload:

session = get_xet_session()  # global singleton, fork- and SIGINT-safe

# Strip auth from custom_headers; keep full headers for token refresh
xet_headers = headers.copy()
xet_headers.pop("authorization", None)

with session.new_upload_commit(
    token_refresh_url=refresh_url,    # URL to refresh the xet write token
    token_refresh_headers=headers,    # full headers (incl. auth) for token refresh
    custom_headers=xet_headers,       # headers without auth for xet storage requests
    progress_callback=on_progress,    # optional; receives (group_report, item_reports)
) as commit:
    commit.start_upload_file(path, sha256=sha256_hex)   # from disk
    commit.start_upload_bytes(data, sha256=sha256_hex)  # from memory
# commit() is called automatically on clean exit of the with-block

Download:

with session.new_file_download_group(
    token_refresh_url=xet_file_data.refresh_route,
    token_refresh_headers=headers,
    custom_headers=xet_headers,
    progress_callback=on_progress,
) as group:
    group.start_download_file(XetFileInfo(file_hash, size), dest_path)
# finish() is called automatically on clean exit of the with-block

Architecture:

  • utils/_xet.pyXetSessionHolder is a lightweight wrapper around XetSession with three safety guarantees:
    • SIGINT handling: sigint_abort() cancels the active session and clears the reference so the next get() creates a fresh one (notebook-friendly — subsequent cells still work after Ctrl-C)
    • Fork safety: get() checks os.getpid() against the PID that created the session; on mismatch the inherited session is discarded and a fresh one is created for the child process (prevents deadlock when multiprocessing workers inherit a dead Tokio runtime)
    • Thread safety: a threading.Lock guards all state mutations, important for free-threaded Python (3.14t) where multiple threads can race on get() or sigint_abort() without the GIL serialising them
  • _GLOBAL_XET_HOLDER = XetSessionHolder() is a module-level singleton (analogous to get_session() for HTTP). All upload/download call sites share the same session via get_xet_session() — repo commits, hf_hub_download, bucket uploads/downloads, and snapshot_download all reuse the same underlying Tokio runtime
  • All blocking call sites are wrapped with except KeyboardInterrupt: _GLOBAL_XET_HOLDER.sigint_abort(); raise to abort the in-flight Rust operation and allow clean re-use of xet in the same process
  • xet_connection_info_refresh_url() centralises token-refresh URL construction: handles the bucket repo-type (revision omitted) and the /None revision for PR write-token requests (where the final revision is not yet known)
  • _xet_progress_reporting.py — progress callback receives (group_report, item_reports) matching the new API contract
  • setup.py — minimum hf-xet version bumped to >= 1.5.0.dev3
  • Tests mock hf_xet.XetSession; fork-safety unit and integration tests added to test_xet_utils.py

Note on upload_large_folder and Ctrl-C: xet work there runs inside background worker threads. Python only delivers KeyboardInterrupt to the main thread, and PyErr_CheckSignals() is a no-op in non-main threads — so Ctrl-C responsiveness for upload_large_folder requires a separate fix (shared stop event + worker cooperation) and is out of scope for this PR.

TODO

  • Refactor upload_large_folder to use the XetSession API directly: register each file to the upload commit object immediately after SHA-256 computation (no batching needed — xet handles concurrency internally), then call commit() once at the end. This removes the artificial batch size tuning (UPLOAD_BATCH_SIZE_XET) and makes the xet path in upload_large_folder consistent with the rest of the codebase.

@bot-ci-comment
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 77.14286% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.26%. Comparing base (1daa48b) to head (1965ff5).
⚠️ Report is 313 commits behind head on main.

Files with missing lines Patch % Lines
src/huggingface_hub/utils/_xet.py 73.68% 10 Missing ⚠️
src/huggingface_hub/hf_api.py 63.63% 8 Missing ⚠️
src/huggingface_hub/_commit_api.py 81.25% 3 Missing ⚠️
src/huggingface_hub/file_download.py 78.57% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4116      +/-   ##
==========================================
+ Coverage   75.00%   77.26%   +2.26%     
==========================================
  Files         145      170      +25     
  Lines       13978    19469    +5491     
==========================================
+ Hits        10484    15043    +4559     
- Misses       3494     4426     +932     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@seanses seanses marked this pull request as ready for review April 17, 2026 08:33
Comment thread src/huggingface_hub/hf_api.py Outdated
Comment thread src/huggingface_hub/_commit_api.py
Copy link
Copy Markdown
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on that @seanses! I've left some high level comments on the API design. Overall a very nice addition!

Comment thread tests/test_xet_utils.py Outdated
Comment thread src/huggingface_hub/utils/_xet.py Outdated
Comment thread src/huggingface_hub/utils/_xet.py
Comment thread src/huggingface_hub/_commit_api.py Outdated
Comment thread src/huggingface_hub/_commit_api.py Outdated
Comment thread src/huggingface_hub/_commit_api.py Outdated
Comment thread src/huggingface_hub/_commit_api.py Outdated
Comment thread src/huggingface_hub/file_download.py Outdated
Comment thread src/huggingface_hub/hf_api.py Outdated
Comment thread src/huggingface_hub/utils/_xet_progress_reporting.py
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 19efd09. Configure here.

Comment thread src/huggingface_hub/_commit_api.py
@seanses seanses requested a review from Wauplin May 2, 2026 03:20
Copy link
Copy Markdown
Contributor

@Wauplin Wauplin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very solid now! Extensively tested it locally (mainly the CLI) and works as smoothly as before 👍

Comment thread setup.py

# hf-xet version used in both install_requires and extras["hf_xet"]
HF_XET_VERSION = "hf-xet>=1.4.3,<2.0.0"
HF_XET_VERSION = "hf-xet>=1.5.0.dev3,<2.0.0"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not merge this PR until v1.5.0 is officially released.

Comment on lines +120 to +123
XetSessionHolder,
XetTokenType,
fetch_xet_connection_info_from_repo_info,
get_xet_session,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
XetSessionHolder,
XetTokenType,
fetch_xet_connection_info_from_repo_info,
get_xet_session,
XetTokenType,
fetch_xet_connection_info_from_repo_info,

for now let's not make them public. We can always add them later if a downstream library requires a direct access to the xet session but I doubt.

(you always import them with from .utils._xet import ... in other modules so it shouldn't break anything)

if new_xet_connection is None:
raise XetRefreshTokenError("Failed to refresh xet token")
return new_xet_connection.access_token, new_xet_connection.expiration_unix_epoch
refresh_url = f"{self.endpoint}/api/buckets/{bucket_id}/xet-write-token"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change intentional? (i.e. semi-hardcoding the URL) If not let's replace it with something like

refresh_url = xet_connection_info_refresh_url(
    token_type=XetTokenType.WRITE,
    repo_id=bucket_id,
    repo_type="bucket",
    endpoint=self.endpoint,
)

repo_id (`str`):
A namespace (user or an organization) and a repo name separated by a `/`.
repo_type (`str`):
Type of the repo to upload to: `"model"`, `"dataset"` or `"space"`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Type of the repo to upload to: `"model"`, `"dataset"` or `"space"`.
Type of the repo to upload to.

(handles the new "kernels" and "buckets" as well)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(small refacto seems to work well in my local examples but I have to admit I did not carefully reviewed it - looks fine)

for op in all_bytes_ops:
commit.start_upload_bytes(op.path_or_fileobj, sha256=op.upload_info.sha256.hex())
except KeyboardInterrupt:
_GLOBAL_XET_HOLDER.sigint_abort()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we have get_xet_session, I'm not a fan of having to import the low-level _GLOBAL_XET_HOLDER outside of the _xet module. Could you define a new utility like abort_xet_session which would be an alias for _GLOBAL_XET_HOLDER.sigint_abort()? This way we kinda centralize the logic around _GLOBAL_XET_HOLDER

for xet_info, dest in non_zero_download_items:
group.start_download_file(xet_info, dest)
except KeyboardInterrupt:
_GLOBAL_XET_HOLDER.sigint_abort()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@Wauplin
Copy link
Copy Markdown
Contributor

Wauplin commented May 4, 2026

I've asked Claude code to review the PR and it commented a few things. Nothing critical IMO


3. XetAuthorizationError / XetRefreshTokenError are now dead code

Files: src/huggingface_hub/errors.py:453-457, removed imports in _commit_api.py and hf_api.py

The old code raised XetAuthorizationError on 401 responses and XetRefreshTokenError on failed token refresh. The new session-based API delegates auth to hf_xet (which handles token refresh internally via the URL), so these exceptions are no longer raised anywhere.

They're still defined in errors.py but never imported or used. They're not in __init__.py so they're not public API.

Suggestion: Delete XetAuthorizationError, XetRefreshTokenError, and possibly XetError from errors.py if nothing else uses them, or add a TODO comment. Dead error classes are confusing for future readers.

My opinion (Wauplin): fine to remove them and mention the breaking change in the PR description (it's a breaking change but I don't expect downstream users to have actually raised from it)


5. No unit tests for xet_connection_info_refresh_url

This is a new public-ish function with non-trivial logic (bucket vs git-based repo types, the "/None" revision for PR tokens). There are no dedicated tests for it. A few parametrized test cases would be cheap and valuable:

  • repo_type="model", revision="main".../models/{id}/xet-write-token/main
  • repo_type="model", revision=None.../models/{id}/xet-write-token/None
  • repo_type="bucket", revision=None.../buckets/{id}/xet-write-token (no revision)
  • repo_type="bucket", revision="some-rev".../buckets/{id}/xet-write-token/some-rev

My opinion (Wauplin): quick and easy to test so let's do it (even if not introduced in this PR)


6. progress.close(False) always passes False — is _success meaningful?

Files: _commit_api.py:632, hf_api.py:13193

XetProgressReporter.close(_success) is always called with False (in the finally block), even on successful completion. The _success parameter is currently unused in close() — it just closes bars regardless. If _success is never meaningful, the parameter should be removed. If it's intended for future use, the call site should pass the correct value.

This existed before this PR but worth flagging since the progress reporter was rewritten.

My opinion (Wauplin): let's keep it as-is except if you really care about it^^


8. Header stripping pattern repeated

The xet_headers = headers.copy(); xet_headers.pop("authorization", None) pattern appears in 4 call sites:

  • _commit_api.py:602-603
  • file_download.py:537-538
  • hf_api.py:13151-13152
  • hf_api.py:13397-13398

Consider a small helper like _strip_auth_headers(headers) if you want, but it's fine as-is — it's 2 lines and very clear.

My opinion (Wauplin): why not but low prio

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants