Commit f008cc6

merge upstream/main and add graceful fallback for lazy reading
Merge 26 upstream commits including the xarray-schema removal, validate() classmethod refactor, and from __future__ import annotations. Also adds a graceful fallback in _read_table(): if read_lazy() fails (e.g. due to missing encoding metadata on obs/var columns written by third-party tools), warn and fall back to eager reading instead of crashing.
2 parents cd2dd56 + 094b869 commit f008cc6
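The graceful fallback described in the commit message can be sketched as follows. This is an illustrative sketch only: `read_table_with_fallback`, `read_lazy`, and `read_eager` are hypothetical stand-ins, not the actual `_read_table()` code from this commit.

```python
import warnings
from typing import Any, Callable


def read_table_with_fallback(
    read_lazy: Callable[[], Any],
    read_eager: Callable[[], Any],
) -> Any:
    """Prefer lazy reading; on failure, warn and fall back to eager reading."""
    try:
        # Lazy reading can fail when obs/var columns lack the encoding
        # metadata that third-party writers sometimes omit.
        return read_lazy()
    except Exception as err:
        warnings.warn(
            f"Lazy reading failed ({err!r}); falling back to eager reading.",
            UserWarning,
            stacklevel=2,
        )
        return read_eager()
```

For example, a lazy reader that raises on missing metadata triggers the eager path and emits a `UserWarning` instead of crashing.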

101 files changed

Lines changed: 1636 additions & 444 deletions

(Large commit: only a subset of the 101 changed files is shown below.)

.claude/settings.local.json

Lines changed: 14 additions & 0 deletions
```diff
@@ -0,0 +1,14 @@
+{
+    "permissions": {
+        "allow": [
+            "Bash(gh pr view 1055 --repo scverse/spatialdata --json title,state,url,headRefName,body)",
+            "Bash(python -c \":*)",
+            "Bash(python -m pytest tests/io/test_readwrite.py -x -v --tb=short)",
+            "Bash(python -m pytest tests/io/test_readwrite.py -x -v --tb=short -k \"lazy\")",
+            "Bash(python -m pytest tests/io/test_readwrite.py::TestReadWrite::test_io_and_lazy_loading_raster -v --tb=long -k \"sdata_container_format1\")",
+            "Bash(python -m pytest tests/io/test_readwrite.py -v --tb=short -k \"TestLazyTableLoading\")",
+            "Bash(python -m pytest tests/core/query/ -v --tb=short)",
+            "Bash(python -W all -c \":*)"
+        ]
+    }
+}
```

.github/workflows/release.yaml

Lines changed: 2 additions & 2 deletions
```diff
@@ -9,9 +9,9 @@ jobs:
     runs-on: ubuntu-latest
     if: startsWith(github.ref, 'refs/tags/v')
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v6
       - name: Set up Python 3.12
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v6
         with:
           python-version: "3.12"
           cache: pip
```

.github/workflows/test.yaml

Lines changed: 11 additions & 13 deletions
```diff
@@ -13,13 +13,13 @@ jobs:
     runs-on: ${{ matrix.os }}
     defaults:
       run:
-        shell: bash -e {0}
+        shell: bash # bash also on windows

     strategy:
       fail-fast: false
       matrix:
         include:
-          - {os: windows-latest, python: "3.11", dask-version: "2025.2.0", name: "Dask 2025.2.0"}
+          - {os: windows-latest, python: "3.11", dask-version: "2025.12.0", name: "Dask 2025.12.0"}
           - {os: windows-latest, python: "3.13", dask-version: "latest", name: "Dask latest"}
           - {os: ubuntu-latest, python: "3.11", dask-version: "latest", name: "Dask latest"}
           - {os: ubuntu-latest, python: "3.13", dask-version: "latest", name: "Dask latest"}
@@ -32,29 +32,27 @@ jobs:
       PRERELEASE: ${{ matrix.prerelease }}

     steps:
-      - uses: actions/checkout@v2
-      - uses: astral-sh/setup-uv@v5
+      - uses: actions/checkout@v6
+      - uses: astral-sh/setup-uv@v7
        id: setup-uv
        with:
          version: "latest"
          python-version: ${{ matrix.python }}
      - name: Install dependencies
        run: |
          if [[ "${PRERELEASE}" == "allow" ]]; then
-            uv sync --extra test
-            : # uv sync --extra test --prerelease ${PRERELEASE}
-            uv pip install git+https://github.com/scverse/anndata.git
-            uv pip install --prerelease allow pandas
-          else
-            uv sync --extra test
+            sed -i '' 's/requires-python.*//' pyproject.toml # otherwise uv complains that anndata requires python>=3.12 and we only do >=3.11 😱
+            uv add git+https://github.com/scverse/anndata.git
+            uv add pandas>=3.dev0
          fi
          if [[ -n "${DASK_VERSION}" ]]; then
            if [[ "${DASK_VERSION}" == "latest" ]]; then
-              uv pip install --upgrade dask
+              uv add dask
            else
-              uv pip install dask==${DASK_VERSION}
+              uv add dask==${DASK_VERSION}
            fi
          fi
+          uv sync --group=test
      - name: Test
        env:
          MPLBACKEND: agg
@@ -63,7 +61,7 @@ jobs:
        run: |
          uv run pytest --cov --color=yes --cov-report=xml
      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v4
+        uses: codecov/codecov-action@v5
        with:
          name: coverage
          verbose: true
```

.gitignore

Lines changed: 1 addition & 0 deletions
```diff
@@ -20,6 +20,7 @@ __pycache__/
 docs/_build
 !docs/api/.md
 docs/**/generated
+docs/_static/datasets_data.js

 # IDEs
 /.idea/
```

.pre-commit-config.yaml

Lines changed: 2 additions & 2 deletions
```diff
@@ -9,7 +9,7 @@ ci:
   skip: []
 repos:
   - repo: https://github.com/rbubley/mirrors-prettier
-    rev: v3.7.4
+    rev: v3.8.1
     hooks:
       - id: prettier
         exclude: ^.github/workflows/test.yaml
@@ -20,7 +20,7 @@ repos:
         additional_dependencies: [numpy, types-requests]
         exclude: tests/|docs/
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.14.10
+    rev: v0.15.2
     hooks:
       - id: ruff
         args: [--fix, --exit-non-zero-on-fix]
```

.readthedocs.yaml

Lines changed: 15 additions & 12 deletions
```diff
@@ -1,19 +1,22 @@
 # https://docs.readthedocs.io/en/stable/config-file/v2.html
 version: 2
 build:
-  os: ubuntu-20.04
+  os: ubuntu-24.04
   tools:
-    python: "3.11"
-sphinx:
-  configuration: docs/conf.py
-  fail_on_warning: true
-python:
-  install:
-    - method: pip
-      path: .
-      extra_requirements:
-        - docs
-        - torch
+    python: "3.13"
+  jobs:
+    post_checkout:
+      # unshallow so version can be derived from tag
+      - git fetch --unshallow || true
+    create_environment:
+      - asdf plugin add uv
+      - asdf install uv latest
+      - asdf global uv latest
+    build:
+      html:
+        - uv sync --group=docs --extra=torch
+        - uv run make --directory=docs html
+        - mv docs/_build $READTHEDOCS_OUTPUT
 submodules:
   include:
     - "docs/tutorials/notebooks"
```

benchmarks/README.md

Lines changed: 49 additions & 11 deletions
````diff
@@ -14,25 +14,63 @@ pip install -e '.[docs,test,benchmark]'

 ## Usage

-Running all the benchmarks is usually not needed. You run the benchmark using `asv run`. See the [asv documentation](https://asv.readthedocs.io/en/stable/commands.html#asv-run) for interesting arguments, like selecting the benchmarks you're interested in by providing a regex pattern `-b` or `--bench` that links to a function or class method e.g. the option `-b timeraw_import_inspect` selects the function `timeraw_import_inspect` in `benchmarks/spatialdata_benchmark.py`. You can run the benchmark in your current environment with `--python=same`. Some example benchmarks:
+Running all the benchmarks is usually not needed. You run the benchmark using `asv run`. See the [asv documentation](https://asv.readthedocs.io/en/stable/commands.html#asv-run) for interesting arguments, like selecting the benchmarks you're interested in by providing a regex pattern `-b` or `--bench` that links to a function or class method. You can run the benchmark in your current environment with `--python=same`. Some example benchmarks:

-Importing the SpatialData library can take around 4 seconds:
+### Import time benchmarks
+
+Import benchmarks live in `benchmarks/benchmark_imports.py`. Each `timeraw_*` function returns a Python code snippet that asv runs in a fresh interpreter (cold import, empty module cache):
+
+Run all import benchmarks in your current environment:

 ```
-PYTHONWARNINGS="ignore" asv run --python=same --show-stderr -b timeraw_import_inspect
-Couldn't load asv.plugins._mamba_helpers because
-No module named 'conda'
-· Discovering benchmarks
-· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
-[ 0.00%] ·· Benchmarking existing-py_opt_homebrew_Caskroom_mambaforge_base_envs_spatialdata2_bin_python3.12
-[50.00%] ··· Running (spatialdata_benchmark.timeraw_import_inspect--).
-[100.00%] ··· spatialdata_benchmark.timeraw_import_inspect 3.65±0.2s
+asv run --python=same --show-stderr -b timeraw
+```
+
+Or a single one:
+
+```
+asv run --python=same --show-stderr -b timeraw_import_spatialdata
+```
+
+### Comparing the current branch against `main`
+
+The simplest way is `asv continuous`, which builds both commits, runs the benchmarks, and prints the comparison in one shot:
+
+```bash
+asv continuous --show-stderr -v -b timeraw main faster-import
 ```

+Replace `faster-import` with any branch name or commit hash. The `-v` flag prints per-sample timings; drop it for a shorter summary.
+
+Alternatively, collect results separately and compare afterwards:
+
+```bash
+# 1. Collect results for the tip of main and the tip of your branch
+asv run --show-stderr -b timeraw main
+asv run --show-stderr -b timeraw HEAD
+
+# 2. Print a side-by-side comparison
+asv compare main HEAD
+```
+
+Both approaches build isolated environments from scratch. If you prefer to skip the rebuild and reuse your current environment (faster, less accurate):
+
+```bash
+asv run --python=same --show-stderr -b timeraw HEAD
+
+git stash && git checkout main
+asv run --python=same --show-stderr -b timeraw HEAD
+git checkout - && git stash pop
+
+asv compare main HEAD
+```
+
+### Querying benchmarks
+
 Querying using a bounding box without a spatial index is highly impacted by large amounts of points (transcripts), more than table rows (cells).

 ```
-$ PYTHONWARNINGS="ignore" asv run --python=same --show-stderr -b time_query_bounding_box
+$ asv run --python=same --show-stderr -b time_query_bounding_box

 [100.00%] ··· ======== ============ ============= ============= ==============
 -- filter_table / n_transcripts_per_cell
````

benchmarks/benchmark_imports.py

Lines changed: 56 additions & 0 deletions
```diff
@@ -0,0 +1,56 @@
+"""Benchmarks for import times of the spatialdata package and its submodules.
+
+Each ``timeraw_*`` function returns a snippet of Python code that asv runs in
+a fresh interpreter, so the measured time reflects a cold import with an empty
+module cache.
+"""
+
+from collections.abc import Callable
+from typing import Any
+
+
+def _timeraw(func: Any) -> Any:
+    """Set asv benchmark attributes for a cold-import timeraw function."""
+    func.repeat = 5  # number of independent subprocess measurements
+    func.number = 1  # must be 1: second import in same process hits module cache
+    return func
+
+
+@_timeraw
+def timeraw_import_spatialdata() -> str:
+    """Time a bare ``import spatialdata``."""
+    return """
+import spatialdata
+"""
+
+
+@_timeraw
+def timeraw_import_SpatialData() -> str:
+    """Time importing the top-level ``SpatialData`` class."""
+    return """
+from spatialdata import SpatialData
+"""
+
+
+@_timeraw
+def timeraw_import_read_zarr() -> str:
+    """Time importing ``read_zarr`` from the top-level namespace."""
+    return """
+from spatialdata import read_zarr
+"""
+
+
+@_timeraw
+def timeraw_import_models_elements() -> str:
+    """Time importing the main element model classes."""
+    return """
+from spatialdata.models import Image2DModel, Labels2DModel, PointsModel, ShapesModel, TableModel
+"""
+
+
+@_timeraw
+def timeraw_import_transformations() -> str:
+    """Time importing the ``spatialdata.transformations`` submodule."""
+    return """
+from spatialdata.transformations import Affine, Scale, Translation, Sequence
+"""
```
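The `number = 1` requirement noted in `_timeraw` comes from Python's module cache: only the first import in a process executes the module body, and any later `import` of the same name is a `sys.modules` dictionary lookup, so repeating the import inside one process would measure essentially nothing. A stdlib-only illustration of this effect (not part of the commit; `json` stands in for any module):

```python
import sys
import timeit

# Drop json from the module cache so the first timed import actually
# executes the module body; its submodules may stay cached, so this is
# only an approximation of a truly cold import.
sys.modules.pop("json", None)
cold = timeit.timeit("import json", number=1)
# The second import finds the module in sys.modules: just a dict lookup.
warm = timeit.timeit("import json", number=1)

import json
assert sys.modules["json"] is json  # later imports reuse this cached object
print(f"cold import: {cold:.6f}s  warm import: {warm:.6f}s")
```

This is why asv's timeraw benchmarks spawn a fresh interpreter per sample instead of looping within one process.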

benchmarks/spatialdata_benchmark.py

Lines changed: 0 additions & 7 deletions
```diff
@@ -20,13 +20,6 @@ def peakmem_list2(self):
         return sdata


-def timeraw_import_inspect():
-    """Time the import of the spatialdata module."""
-    return """
-import spatialdata
-"""
-
-
 class TimeMapRaster:
     """Time the."""

```

docs/conf.py

Lines changed: 5 additions & 0 deletions
```diff
@@ -4,6 +4,8 @@
 # list see the documentation:
 # https://www.sphinx-doc.org/en/master/usage/configuration.html

+from __future__ import annotations
+
 # -- Path setup --------------------------------------------------------------
 import sys
 from datetime import datetime
@@ -12,6 +14,7 @@

 HERE = Path(__file__).parent
 sys.path.insert(0, str(HERE / "extensions"))
+sys.path.insert(0, str(HERE / "tutorials" / "notebooks" / "extensions"))


 # -- Project information -----------------------------------------------------
@@ -57,6 +60,7 @@
     "IPython.sphinxext.ipython_console_highlighting",
     "sphinx_design",
     *[p.stem for p in (HERE / "extensions").glob("*.py")],
+    *[p.stem for p in (HERE / "tutorials" / "notebooks" / "extensions").glob("*.py")],
 ]

 autodoc_default_options = {
@@ -123,6 +127,7 @@
     "tutorials/notebooks/notebooks/developers_resources/storage_format/Readme.md",
     "tutorials/notebooks/notebooks/examples/technology_stereoseq.ipynb",
     "tutorials/notebooks/notebooks/examples/technology_curio.ipynb",
+    "tutorials/notebooks/notebooks/examples/technology_cosmx.ipynb",
     "tutorials/notebooks/notebooks/examples/stereoseq_data/*",
 ]
 # Ignore warnings.
```
