Add FsspecJsonWSIReader class. #897

aacic · 2024-12-09T16:24:35Z

The FsspecJsonWSIReader reads an fsspec JSON file that represents an SVS or TIFF whole slide image. The images are accessible via HTTP range requests, e.g.:

https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27

The whole image can be downloaded with:

curl -C - -o TCGA-22-1017-01Z-00-DX1.9562FE79-A261-42D3-B394-F3E0E2FF7DDA.svs https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27
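
To make the range-request idea above concrete, here is a minimal standard-library sketch that builds (but does not send) a request for a single byte span. The helper names `range_header` and `build_range_request` and the chosen offsets are illustrative assumptions; they are not part of tiatoolbox or the GDC API.

```python
from urllib.request import Request

GDC_URL = "https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27"


def range_header(offset: int, length: int) -> dict[str, str]:
    """Return the Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def build_range_request(url: str, offset: int, length: int) -> Request:
    """Build (without sending) a GET request for one byte span of `url`."""
    return Request(url, headers=range_header(offset, length))


# Example: ask for the first 16 bytes of the slide file.
request = build_range_request(GDC_URL, 0, 16)
print(request.get_header("Range"))  # bytes=0-15
```

A server that honours the header answers with `206 Partial Content` and only the requested bytes, which is what lets a reader fetch individual tiles without downloading the whole slide.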

The FsspecJsonWSIReader class has a _zarr_store field, which is created by reading the JSON file using fsspec:

mapper = fsspec.get_mapper(
    "reference://", fo=str(input_img), target_protocol="file"
)
self._zarr_array = zarr.open(mapper, mode="r")
self._zarr_store = self._zarr_array.store

self._zarr_lru_cache = zarr.LRUStoreCache(self._zarr_store, max_size=cache_size)
self._zarr_group = zarr.open(self._zarr_lru_cache)

This is equivalent to the TIFFWSIReader code:

self._zarr_store = tifffile.imread(
    self.input_path,
    series=self.series_n,
    aszarr=True,
)
self._zarr_lru_cache = zarr.LRUStoreCache(self._zarr_store, max_size=cache_size)
self._zarr_group = zarr.open(self._zarr_lru_cache)

Both FsspecJsonWSIReader and TIFFWSIReader forward calls to the read_bounds and read_rect methods of the TIFFWSIReaderDelegate delegate instance.
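
The forwarding arrangement can be pictured with a toy example. This is only a sketch of the delegation pattern, not the tiatoolbox implementation: `RegionDelegate`, `LocalTiffReader` and `ReferenceJsonReader` are hypothetical stand-ins, and the methods return strings rather than image arrays.

```python
from dataclasses import dataclass

Bounds = tuple[int, int, int, int]  # (left, top, right, bottom)


@dataclass
class RegionDelegate:
    """Hypothetical delegate that owns the shared region-reading logic."""

    source: str

    def read_rect(self, location: tuple[int, int], size: tuple[int, int]) -> str:
        x, y = location
        width, height = size
        return f"{self.source}: rect at ({x}, {y}), size {width}x{height}"

    def read_bounds(self, bounds: Bounds) -> str:
        left, top, right, bottom = bounds
        # Convert bounds to a location plus a size and reuse read_rect.
        return self.read_rect((left, top), (right - left, bottom - top))


class _ForwardingReader:
    """Base for toy readers that forward region reads to a delegate."""

    def __init__(self, delegate: RegionDelegate) -> None:
        self._delegate = delegate

    def read_rect(self, location: tuple[int, int], size: tuple[int, int]) -> str:
        return self._delegate.read_rect(location, size)

    def read_bounds(self, bounds: Bounds) -> str:
        return self._delegate.read_bounds(bounds)


class LocalTiffReader(_ForwardingReader):
    """Toy reader backed by a local file."""

    def __init__(self, path: str) -> None:
        super().__init__(RegionDelegate(source=f"tiff:{path}"))


class ReferenceJsonReader(_ForwardingReader):
    """Toy reader backed by a reference JSON file."""

    def __init__(self, path: str) -> None:
        super().__init__(RegionDelegate(source=f"json:{path}"))


print(ReferenceJsonReader("slide_fsspec.json").read_bounds((0, 0, 256, 128)))
```

Because both readers only differ in how the underlying store is opened, keeping the read logic in one delegate avoids duplicating it across the two classes.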

The _info method of the TIFFWSIReaderDelegate reads the SVS metadata, which is stored in the root group metadata, for example:

{
  ".zattrs": {
    "multiscales": [
      {
        "metadata": {
          "objective_power": 40,
          "vendor": "Aperio",
          "mpp": [0.2525, 0.2525]
        }
      }
    ]
  }
}
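
As a rough illustration of how such a metadata block might be consumed, the sketch below pulls the three fields out of a parsed root group. The function `read_slide_metadata` is a hypothetical helper written for this example; it does not reproduce the actual `_info` implementation.

```python
import json

ROOT_GROUP = """
{
  ".zattrs": {
    "multiscales": [
      {
        "metadata": {
          "objective_power": 40,
          "vendor": "Aperio",
          "mpp": [0.2525, 0.2525]
        }
      }
    ]
  }
}
"""


def read_slide_metadata(root: dict) -> dict:
    """Extract slide metadata from the first multiscales entry of the root group."""
    multiscales = root.get(".zattrs", {}).get("multiscales", [])
    if not multiscales:
        raise ValueError("root group attributes contain no multiscales entry")
    metadata = multiscales[0].get("metadata", {})
    return {
        "objective_power": metadata.get("objective_power"),
        "vendor": metadata.get("vendor"),
        # Microns per pixel as an (x, y) pair, or None when absent.
        "mpp": tuple(metadata["mpp"]) if "mpp" in metadata else None,
    }


info = read_slide_metadata(json.loads(ROOT_GROUP))
print(info)
```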

To test, run the following from the repository root:

pip install -r requirements/requirements_dev.txt
mkdir -p samples/slides
mkdir -p samples/fsspec
cd samples/slides
curl -C - -o TCGA-22-1017-01Z-00-DX1.9562FE79-A261-42D3-B394-F3E0E2FF7DDA.svs https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27
cd ../../
cp tiatoolbox/utils/tiff_to_fsspec.py .
python tiff_to_fsspec.py "samples/slides/TCGA-22-1017-01Z-00-DX1.9562FE79-A261-42D3-B394-F3E0E2FF7DDA.svs" "samples/fsspec/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27_fsspec.json" "https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27"
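
For orientation, an fsspec reference file maps each key either to inline data or to a `[url, offset, length]` triple, which is why the generator needs the remote URL as its third argument. The sketch below shows the shape of such entries only. `build_references`, the key layout and the offsets are made up for illustration and are not taken from `tiff_to_fsspec.py`.

```python
import json


def build_references(
    url: str,
    tile_offsets: list[int],
    tile_byte_counts: list[int],
    level: int = 0,
) -> dict[str, list]:
    """Map illustrative chunk keys to [url, offset, length] triples."""
    if len(tile_offsets) != len(tile_byte_counts):
        raise ValueError("offsets and byte counts must have the same length")
    refs = {}
    for index, (offset, size) in enumerate(zip(tile_offsets, tile_byte_counts)):
        # The key layout is an assumption; real keys follow the array's chunk grid.
        refs[f"{level}/{index}"] = [url, offset, size]
    return refs


refs = build_references(
    "https://example.org/slide.svs",  # placeholder URL
    tile_offsets=[16, 1040],
    tile_byte_counts=[1024, 2048],
)
print(json.dumps(refs, indent=2))
```

Each triple corresponds to one range request against the remote file, so the JSON stays small even when the slide itself is several gigabytes.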

Create tileserver.py in the project root:

from flask_cors import CORS

from tiatoolbox.visualization import TileServer
from tiatoolbox.wsicore.wsireader import FsspecJsonWSIReader

wsi = FsspecJsonWSIReader.open(
    "./samples/fsspec/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27_fsspec.json"
)


# Initialize and run the TileServer
tile_server = TileServer(
    title="Tiatoolbox TileServer",
    layers={"layer": wsi},
)
CORS(tile_server, send_wildcard=True)


tile_server.run(host="127.0.0.1", port=5000)

Open http://127.0.0.1:5000/ and verify that it works.

for more information, see https://pre-commit.ci


requirements/requirements.txt

codecov · 2025-01-03T11:38:58Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.93%. Comparing base (2416ba9) to head (3a78386).
Report is 16 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #897      +/-   ##
===========================================
+ Coverage    99.90%   99.93%   +0.02%     
===========================================
  Files           70       71       +1     
  Lines         8736     8834      +98     
  Branches      1149     1152       +3     
===========================================
+ Hits          8728     8828     +100     
  Misses           3        3              
+ Partials         5        3       -2


shaneahmed · 2025-01-14T16:19:48Z

tiatoolbox/wsicore/wsireader.py

+    Args:
+        path (Path): Path to the file to check.
+
+    # TODO extend logic and verify that json file is a fsspec tiff file


Add a link to tiff-fsspec generator file.

The method has been migrated to:
FsspecJsonWSIReader.is_valid_zarr_fsspec(file_path: str) -> bool

I've extended the docs and added a link to tiff-to-fsspec generator file.
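
One way a check of this kind could look is sketched below. This is an assumption-laden illustration, not the code in `is_valid_zarr_fsspec`: the function name `looks_like_zarr_fsspec` and the choice of marker keys are hypothetical.

```python
import json
from pathlib import Path

# Keys that zarr v2 stores use for group and array metadata.
ZARR_MARKER_KEYS = (".zattrs", ".zgroup", ".zarray")


def looks_like_zarr_fsspec(file_path: str) -> bool:
    """Heuristically check whether a JSON file looks like a zarr reference file."""
    try:
        with Path(file_path).open("r", encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON.
        return False
    if not isinstance(data, dict):
        return False
    # Some reference files nest their keys under "refs"; others are flat.
    refs = data.get("refs", data)
    return isinstance(refs, dict) and any(key in refs for key in ZARR_MARKER_KEYS)
```

A check like this only inspects top-level keys, so it would accept any zarr reference file, not specifically one generated from a TIFF; that is the gap the TODO in the diff refers to.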


shaneahmed · 2025-01-14T16:22:49Z

tiatoolbox/wsicore/wsireader.py

@@ -4225,6 +4246,528 @@ class docstrings for more information.
        return im_region


+class ZarrTIFFWSIReader(WSIReader):
+    """Define Zarr Tiff WSI Reader."""


Add some documentation / introduction about the fsspec / json files.

Added documentation and the link to tiff_to_fsspec.py.


tiatoolbox/utils/tiff_to_fsspec.py

tiatoolbox/wsicore/wsireader.py

tests/test_wsireader.py

Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>


shaneahmed

Thanks @aacic

Removing large files

Removing unnecessary files

Fixing pre-commit

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

Fixing PR issues

:technologist: pre-commit autoupdate (TissueImageAnalytics#910)
* 🧑‍💻 pre-commit autoupdate
  updates:
  - [github.com/executablebooks/mdformat: 0.7.21 → 0.7.22](hukkin/mdformat@0.7.21...0.7.22)
  - [github.com/astral-sh/ruff-pre-commit: v0.8.6 → v0.9.4](astral-sh/ruff-pre-commit@v0.8.6...v0.9.4)
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* 📌 Update `ruff` dependency
* 🔥 TIAToolbox does not support Python > 3.12 yet
  - There is no need for this check as this will be tested while upgrading to Python 3.13
* ♻️ Refactor `typing` to `type_hints`.
* 🐛 Fix `mypy` workflow
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>

📝 Update Documentation Structure (TissueImageAnalytics#909)
- Use `Python 3.12` for docs build
- Update `copyright` year to `2025`
- Landing page now shows text from README
- Update documentation structure
- Update `readthedocs` Build
- Remove `usage.rst`
- Rename Jupyter Notebooks to Usage Examples
- Show README for Usage Examples instead of TOC
- Reduce TOC depth for basic functionalities and pipelines
- Improve `README` quality.

🐛 Fix in `test_arch_mapde` and `test_arch_sccnn` (TissueImageAnalytics#911)
- If cuda is available model should be moved to cuda otherwise tests will fail as test data is moved to cuda.

[skip ci] 📝 Improve Documentation (TissueImageAnalytics#913)
- Update CONTRIBUTING.rst
- Bug fix in conf.py to fix notebook links
- Update `examples/README.md`
- Update `docs/installation.rst`
- Update `docs/visualization.rst`
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: adamshephard <39619155+adamshephard@users.noreply.github.com>

:bug: Fix `MapDe` `dist_filter` Shape (TissueImageAnalytics#914)
- Fix `dist_filter` in `MapDe` model for multi-class output.
Explanation: Previously, if we set `num_class` to more than 1, the model would still output 1 channel. This was because the `dist_filter` always had size of 1 in its first dimension, however the first dimension determines the number of output channels in the tensor produced by `torch.functional.F.conv2d`. This PR changes this by repeating the filters to match the number of output classes.

:technologist: pre-commit autoupdate (TissueImageAnalytics#916)
* 🧑‍💻 pre-commit autoupdate
  updates:
  - [github.com/astral-sh/ruff-pre-commit: v0.9.4 → v0.9.9](astral-sh/ruff-pre-commit@v0.9.4...v0.9.9)
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* 🔨 Update `ruff` version
* 🔨 Update noqa for Unused static method argument
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>

Add FsspecJsonWSIReader class. (TissueImageAnalytics#897)
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>

✨ Support for Additional Foundation Models (TissueImageAnalytics#906)
- Add support for additional foundation models as feature extractors using the TimmBackbone.
- Added models include: UNI2, Virchow, Virchow2, kaiko and H-optimus-1.
- Add more information to docstrings.
- Allow foundation models with additional parameters.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>

Fixing PR issues

- Unnecessary logging was introduced in #897

aacic changed the title ~~Add ZarrTIFFWSIReader class.~~ WIP: Add ZarrTIFFWSIReader class. Dec 9, 2024

aacic force-pushed the zarr-tiff-wsi-reader branch from 3805565 to cbd657f Compare December 12, 2024 13:59

Add ZarrTIFFWSIReader class.

1bc2356

aacic force-pushed the zarr-tiff-wsi-reader branch from bfa8b4b to 1bc2356 Compare December 16, 2024 16:51

[pre-commit.ci] auto fixes from pre-commit.com hooks

adb3574

for more information, see https://pre-commit.ci

Merge branch 'develop' into zarr-tiff-wsi-reader

1ce17b2

shaneahmed reviewed Jan 3, 2025

requirements/requirements.txt (resolved)

Merge branch 'develop' into zarr-tiff-wsi-reader

dc7c77c

shaneahmed reviewed Jan 14, 2025

tiatoolbox/wsicore/wsireader.py (outdated, resolved)

shaneahmed reviewed Jan 14, 2025


aacic mentioned this pull request Jan 23, 2025

GDC WSI support. stjude/proteinpaint#2644

Merged

3 tasks

Merge branch 'develop' into zarr-tiff-wsi-reader

9276647

shaneahmed added the enhancement New feature or request label Jan 24, 2025

shaneahmed added this to the Release v1.7.0 milestone Jan 24, 2025

aacic and others added 13 commits January 31, 2025 17:50

Rename ZarrTIFFWSIReader to FsspecJsonReader

0cf8c32

Rename ZarrTIFFWSIReader to FsspecJsonReader

e750f2a

Rename ZarrTIFFWSIReader to FsspecJsonWSIReader.

0209100

Rename ZarrTIFFWSIReader to FsspecJsonWSIReader.

7b6a7b1

Migrate tiff_fsspec.py.

2034262

Migrate is_valid_zarr_fsspec.

91d5911

Rename ZarrTIFFWSIReader to FsspecJsonWSIReader.

224de85

Migrate is_valid_zarr_fsspec.

1bf88bc

Fix loggin issue.

e3d6129

Add Jpeg2k codec.

ad8ed13

WIP: Add DelegateWSIReader.

ad559aa

Merge branch 'develop' into zarr-tiff-wsi-reader

afdf912

[pre-commit.ci] auto fixes from pre-commit.com hooks

ebd41da

for more information, see https://pre-commit.ci

aacic and others added 6 commits February 10, 2025 17:04

Add more tests.

6d5a372

Add more tests.

3bb9257

Merge branch 'develop' into zarr-tiff-wsi-reader

fd85e9a

Clean up tests.

6008f4c

Update docs.

4f3c1ad

Update docs.

e902766

aacic requested a review from shaneahmed February 11, 2025 17:35

aacic marked this pull request as ready for review February 11, 2025 17:35

aacic changed the title ~~WIP: Add FsspecJsonWSIReader class.~~ Add FsspecJsonWSIReader class. Feb 12, 2025