Add FsspecJsonWSIReader class. #897

Merged
merged 51 commits into from
Mar 7, 2025

@aacic (Collaborator) commented Dec 9, 2024

The FsspecJsonWSIReader reads an fsspec JSON file that represents an SVS or TIFF whole slide image. The image itself is accessed through HTTP range requests, e.g.:

https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27

The whole image can be downloaded with:

curl -C - -o TCGA-22-1017-01Z-00-DX1.9562FE79-A261-42D3-B394-F3E0E2FF7DDA.svs https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27
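
As a rough illustration of what "accessible by HTTP range requests" means here, the sketch below asks for a small byte window instead of the whole slide. It uses only the Python standard library; the helper names `build_range_request` and `fetch_range` are made up for this example and are not part of tiatoolbox, and the sketch assumes the server answers a `Range` header with `206 Partial Content`.

```python
import urllib.request

GDC_URL = "https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27"


def build_range_request(url: str, offset: int, length: int) -> urllib.request.Request:
    """Build a GET request for `length` bytes starting at byte `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    last_byte = offset + length - 1
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{last_byte}"})


def fetch_range(url: str, offset: int, length: int, timeout: float = 30.0) -> bytes:
    """Fetch a byte window; fail if the server ignores the Range header."""
    request = build_range_request(url, offset, length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # 206 means the server returned only the requested window.
        if response.status != 206:
            raise RuntimeError(f"expected 206 Partial Content, got {response.status}")
        return response.read()
```

Calling `fetch_range(GDC_URL, 0, 16)` would retrieve only the first 16 bytes of the slide, which is the kind of partial read the reader relies on when it pulls individual tiles.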

The FsspecJsonWSIReader class has a _zarr_store field, which is created by reading the JSON file with fsspec:

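# Map the fsspec reference JSON (a local file) onto a key-value store that zarr can read.
mapper = fsspec.get_mapper(
    "reference://", fo=str(input_img), target_protocol="file"
)
self._zarr_array = zarr.open(mapper, mode="r")
self._zarr_store = self._zarr_array.store

# Wrap the store in an LRU cache so repeated chunk reads are served from memory.
self._zarr_lru_cache = zarr.LRUStoreCache(self._zarr_store, max_size=cache_size)
self._zarr_group = zarr.open(self._zarr_lru_cache)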

This is equivalent to the TIFFWSIReader code:

self._zarr_store = tifffile.imread(
    self.input_path,
    series=self.series_n,
    aszarr=True,
)
self._zarr_lru_cache = zarr.LRUStoreCache(self._zarr_store, max_size=cache_size)
self._zarr_group = zarr.open(self._zarr_lru_cache)
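
For readers unfamiliar with the JSON file that the fsspec snippet above consumes, here is a minimal sketch of how a reference entry can be turned into a byte range. It assumes the "version 1" reference layout, in which `refs` maps Zarr keys either to inline strings or to `[url, offset, length]` triples. The example dictionary, its URL, and the helper names are hypothetical, and a file produced by `tiff_to_fsspec.py` may differ in detail (for example by using URL templates).

```python
# Hypothetical, heavily trimmed reference document (assumed "version 1" layout).
EXAMPLE_REFERENCE = {
    "version": 1,
    "refs": {
        ".zgroup": '{"zarr_format": 2}',  # inline metadata stored as a string
        "0/0.0.0": ["https://example.org/slide.svs", 1024, 4096],  # chunk bytes
    },
}


def chunk_byte_range(reference: dict, key: str) -> tuple[str, int, int]:
    """Return (url, offset, length) for a chunk key of the reference document."""
    entry = reference["refs"][key]
    if isinstance(entry, str):
        raise ValueError(f"{key!r} holds inline data, not a byte range")
    if len(entry) != 3:
        raise ValueError(f"{key!r} does not point at a byte range")
    url, offset, length = entry
    return url, int(offset), int(length)


def range_header(offset: int, length: int) -> str:
    """Format the inclusive HTTP Range header value for a chunk."""
    return f"bytes={offset}-{offset + length - 1}"
```

With this layout, reading chunk `0/0.0.0` needs only the bytes named by `range_header(1024, 4096)`, which is why the reader can serve tiles without downloading the whole slide.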

Both FsspecJsonWSIReader and TIFFWSIReader forward read_bounds and read_rect calls to a TIFFWSIReaderDelegate delegate instance.
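
The forwarding arrangement can be pictured with a toy version of the delegate pattern. Everything in this sketch is illustrative: `RegionDelegate`, `LocalReader`, and `RemoteReader` are invented stand-ins, the "image" is a nested list, and the method signatures are simplified compared to the real tiatoolbox readers.

```python
class RegionDelegate:
    """Toy stand-in for a shared delegate that does the actual region reads."""

    def __init__(self, pixels):
        self._pixels = pixels  # list of rows, each row a list of values

    def read_rect(self, location, size):
        """Read a region given its top-left (x, y) and its (width, height)."""
        x, y = location
        width, height = size
        return [row[x:x + width] for row in self._pixels[y:y + height]]

    def read_bounds(self, bounds):
        """Read a region given (left, top, right, bottom) bounds."""
        left, top, right, bottom = bounds
        return self.read_rect((left, top), (right - left, bottom - top))


class LocalReader:
    """Toy reader that opens its data one way and forwards reads."""

    def __init__(self, pixels):
        self._delegate = RegionDelegate(pixels)

    def read_rect(self, location, size):
        return self._delegate.read_rect(location, size)

    def read_bounds(self, bounds):
        return self._delegate.read_bounds(bounds)


class RemoteReader(LocalReader):
    """Toy reader that would open its data differently but reads identically."""
```

The point of the arrangement is that only the way the Zarr group is opened differs between the two readers, while the region-reading logic lives in one place.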

The _info method of the TIFFWSIReaderDelegate reads the SVS metadata, which is stored in the root group metadata, for example:

{
  ".zattrs": {
    "multiscales": [
      {
        "metadata": {
          "objective_power": 40,
          "vendor": "Aperio",
          "mpp": [0.2525, 0.2525]	
        }
      }
    ]
  }
}
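
A minimal sketch of pulling those fields out of the root attributes is shown below. It mirrors the JSON structure above but is not the `_info` implementation; the function name `read_slide_metadata` and the shape of the returned dictionary are assumptions made for illustration.

```python
import json

# Same structure as the example above, kept as a JSON string for illustration.
ROOT_ATTRS_JSON = """
{
  ".zattrs": {
    "multiscales": [
      {"metadata": {"objective_power": 40, "vendor": "Aperio", "mpp": [0.2525, 0.2525]}}
    ]
  }
}
"""


def read_slide_metadata(root_attrs: dict) -> dict:
    """Extract objective power, vendor and microns-per-pixel from root attributes."""
    multiscales = root_attrs.get(".zattrs", {}).get("multiscales", [])
    if not multiscales:
        raise ValueError("no 'multiscales' entry found in root attributes")
    metadata = multiscales[0].get("metadata", {})
    mpp = metadata.get("mpp")
    return {
        "objective_power": metadata.get("objective_power"),
        "vendor": metadata.get("vendor"),
        # Keep mpp as an (x, y) tuple when present, else None.
        "mpp": tuple(mpp) if mpp is not None else None,
    }


info = read_slide_metadata(json.loads(ROOT_ATTRS_JSON))
```

Missing fields come back as `None` rather than raising, which is one possible design choice for slides whose source metadata is incomplete.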

To test, execute from the root dir:

pip install -r requirements/requirements_dev.txt
mkdir -p samples/slides
mkdir -p samples/fsspec
cd samples/slides
curl -C - -o TCGA-22-1017-01Z-00-DX1.9562FE79-A261-42D3-B394-F3E0E2FF7DDA.svs   https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27
cd ../../
cp tiatoolbox/utils/tiff_to_fsspec.py .
python tiff_to_fsspec.py "samples/slides/TCGA-22-1017-01Z-00-DX1.9562FE79-A261-42D3-B394-F3E0E2FF7DDA.svs"  "samples/fsspec/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27_fsspec.json" "https://api.gdc.cancer.gov/data/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27"

Create tileserver.py in the project root:

from flask_cors import CORS

from tiatoolbox.visualization import TileServer
from tiatoolbox.wsicore.wsireader import FsspecJsonWSIReader

wsi = FsspecJsonWSIReader.open(
    "./samples/fsspec/73c69d24-6f9e-44e2-bfe5-a608d4cf5c27_fsspec.json"
)


# Initialize and run the TileServer
tile_server = TileServer(
    title="Tiatoolbox TileServer",
    layers={"layer": wsi},
)
CORS(tile_server, send_wildcard=True)


tile_server.run(host="127.0.0.1", port=5000)

Open http://127.0.0.1:5000/ and verify that it works.

@aacic aacic changed the title from "Add ZarrTIFFWSIReader class." to "WIP: Add ZarrTIFFWSIReader class." Dec 9, 2024
@aacic aacic force-pushed the zarr-tiff-wsi-reader branch from 3805565 to cbd657f Compare December 12, 2024 13:59
@aacic aacic force-pushed the zarr-tiff-wsi-reader branch from bfa8b4b to 1bc2356 Compare December 16, 2024 16:51
codecov bot commented Jan 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.93%. Comparing base (2416ba9) to head (3a78386).
Report is 16 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #897      +/-   ##
===========================================
+ Coverage    99.90%   99.93%   +0.02%     
===========================================
  Files           70       71       +1     
  Lines         8736     8834      +98     
  Branches      1149     1152       +3     
===========================================
+ Hits          8728     8828     +100     
  Misses           3        3              
+ Partials         5        3       -2     

Args:
path (Path): Path to the file to check.

# TODO extend logic and verify that json file is a fsspec tiff file
Member:

Add a link to tiff-fsspec generator file.

Collaborator Author:

The method has been moved to:
FsspecJsonWSIReader.is_valid_zarr_fsspec(file_path: str) -> bool:

I've extended the docs and added a link to the tiff-to-fsspec generator file.
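
To make the TODO above concrete, one way such a check could be written is sketched here. This is not the merged `is_valid_zarr_fsspec` implementation: the function name `looks_like_fsspec_reference` and the specific criteria (a `.json` suffix, a `.zattrs` key, and at least one `.zarray` key, optionally nested under `refs`) are assumptions chosen for illustration.

```python
import json
from pathlib import Path


def looks_like_fsspec_reference(file_path) -> bool:
    """Heuristically decide whether a JSON file is a Zarr-style fsspec reference."""
    path = Path(file_path)
    if path.suffix.lower() != ".json":
        return False
    try:
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, UnicodeDecodeError, json.JSONDecodeError):
        return False
    if not isinstance(data, dict):
        return False
    # Versioned reference files nest the keys under "refs"; others are flat.
    refs = data.get("refs", data)
    if not isinstance(refs, dict):
        return False
    has_attrs = ".zattrs" in refs
    has_array = any(key.endswith(".zarray") for key in refs)
    return has_attrs and has_array
```

A check like this only inspects key names, so it cannot confirm that the referenced bytes really belong to a TIFF file; it is a cheap first filter before attempting to open the reader.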

@@ -4225,6 +4246,528 @@ class docstrings for more information.
return im_region


class ZarrTIFFWSIReader(WSIReader):
"""Define Zarr Tiff WSI Reader."""
Member:

Add some documentation / introduction about the fsspec / json files.

Collaborator Author:

Added documentation and the link to tiff_to_fsspec.py.

@aacic aacic mentioned this pull request Jan 23, 2025
3 tasks
@shaneahmed shaneahmed added the enhancement New feature or request label Jan 24, 2025
@shaneahmed shaneahmed added this to the Release v1.7.0 milestone Jan 24, 2025
@aacic aacic requested a review from shaneahmed February 11, 2025 17:35
@aacic aacic marked this pull request as ready for review February 11, 2025 17:35
@aacic aacic changed the title from "WIP: Add FsspecJsonWSIReader class." to "Add FsspecJsonWSIReader class." Feb 12, 2025
aacic and others added 6 commits February 17, 2025 11:33
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
@shaneahmed (Member) left a comment:

Thanks @aacic

@shaneahmed shaneahmed merged commit 5a90a26 into TissueImageAnalytics:develop Mar 7, 2025
14 checks passed
mbasheer04 added a commit to mbasheer04/tiatoolbox that referenced this pull request Mar 27, 2025
Removing large files

Removing unnecessary files

Fixing pre-commit

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Fixing PR issues

:technologist: pre-commit autoupdate (TissueImageAnalytics#910)

* 🧑‍💻 pre-commit autoupdate

updates:
- [github.com/executablebooks/mdformat: 0.7.21 → 0.7.22](hukkin/mdformat@0.7.21...0.7.22)
- [github.com/astral-sh/ruff-pre-commit: v0.8.6 → v0.9.4](astral-sh/ruff-pre-commit@v0.8.6...v0.9.4)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 📌 Update `ruff` dependency

* 🔥 TIAToolbox does not support Python > 3.12 yet
- There is no need for this check as this will be tested while upgrading to Python 3.13

* ♻️ Refactor `typing` to `type_hints`.

* 🐛 Fix `mypy` workflow

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>

📝 Update Documentation Structure (TissueImageAnalytics#909)

- Use `Python 3.12` for docs build
- Update `copyright` year to `2025`
- Landing page now shows text from README
- Update documentation structure
- Update `readthedocs` Build
- Remove `usage.rst`
- Rename Jupyter Notebooks to Usage Examples
- Show README for Usage Examples instead of TOC
- Reduce TOC depth for basic functionalities and pipelines
- Improve `README` quality.

🐛 Fix in `test_arch_mapde` and `test_arch_sccnn` (TissueImageAnalytics#911)

- If cuda is available model should be moved to cuda otherwise tests will fail as test data is moved to cuda.

[skip ci] 📝 Improve Documentation (TissueImageAnalytics#913)

- Update CONTRIBUTING.rst
- Bug fix in conf.py to fix notebook links
- Update `examples/README.md`
- Update `docs/installation.rst`
- Update `docs/visualization.rst`

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: adamshephard <39619155+adamshephard@users.noreply.github.com>

:bug: Fix `MapDe` `dist_filter` Shape  (TissueImageAnalytics#914)

- Fix `dist_filter` in `MapDe` model for multi-class output.

Explanation:
Previously, if we set `num_class` to more than 1, the model would still output 1 channel. This was because the `dist_filter` always had a size of 1 in its first dimension, and the first dimension determines the number of output channels in the tensor produced by `torch.nn.functional.conv2d`.
This PR changes this by repeating the filters to match the number of output classes.

:technologist: pre-commit autoupdate (TissueImageAnalytics#916)

* 🧑‍💻 pre-commit autoupdate

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.9.4 → v0.9.9](astral-sh/ruff-pre-commit@v0.9.4...v0.9.9)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* 🔨 Update `ruff` version

* 🔨 Update noqa for Unused static method argument

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>

Add FsspecJsonWSIReader class. (TissueImageAnalytics#897)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>

✨ Support for Additional Foundation Models (TissueImageAnalytics#906)

- Add support for additional foundation models as feature extractors using the TimmBackbone.
- Added models include: UNI2, Virchow, Virchow2, kaiko and H-optimus-1.
- Add more information to docstrings.
- Allow foundation models with additional parameters.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>

Fixing PR issues
shaneahmed pushed a commit that referenced this pull request Apr 25, 2025
- Unnecessary logging was introduced in the #897
aacic added a commit to aacic/tiatoolbox that referenced this pull request May 2, 2025