[help wanted] FileNotFoundError: [Errno 2] No such file or directory for --sopa resolve baysor #161

Closed
KunHHE opened this issue Dec 3, 2024 · 26 comments

KunHHE commented Dec 3, 2024

Hi, dear community,
I finished Baysor segmentation and tried to run sopa resolve baysor C:/Users/hekun/Downloads/S3R1.zarr --gene-column genes, but an error showed up: FileNotFoundError: [Errno 2] No such file or directory:
'C:\Users\hekun\Downloads\S3R1.zarr\.sopa_cache\baysor_boundaries\0\segmentation_polygons.json'

I know a similar issue was reported in #152. Yes, I did notice that the JSON output file is named segmentation_polygons_2d, not segmentation_polygons. I guess renaming the files in all the patch folders won't help? Any suggestions?

Thanks very much!

quentinblampey (Collaborator)

Hello @KunHHE,

This is an issue that should be fixed in the next release (sopa==2.0.0), which should be released very soon. Actually, the version is ready, but I need the new version of spatialdata to be released, which is expected this week.

I'll let you know when it's released!

KunHHE closed this as completed Dec 3, 2024
KunHHE (Author) commented Dec 9, 2024

Dear @quentinblampey, may I ask when the new sopa==2.0.0 will be available? Thank you!

KunHHE reopened this Dec 9, 2024
quentinblampey (Collaborator)

Hello @KunHHE, it's still not released; I'm waiting for the new version of SpatialData. It should arrive soon, hopefully this week.

KunHHE (Author) commented Dec 23, 2024

Hi @quentinblampey, is this a CLI-specific issue? If we switch to Snakemake or the API, is there no issue at all? Thank you so much! Happy holidays!

quentinblampey (Collaborator)

Hi @KunHHE, no, it's not specific to the CLI; you'll also get the error when using the API or the pipeline.
Sorry for the delay regarding the release of sopa 2, I hope the new version of SpatialData will be released soon...

quentinblampey (Collaborator)

Sorry for the delay!
I’m happy to announce that sopa==2.0.0 is now released :)
Don’t hesitate to check the new documentation, or the migration guide to smoothly get up to date!

KunHHE (Author) commented Jan 20, 2025

Wonderful @quentinblampey, I will test it right away!
When running Cellpose in CLI mode, I run:

  1. set SOPA_PARALLELIZATION_BACKEND=dask
  2. set SOPA_DASK_CLIENT_N_WORKERS=6
  3. sopa segmentation cellpose C:/Users/hekun/Downloads/Slide1316.zarr --diameter 60 --channels Cellbound2 --channels DAPI --flow-threshold 1.5 --cellprob-threshold -5.5 --pretrained-model C:/Users/hekun/.cellpose/models/CP_20250110_104912DRGv2 --min-area 1000 --clip-limit 0.2 --gaussian-sigma 1

It still says:
"[INFO] (sopa._settings) Using dask backend
[WARNING] (sopa._settings) Each worker has less than 4GB of RAM (2.86GB), which may not be enough. Consider setting sopa.settings.dask_client_kwargs['n_workers'] to use less workers (11 currently)."

It dies:

KilledWorker: Attempted to run task 'write_patch_cells-a14416f5-18b3-4cd0-a3e8-684b9ae65f5a' on 4 different workers, but
all those workers died while running it. The last worker that attempt to run the task was tcp://127.0.0.1:53321.
Inspecting worker logs is often a good next step to diagnose what went wrong. For more information see
https://distributed.dask.org/en/stable/killed.html.

KunHHE (Author) commented Jan 20, 2025

Update: I created a .py file and ran it from the CLI:
set SOPA_PARALLELIZATION_BACKEND=dask
python C:/Users/hekun/segmentation.py

The segmentation.py file is:

import sopa

sopa.settings.dask_client_kwargs['n_workers'] = 4
sopa.settings.dask_client_kwargs['memory_limit'] = '16GB'
sopa.segmentation.cellpose(
    "C:/Users/hekun/Downloads/Slide1316.zarr",
    diameter=60,
    channels=["Cellbound2", "DAPI"],
    flow_threshold=1.5,
    cellprob_threshold=-5.5,
    pretrained_model="C:/Users/hekun/.cellpose/models/CP_20250110_104912DRGv2",
    min_area=1000,
    clip_limit=0.2,
    gaussian_sigma=1,
)

Error:

(sopa) C:\Users\hekun>python C:/Users/hekun/segmentation.py
C:\Users\hekun\miniconda3\envs\sopa\lib\site-packages\dask\dataframe\__init__.py:31: FutureWarning: The legacy Dask DataFrame implementation is deprecated and will be removed in a future version. Set the configuration option dataframe.query-planning to True or None to enable the new Dask Dataframe implementation and silence this warning.
  warnings.warn(
Traceback (most recent call last):
  File "C:\Users\hekun\segmentation.py", line 8, in <module>
    sopa.segmentation.cellpose(
  File "C:\Users\hekun\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\methods\_cellpose.py", line 71, in cellpose
    custom_staining_based(
  File "C:\Users\hekun\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\methods\_custom.py", line 41, in custom_staining_based
    temp_dir = get_cache_dir(sdata) / cache_dir_name
  File "C:\Users\hekun\miniconda3\envs\sopa\lib\site-packages\sopa\utils\utils.py", line 293, in get_cache_dir
    if sdata.is_backed():  # inside the zarr directory
AttributeError: 'str' object has no attribute 'is_backed'

quentinblampey (Collaborator)

Hi @KunHHE,

For the first example, it simply means that you have too many workers for too little memory. You can use a different machine or, as you tried, a different number of workers, as described below. For the second, you are actually trying to use the API, whose usage is described in this tutorial. In particular, the API takes the SpatialData object directly as input, not a path. So the call sopa.segmentation.cellpose("C:/Users/hekun/Downloads/Slide1316.zarr", ...) is wrong and should be sopa.segmentation.cellpose(sdata, ...) (see the tutorial above for more details).

So, in the end, it should look like this:

import sopa

sopa.settings.parallelization_backend = "dask"
sopa.settings.dask_client_kwargs["n_workers"] = 4

sopa.segmentation.cellpose(sdata, ...) # add the other arguments

... # continue using the API
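For instance, a full sketch could look like the following (untested as written; it reuses the arguments from your CLI command above and assumes the zarr store can be read back with spatialdata.read_zarr):

import spatialdata
import sopa

sopa.settings.parallelization_backend = "dask"
sopa.settings.dask_client_kwargs["n_workers"] = 4

# the API works on a SpatialData object, not on a path
sdata = spatialdata.read_zarr("C:/Users/hekun/Downloads/Slide1316.zarr")

sopa.make_image_patches(sdata, patch_width=6000, patch_overlap=150)  # image patches are needed before cellpose

sopa.segmentation.cellpose(
    sdata,
    diameter=60,
    channels=["Cellbound2", "DAPI"],
    flow_threshold=1.5,
    cellprob_threshold=-5.5,
    pretrained_model="C:/Users/hekun/.cellpose/models/CP_20250110_104912DRGv2",
    min_area=1000,
    clip_limit=0.2,
    gaussian_sigma=1,
)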

KunHHE (Author) commented Jan 21, 2025

Hi @quentinblampey, so "sopa.settings.parallelization_backend = "dask"; sopa.settings.dask_client_kwargs["n_workers"] = 4" cannot be done in the CLI, only in the API? I think I want to use the CLI only....

Sorry for the basic question: when you say "API usage", does that mean we can run the sopa API in a Jupyter notebook?

Thanks!

KunHHE (Author) commented Jan 21, 2025

OK, assuming the API can be run in a Jupyter notebook, I tried it, and it looks like it worked:

(screenshots of the notebook run omitted)

quentinblampey (Collaborator)

Yes, using the API means you can use a Jupyter Notebook, among others.

so "sopa.settings.parallelization_backend = "dask"; sopa.settings.dask_client_kwargs["n_workers"] = 4" cannot be done in the CLI, so API only?

It can be done with the CLI, but the command is different. For the CLI, you need to set an environment variable, as you did above. The thing is, this only selects dask; you can't (yet) choose the number of workers via the CLI. I can add this, but, in the meantime, prefer the API, or use a machine with more RAM per worker.

KunHHE (Author) commented Jan 21, 2025

Thanks very much @quentinblampey, and sorry for so many questions. I have to run Baysor with Cellpose as a prior, so I want to double-check: now that I am using the API, do we still need the .toml config for Baysor?

Or do we just use the default settings with:
sopa.make_transcript_patches(sdata, patch_width=1000, prior_shapes_key="cellpose_boundaries")
sopa.segmentation.baysor(sdata, min_area=20)
sopa.aggregate(sdata)

For "resolve", both Cellpose and Baysor running in the sopa, it will always automatically run resolve NOW?

quentinblampey (Collaborator)

You can provide a Baysor config, as detailed in the tutorial, but you don't have to. If you don't provide one, it will be inferred. But if you already have a good config, it's easier to use it.

Yes, you don't need to run "resolve" yourself when using the API.
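For example (just a sketch, reusing the default values that sopa prints when no config is given; adapt the gene column name to your dataset):

import sopa

config = {
    "data": {
        "x": "x",
        "y": "y",
        "gene": "gene",  # adapt to the name of your transcripts' gene column
        "min_molecules_per_gene": 10,
        "min_molecules_per_cell": 20,
        "force_2d": True,
    },
    "segmentation": {"prior_segmentation_confidence": 0.8},
}

sopa.make_transcript_patches(sdata, patch_width=1000, prior_shapes_key="cellpose_boundaries")
sopa.segmentation.baysor(sdata, config=config, min_area=20)  # "resolve" runs automatically at the end
sopa.aggregate(sdata)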

KunHHE (Author) commented Jan 21, 2025

Cool! If I run sopa.segmentation.tissue(sdata) before sopa.make_image_patches(sdata, patch_width=6000, patch_overlap=150) and sopa.segmentation.cellpose, will the patches and the segmentation run only on the "region_of_interest" region?

quentinblampey (Collaborator)

Yes, as described in the tutorial, it will only run inside the segmented tissue.
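For instance (a sketch only, assuming sdata is already loaded and reusing the arguments from your message):

import sopa

sopa.segmentation.tissue(sdata)  # creates the region of interest
sopa.make_image_patches(sdata, patch_width=6000, patch_overlap=150)  # patches are kept only inside the tissue
sopa.segmentation.cellpose(sdata, diameter=60, channels=["Cellbound2", "DAPI"])  # runs only on those patches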

KunHHE (Author) commented Jan 21, 2025

Thanks so much @quentinblampey. I am now stuck on running Baysor: I could run it before using the CLI, and now I am using the API.

sopa.segmentation.baysor(sdata, min_area=20) gives the error: "FileNotFoundError: Please install baysor and ensure that either C:\Users\hekun\.julia\bin\baysor executes baysor, or baysor is an existing shell alias for baysor's executable."
I opened the "Environment Variables" window and added C:\Users\hekun\.julia\bin to the PATH. Then I opened PowerShell and baysor was recognized: "PS C:\Users\hekun> baysor" → "baysor v0.7.1".

I restarted the notebook and the environment, but sopa still cannot find the Baysor executable.
Could you please provide some guidance?

Thanks!

KunHHE (Author) commented Jan 21, 2025

Update: I figured it out using:
import os
os.environ["PATH"] += os.pathsep + r"C:\Users\hekun\.julia\bin"

But I had issues:

  1. Some patches with <4000 transcripts were excluded from the segmentation, so I ran sopa.make_transcript_patches(
    sdata,
    patch_width=1000,
    patch_overlap=20,
    prior_shapes_key="cellpose_boundaries",
    min_points_per_patch=0
    ) to force it to run on all patches. Is this correct?

(screenshot omitted)

  2. Then I ran sopa.segmentation.baysor(sdata, min_area=0) and got:

AssertionError: Could not find the segmentation polygons file in C:\Users\hekun\Downloads\Slide1307.zarr\.sopa_cache\transcript_patches\0

I can see the patches in the .zarr\.sopa_cache\transcript_patches folder, but each patch folder only contains config.toml and transcripts.csv; I did not see a segmentation polygons file...

[INFO] (sopa.segmentation.methods._baysor) The Baysor config was not provided, using the following by default:
{'data': {'x': 'x', 'y': 'y', 'gene': 'gene', 'min_molecules_per_gene': 10, 'min_molecules_per_cell': 20, 'force_2d': True}, 'segmentation': {'prior_segmentation_confidence': 0.8}}
[WARNING] (sopa._settings) Running without parallelization backend can be slow. Consider using a backend, e.g. via sopa.settings.parallelization_backend = 'dask', or export SOPA_PARALLELIZATION_BACKEND=dask.

0%| | 0/9 [00:00<?, ?it/s]
100%|████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:00<00:00, 65.55it/s]

Reading transcript-segmentation outputs: 0%| | 0/9 [00:00<?, ?it/s]

AssertionError Traceback (most recent call last)
Cell In[34], line 1
----> 1 sopa.segmentation.baysor(sdata, min_area=0)

File ~\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\methods\_baysor.py:79, in baysor(sdata, config, min_area, delete_cache, recover, force, scale, key_added, patch_index)
76 assert patches_dirs, "Baysor failed on all patches"
78 gene_column = _get_gene_column_argument(config)
---> 79 resolve(sdata, patches_dirs, gene_column, min_area=min_area, key_added=key_added)
81 sdata.attrs[SopaAttrs.BOUNDARIES] = key_added
83 if delete_cache:

File ~\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\_transcripts.py:43, in resolve(sdata, patches_dirs, gene_column, min_area, key_added)
40 if min_area > 0:
41 log.info(f"Cells whose area is less than {min_area} microns^2 will be removed")
---> 43 patches_cells, adatas = _read_all_segmented_patches(patches_dirs, min_area)
44 geo_df, cells_indices, new_ids = _resolve_patches(patches_cells, adatas)
46 points_key = sdata[SopaKeys.TRANSCRIPTS_PATCHES][SopaKeys.POINTS_KEY].iloc[0]

File ~\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\_transcripts.py:142, in _read_all_segmented_patches(patches_dirs, min_area)
138 def _read_all_segmented_patches(
139 patches_dirs: list[str],
140 min_area: float = 0,
141 ) -> tuple[list[list[Polygon]], list[AnnData]]:
--> 142 outs = [
143 _read_one_segmented_patch(path, min_area)
144 for path in tqdm(patches_dirs, desc="Reading transcript-segmentation outputs")
145 ]
147 patches_cells, adatas = zip(*outs)
149 return patches_cells, adatas

File ~\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\_transcripts.py:143, in <listcomp>(.0)
138 def _read_all_segmented_patches(
139 patches_dirs: list[str],
140 min_area: float = 0,
141 ) -> tuple[list[list[Polygon]], list[AnnData]]:
142 outs = [
--> 143 _read_one_segmented_patch(path, min_area)
144 for path in tqdm(patches_dirs, desc="Reading transcript-segmentation outputs")
145 ]
147 patches_cells, adatas = zip(*outs)
149 return patches_cells, adatas

File ~\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\_transcripts.py:93, in _read_one_segmented_patch(directory, min_area, min_vertices)
89 def _read_one_segmented_patch(
90 directory: str, min_area: float = 0, min_vertices: int = 4
91 ) -> tuple[list[Polygon], AnnData]:
92 directory: Path = Path(directory)
---> 93 id_as_string, polygon_file = _find_polygon_file(directory)
95 loom_file = directory / "segmentation_counts.loom"
96 if loom_file.exists():

File ~\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\_transcripts.py:134, in _find_polygon_file(directory)
132 return False, old_baysor_path
133 new_baysor_path = directory / "segmentation_polygons_2d.json"
--> 134 assert new_baysor_path.exists(), f"Could not find the segmentation polygons file in {directory}"
135 return True, new_baysor_path

AssertionError: Could not find the segmentation polygons file in C:\Users\hekun\Downloads\Slide1307.zarr\.sopa_cache\transcript_patches\0

quentinblampey (Collaborator) commented Jan 22, 2025

You can use min_points_per_patch=0, which will make Baysor run even on patches with a low number of transcripts.
But the way to force Baysor to run is sopa.segmentation.baysor(sdata, min_area=0, force=True) (see the force argument).
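Put together (a sketch only, reusing the arguments from your messages above):

sopa.make_transcript_patches(
    sdata,
    patch_width=1000,
    patch_overlap=20,
    prior_shapes_key="cellpose_boundaries",
    min_points_per_patch=0,  # do not skip patches with few transcripts
)
sopa.segmentation.baysor(sdata, min_area=0, force=True)  # force=True tolerates patches on which Baysor failed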

Please, next time, can you try to send a way for me to reproduce your issue, e.g. using the toy dataset?
Also, the documentation should already answer most of your questions!

KunHHE (Author) commented Jan 22, 2025

Hi @quentinblampey, thanks for the suggestions. I did use the toy dataset to test Baysor, and it reproduces the issue: there is no segmentation polygons file in .sopa_cache\transcript_patches. I am sharing the notebook for your reference via a OneDrive link, hope it works!

https://esbc22-my.sharepoint.com/:u:/g/personal/kun_he_omapix_com/EaRLYnQhzoVJlEATvb_xI5MBM5bdwE8QFMEAvmbYTXabxQ?e=AsP4xr

quentinblampey (Collaborator)

The notebook ran successfully for me.
Are you sure you installed Baysor correctly? Can you try to run it on the patch directory without sopa (i.e., run Baysor itself directly)?

KunHHE (Author) commented Jan 22, 2025

Yeah, I re-installed the Baysor dependencies using the CLI:
installed Julia and GCC;
juliaup add 1.10;
juliaup default 1.10;
julia -e "using Pkg; Pkg.add(PackageSpec(url=\"https://github.com/kharchenkolab/Baysor.git\")); Pkg.build()"

Then in the CLI I tested: baysor segfree -c C:/Users/hekun/sopa/workflow/config/merscope/merscope.toml C:/Users/hekun/Downloads/Slide1307/detected_transcripts.csv, and it looks like Baysor itself is working.
But the interesting and weird thing is that when I ran (sopa) C:\Users\hekun>baysor -v, I got julia version 1.10.7..... This is not related to sopa, but I wanted to give an update.

(screenshot omitted)

quentinblampey (Collaborator)

Thanks for trying this! Do you have the Baysor results somewhere?
Another question: when running Baysor with sopa, does it look like something is actually running, or does the error appear almost immediately?

KunHHE (Author) commented Jan 23, 2025

Hi @quentinblampey. My whole process is under Windows.

I ran baysor run -c C:/Users/hekun/sopa/workflow/config/merscope/merscope.toml C:/Users/hekun/Downloads/Slide1307.zarr/.sopa_cache/transcript_patches/0/transcripts.csv -o C:/Users/hekun/Downloads/Slide1307

I got outputs: https://esbc22-my.sharepoint.com/:u:/g/personal/kun_he_omapix_com/EUkDYiuT_LRPtcoAjI50XVUBU8M87fef_-sc7EnNimW4oA?e=ZgV4iy

I did the sopa[baysor] install, then installed Julia 1.10 and opened julia;
Then:
using Pkg
Pkg.add(PackageSpec(url="https://github.com/kharchenkolab/Baysor.git"))
Pkg.build()
Then I went to the sopa env using the miniconda3 prompt:
pip install julia
and in Python:
import julia
julia.install()

I did not see any errors during the installation process.

Then I went back to the sopa notebook with the toy data and ran Baysor; it runs the first stage but fails on the patches call:

sopa.segmentation.baysor(
    sdata,
    config=None,
    min_area=0,
    delete_cache=True,
    recover=False,
    force=True,
    scale=None,
    key_added='BAYSOR_BOUNDARIES',
    patch_index=None,
)

Detailed output:
[INFO] (sopa.segmentation.methods._baysor) The Baysor config was not provided, using the following by default:
{'data': {'x': 'x', 'y': 'y', 'gene': 'genes', 'min_molecules_per_gene': 10, 'min_molecules_per_cell': 20, 'force_2d': True}, 'segmentation': {'prior_segmentation_confidence': 0.8}}

100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 50.98it/s]

AssertionError Traceback (most recent call last)
Cell In[25], line 1
----> 1 sopa.segmentation.baysor(sdata,
2 config=None,
3 min_area=0,
4 delete_cache=True,
5 recover=False,
6 force=True,
7 scale=None,
8 key_added='BAYSOR_BOUNDARIES',
9 patch_index=None)

File ~\miniconda3\envs\sopa\lib\site-packages\sopa\segmentation\methods\_baysor.py:76, in baysor(sdata, config, min_area, delete_cache, recover, force, scale, key_added, patch_index)
74 if force:
75 patches_dirs = [patch_dir for patch_dir in patches_dirs if (patch_dir / "segmentation_counts.loom").exists()]
---> 76 assert patches_dirs, "Baysor failed on all patches"
78 gene_column = _get_gene_column_argument(config)
79 resolve(sdata, patches_dirs, gene_column, min_area=min_area, key_added=key_added)

AssertionError: Baysor failed on all patches

KunHHE (Author) commented Jan 23, 2025

Update: I ran Baysor manually, patch by patch, in the CLI successfully using:
cd C:/Users/hekun/Downloads/Slide1307.zarr/.sopa_cache/transcript_patches/8
then: C:/Users/hekun/.julia/bin/baysor run --polygon-format FeatureCollection -c C:/Users/hekun/sopa/workflow/config/merscope/merscope.toml transcripts.csv

In the API, after running:
sopa.make_transcript_patches(
    sdata,
    patch_width=500,
    patch_overlap=20,
    points_key=None,
    prior_shapes_key="cellpose_boundaries",
    unassigned_value=None,
    min_points_per_patch=0,
    write_cells_centroids=False,
    key_added=None,
)
it generates transcript_patches in the .sopa_cache folder.
Then we run:
sopa.segmentation.baysor(
    sdata,
    config=None,
    min_area=0,
    delete_cache=True,
    recover=False,
    force=True,
    key_added='baysor_boundaries',
    patch_index=None,
)
and it looks like the program is looking for the segmentation polygons file (segmentation_polygons_2d?), but isn't that file supposed to be one of the outputs Baysor writes for each patch?
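For reference, a small ad-hoc snippet (not part of sopa) to list each patch folder and check whether the Baysor outputs referenced in the tracebacks above were written:

from pathlib import Path

cache = Path(r"C:\Users\hekun\Downloads\Slide1307.zarr\.sopa_cache\transcript_patches")
for patch_dir in sorted(cache.iterdir()):
    if not patch_dir.is_dir():
        continue
    files = {f.name for f in patch_dir.iterdir()}
    # files that sopa expects Baysor to produce for each patch
    ok = {"segmentation_polygons_2d.json", "segmentation_counts.loom"} <= files
    print(patch_dir.name, sorted(files), "OK" if ok else "Baysor outputs missing")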

KunHHE (Author) commented Jan 25, 2025

Hi @quentinblampey, I tested on Ubuntu and it worked. It looks like there is some issue when running on Windows?
