Comseg cli #76

Merged
merged 31 commits into from
Jul 4, 2024
78 changes: 74 additions & 4 deletions docs/cli.md
@@ -347,12 +347,13 @@ $ sopa patchify [OPTIONS] COMMAND [ARGS]...

**Commands**:

* `baysor`: Prepare the patches for Baysor segmentation
* `baysor`: Prepare patches for transcript-based...
* `comseg`: Prepare patches for transcript-based...
* `image`: Prepare patches for staining-based...

#### `sopa patchify baysor`

Prepare the patches for Baysor segmentation
Prepare patches for transcript-based segmentation with Baysor

**Usage**:

@@ -370,12 +371,37 @@ $ sopa patchify baysor [OPTIONS] SDATA_PATH
* `--patch-overlap-microns FLOAT`: Number of overlapping microns between the patches. We advise to choose approximately twice the diameter of a cell [required]
* `--baysor-temp-dir TEXT`: Temporary directory where baysor inputs and outputs will be saved. By default, uses `.sopa_cache/baysor_boundaries`
* `--config-path TEXT`: Path to the baysor config (you can also directly provide the argument via the `config` option)
* `--config TEXT`: Dictionnary of baysor parameters [default: {}]
* `--cell-key TEXT`: Optional column of the transcripts dataframe that indicates in which cell-id each transcript is, in order to use prior segmentation
* `--config TEXT`: Dictionary of baysor parameters; overrides the `--config-path` argument if provided [default: {}]
* `--cell-key TEXT`: Optional column of the transcripts dataframe that indicates which cell ID each transcript belongs to, in order to use a prior segmentation. Default is 'cell' if cell_key=None
* `--unassigned-value INTEGER`: If --cell-key is provided, this is the value given to transcripts that are not inside any cell (if it's already 0, don't provide this argument)
* `--use-prior / --no-use-prior`: Whether to use cellpose segmentation as a prior for baysor (if True, make sure to first run Cellpose) [default: no-use-prior]
* `--help`: Show this message and exit.

#### `sopa patchify comseg`

Prepare patches for transcript-based segmentation with ComSeg

**Usage**:

```console
$ sopa patchify comseg [OPTIONS] SDATA_PATH
```

**Arguments**:

* `SDATA_PATH`: Path to the SpatialData `.zarr` directory [required]

**Options**:

* `--patch-width-microns FLOAT`: Width (and height) of each patch in microns [required]
* `--patch-overlap-microns FLOAT`: Number of overlapping microns between the patches. We advise to choose approximately twice the diameter of a cell [required]
* `--comseg-temp-dir TEXT`: Temporary directory where ComSeg inputs and outputs will be saved. By default, uses `.sopa_cache/comseg_boundaries`
* `--config-path TEXT`: Path to the ComSeg JSON config file (you can also directly provide the argument via the `config` option)
* `--config TEXT`: Dictionary of ComSeg parameters; overrides the `--config-path` argument if provided [default: {}]
* `--cell-key TEXT`: Optional column of the transcripts dataframe that indicates which cell ID each transcript belongs to, in order to use a prior segmentation. Default is 'cell' if cell_key=None
* `--unassigned-value INTEGER`: If --cell-key is provided, this is the value given to transcripts that are not inside any cell (if it's already 0, don't provide this argument)
* `--help`: Show this message and exit.

#### `sopa patchify image`

Prepare patches for staining-based segmentation (including Cellpose)
@@ -457,6 +483,7 @@ $ sopa resolve [OPTIONS] COMMAND [ARGS]...

* `baysor`: Resolve patches conflicts after baysor...
* `cellpose`: Resolve patches conflicts after cellpose...
* `comseg`: Resolve patches conflicts after comseg...
* `generic`: Resolve patches conflicts after generic...

#### `sopa resolve baysor`
@@ -500,6 +527,28 @@ $ sopa resolve cellpose [OPTIONS] SDATA_PATH
* `--patch-dir TEXT`: Directory containing the cellpose segmentation on patches (or multiple directories if using multi-step segmentation). By default, uses the `.sopa_cache/cellpose_boundaries` directory
* `--help`: Show this message and exit.

#### `sopa resolve comseg`

Resolve patches conflicts after comseg segmentation. Provide either `--comseg-temp-dir` or `--patches-dirs`

**Usage**:

```console
$ sopa resolve comseg [OPTIONS] SDATA_PATH
```

**Arguments**:

* `SDATA_PATH`: Path to the SpatialData `.zarr` directory [required]

**Options**:

* `--gene-column TEXT`: Column of the transcripts dataframe containing the gene names [required]
* `--comseg-temp-dir TEXT`: Path to the directory containing all the comseg patches (see `sopa patchify`). By default, uses the `.sopa_cache/comseg_boundaries` directory
* `--min-area FLOAT`: Cells with an area less than this value (in microns^2) will be filtered [default: 0]
* `--patches-dirs TEXT`: List of patches directories inside `comseg_temp_dir`
* `--help`: Show this message and exit.

#### `sopa resolve generic`

Resolve patches conflicts after generic segmentation
@@ -537,6 +586,7 @@ $ sopa segmentation [OPTIONS] COMMAND [ARGS]...
**Commands**:

* `cellpose`: Perform cellpose segmentation.
* `comseg`: Perform ComSeg segmentation.
* `generic-staining`: Perform generic staining-based segmentation.

#### `sopa segmentation cellpose`
@@ -575,6 +625,26 @@ $ sopa segmentation cellpose [OPTIONS] SDATA_PATH
* `--method-kwargs TEXT`: Kwargs for the cellpose method builder. This should be a dictionary, in inline string format. [default: {}]
* `--help`: Show this message and exit.

#### `sopa segmentation comseg`

Perform ComSeg segmentation. This can be done on all patches directly, or on one individual patch.

**Usage**:

```console
$ sopa segmentation comseg [OPTIONS] SDATA_PATH
```

**Arguments**:

* `SDATA_PATH`: Path to the SpatialData `.zarr` directory [required]

**Options**:

* `--patch-index INTEGER`: Index of the patch on which the segmentation method should be run.
* `--patch-dir TEXT`: Path to the temporary directory of the segmentation method, inside which each individual patch segmentation is stored. By default, saves into the `.sopa_cache/comseg` directory
* `--help`: Show this message and exit.

#### `sopa segmentation generic-staining`

Perform generic staining-based segmentation. This can be done on all patches directly, or on one individual patch.
76 changes: 76 additions & 0 deletions docs/tutorials/cli_other_segmentation.md
@@ -0,0 +1,76 @@

### Option 3: ComSeg


[ComSeg](https://github.com/fish-quant/ComSeg) is a transcript-based segmentation method. It starts from a segmentation prior (here, Cellpose) and refines it using the transcript information.

#### Run Cellpose to segment nuclei

```
sopa patchify image tuto.zarr --patch-width-pixel 1500 --patch-overlap-pixel 50
sopa segmentation cellpose tuto.zarr --channels DAPI --diameter 35 --min-area 2000
sopa resolve cellpose tuto.zarr
```

#### Save a ComSeg config file as config.json
More information on the parameters can be found in the [ComSeg documentation](https://comseg.readthedocs.io/en/latest/userguide/Minimal_example.html).
Below we display a minimal example of a ComSeg config file.


```json
{"dict_scale": {"x": 1, "y": 1, "z": 1},
"mean_cell_diameter": 15,
"max_cell_radius": 50,
"alpha": 0.5,
"min_rna_per_cell": 5,
"gene_column": "genes"}
```
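
One way to create this file from the terminal is a shell heredoc. The snippet below is a minimal sketch: it writes the same parameters as above to `config.json` in the current directory and only checks that the file is non-empty. It does not validate the parameters against ComSeg.

```sh
# Write the minimal ComSeg config shown above to config.json.
# The quoted 'EOF' delimiter keeps the shell from expanding anything inside.
cat > config.json <<'EOF'
{
  "dict_scale": {"x": 1, "y": 1, "z": 1},
  "mean_cell_diameter": 15,
  "max_cell_radius": 50,
  "alpha": 0.5,
  "min_rna_per_cell": 5,
  "gene_column": "genes"
}
EOF

# Basic sanity check: the file exists and is not empty.
[ -s config.json ] && echo "config.json written"
```

In this tutorial, the `gene_column` value (`genes`) is the same name that is later passed to `--gene-column` when resolving the patches.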

#### Run ComSeg with the sopa command line tool

1) Create the ComSeg patches
On the toy dataset, we will generate 4 patches.
```
sopa patchify comseg tuto.zarr --config-path config.json --patch-width-microns 200 --patch-overlap-microns 50
```

2) Run ComSeg on all patches

!!! tip
Running the commands below by hand involves many consecutive commands, so we recommend automating them, for instance with Snakemake or Nextflow. This also helps you parallelize the work, since each task can run as a separate job or in a separate thread. You can also see how we do it in the [Sopa Snakemake pipeline](https://github.com/gustaveroussy/sopa/blob/master/workflow/Snakefile).

To automatically get the number of patches, you can open the `tuto.zarr/.sopa_cache/patches_file_comseg` file. It lists the names of the directories inside `tuto.zarr/.sopa_cache/comseg_boundaries` related to each patch. If you selected an ROI, the excluded patches are not listed in the `patches_file_comseg` file.

=== "Patch 0"
```sh
# Run ComSeg on patch 0 only
sopa segmentation comseg tuto.zarr --patch-index 0
```
=== "Patch 1"
```sh
# Run ComSeg on patch 1 only
sopa segmentation comseg tuto.zarr --patch-index 1
```
=== "Patch 2"
```sh
# Run ComSeg on patch 2 only
sopa segmentation comseg tuto.zarr --patch-index 2
```
=== "Patch 3"
```sh
# Run ComSeg on patch 3 only
sopa segmentation comseg tuto.zarr --patch-index 3
```
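
Rather than typing one command per patch, the per-patch commands can be generated from the `patches_file_comseg` file. The sketch below uses the `--patch-index` option of `sopa segmentation comseg` documented in the CLI reference. It assumes the file holds one patch directory per line and that the last path component is the patch index; check the file on your data before relying on it. `print_comseg_commands` is a hypothetical helper name, not part of sopa.

```sh
# Print one `sopa segmentation comseg` command per patch listed in a patches file.
# Assumption: one patch directory per line, whose last path component is the
# patch index (e.g. ".../comseg_boundaries/3" gives index 3).
print_comseg_commands() {
    sdata_path=$1
    patches_file=$2
    while IFS= read -r patch_dir || [ -n "$patch_dir" ]; do
        # Skip blank lines
        [ -z "$patch_dir" ] && continue
        patch_index=$(basename "$patch_dir")
        echo "sopa segmentation comseg $sdata_path --patch-index $patch_index"
    done < "$patches_file"
}

# Example usage: review the printed commands first, then pipe them to `sh`
# print_comseg_commands tuto.zarr tuto.zarr/.sopa_cache/patches_file_comseg
# print_comseg_commands tuto.zarr tuto.zarr/.sopa_cache/patches_file_comseg | sh
```

Printing the commands instead of running them directly makes it easy to inspect them, or to hand them to a job scheduler.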

3) Merge the results
```sh
sopa resolve comseg tuto.zarr --gene-column genes
```
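
The whole ComSeg workflow of this section can be chained in a single script. The following is a sketch, not an official pipeline: `run` and `comseg_pipeline` are hypothetical helper names, and the script assumes (per the CLI reference) that calling `sopa segmentation comseg` without `--patch-index` processes all patches. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```sh
# Dry-run wrapper: prints the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

comseg_pipeline() {
    sdata=$1

    # Cellpose segmentation, used as the prior
    run sopa patchify image "$sdata" --patch-width-pixel 1500 --patch-overlap-pixel 50
    run sopa segmentation cellpose "$sdata" --channels DAPI --diameter 35 --min-area 2000
    run sopa resolve cellpose "$sdata"

    # ComSeg: create the patches, segment them, then merge the results
    run sopa patchify comseg "$sdata" --config-path config.json --patch-width-microns 200 --patch-overlap-microns 50
    run sopa segmentation comseg "$sdata"
    run sopa resolve comseg "$sdata" --gene-column genes
}

comseg_pipeline tuto.zarr
```

For larger datasets, a workflow manager such as Snakemake or Nextflow remains the better fit, since it can run the patches in parallel.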