Merge branch 'christinab12-patch-1' of https://github.com/HelmholtzAI…
Christina Bukas committed Feb 26, 2024
2 parents d63da03 + 45c510a commit 4c65dd1
Showing 4 changed files with 9 additions and 140 deletions.
6 changes: 4 additions & 2 deletions .readthedocs.yaml
@@ -21,8 +21,10 @@ sphinx:

 # Optionally declare the Python requirements required to build your docs
 python:
-  # Install our python package before building the docs
+  # Install both Python packages before building the docs
   install:
     - method: pip
-      path: .
+      path: src/client
+    - method: pip
+      path: src/server
     - requirements: docs/requirements.txt
2 changes: 1 addition & 1 deletion README.md
@@ -16,7 +16,7 @@ To run the client GUI follow the instructions described in [DCP Client Installat
 DCP handles all kinds of **segmentation tasks**! Try it out if you need to do:
 * **Instance** segmentation
 * **Semantic** segmentation
-* **Panoptic** segmentation
+* **Multi-class instance** segmentation
 
 ### Toy data
 This repo includes the ```data/``` directory with some toy data which you can use as the *Uncurated dataset* folder. You can create (empty) folders for the other two directories required in the welcome window and start playing around.
81 changes: 2 additions & 79 deletions src/client/README.md
@@ -9,82 +9,5 @@ Before starting, make sure you have navigated to ```data-centric-platform/src/cl
```
pip install -e .
```

### Running the client: A step-by-step guide
1. **Configurations**

Before launching the GUI you will need to set up your client configuration file, _dcp_client/config.cfg_. The file must follow the [formal JSON format](https://www.json.org/json-en.html). Here, we define how the client will interact with the server. There are currently two options available: running the server locally, or connecting to the running instance on the FZJ jusuf-cloud. To connect to a locally running server, set:
```
"server":{
"user": "local",
"host": "local",
"data-path": "None",
"ip": "localhost",
"port": 7010
}
```
To connect to the running service on jusuf-cloud, set:
```
"server":{
"user": "rocky",
"host": "jsc-vm",
"data-path": "/home/rocky/dcp-data/my-project",
"ip": "134.94.198.230",
"port": 7010
}
```
Before continuing, make sure the DCP server is running, either locally or on the cloud. See [DCP Server Installation & Launch](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/main/src/server/README.md#using-pypi) for instructions on how to launch the server. **Note:** For the cloud connection to succeed, you will need to have contacted the team developing DCP, so they can add your IP to the list of accepted requests.

To make this easier, we provide two config files: one for running against a local server and one for the remote service. Just make sure you rename the config file you wish to use to ```config.cfg```. The default is the local configuration.


2. **Launching the client**

After setting your config simply run:
```
python dcp_client/main.py
```

3. **Welcome window**

The welcome window should now pop up.

<img src="https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/main/src/client/readme_figs/client_welcome_window.png" width="400" height="200">

Here you will need to select the directories used throughout the data-centric workflow. The following directories need to be defined:

* **Uncurated dataset path:** This folder is intended to store all images of your dataset. These images may be accompanied by corresponding segmentations. If present, segmentation files should share the same filename as their associated image, appended with a suffix as specified in 'setup/seg_name_string', defined in ```server/dcp_server/config.cfg``` (default: '_seg').
* **Curation in progress path (optional):** Images whose segmentation is a work in progress should be moved here. Each image in this folder can have one or multiple segmentations corresponding to it (by changing the filename of the segmentation in the napari layer list after editing it, see **Viewer**). If you do not want to use an intermediate working directory, you can skip setting this path (it is not required). No other functions affect this directory; it is only used to move images to and from the uncurated and curated directories.
* **Curated dataset path:** This folder is intended to contain images along with their final segmentations. **Only** move images here when the segmentation is complete and finalised; you won't be able to change them after they have been moved here. These images are then used for training your model.
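The suffix-based pairing of images and masks described above can be sketched as follows. This is a hypothetical helper for illustration only, not part of the DCP codebase, and it assumes the default ```'_seg'``` value of ```seg_name_string```:

```python
from pathlib import Path

def find_segmentations(image_path: str, seg_suffix: str = "_seg") -> list:
    """List all mask files belonging to an image: files in the same folder
    whose name starts with '<image stem><seg_suffix>' (extra info such as
    curator initials may follow the suffix)."""
    image = Path(image_path)
    # e.g. for '1.tiff' this matches '1_seg.tiff' and '1_seg_CB.tiff'
    return sorted(p.name for p in image.parent.glob(f"{image.stem}{seg_suffix}*"))
```

For an image ```1.tiff```, both ```1_seg.tiff``` and ```1_seg_CB.tiff``` would be found, while files without the suffix are ignored.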

4. **Setting paths**

After setting the paths for these three folders, you can click the **Start** button. If you have set the server configuration to the cloud, you will receive a message notifying you that your data will be uploaded to the cloud. Click **Ok** to continue.

5. **Data Overview**

The main working window will appear next. This gives you an overview of the directories selected in the previous step along with three options:

* **Generate Labels:** Click this button to generate labels for all images in the "Uncurated dataset" directory. This will call the ```segment_image``` service from the server.
* **View image and fix label:** Click this button to launch the viewer. The napari software is used for visualising and editing the image segmentations. See **Viewer**.
* **Train Model:** Click this button to train your model on the images in the "Curated dataset" directory. This will call the ```train``` service from the server.
![Alt Text](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/main/src/client/readme_figs/client_data_overview_window.png)

6. **The viewer**

In DCP, we use [napari](https://napari.org/stable) for viewing our images and masks and for adding, editing or removing labels. An example of the viewer can be seen below. After adding or removing objects and editing existing objects wherever necessary, there are two options available:
- Click **Move to Curation in progress folder** if you are not 100% certain about the labels you have created. You can also click on the label in the labels layer list and change its name. This will result in several label files being created in the *In progress folder*, which can be examined later on. **Note:** When changing a layer name in napari, append your initials or any other new info after _seg. E.g., if the labels of 1_seg.tiff have been changed in the napari viewer, an appropriate name would be 1_seg_CB.tiff, not 1_CB_seg.tiff.
- Click **Move to Curated dataset folder** if you are certain that the labels you are viewing are final and require no more curation. These images and labels will later be used for training the machine learning model, so select this option only if you are certain about the labels. If several labels are displayed (opened from the 'Curation in progress' step), make sure to **click** on the single label in the labels layer list you wish to move to the *Curated data folder*. The remaining label files are then automatically deleted.

![Alt Text](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/main/src/client/readme_figs/client_napari_viewer.png)
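The renaming convention above can be expressed as a small check. This is a hypothetical illustration of the rule, not part of the DCP codebase:

```python
def is_valid_rename(new_name: str, image_stem: str, seg_suffix: str = "_seg") -> bool:
    """A renamed label file must keep '<image stem><seg_suffix>' as a prefix;
    any new info (e.g. curator initials) is appended after the suffix."""
    stem = new_name.rsplit(".", 1)[0]  # drop the file extension
    return stem.startswith(image_stem + seg_suffix)
```

For the image ```1.tiff```, ```is_valid_rename("1_seg_CB.tiff", "1")``` holds, while ```is_valid_rename("1_CB_seg.tiff", "1")``` does not.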

### Data centric workflow [intended usage summary]
The intended usage of DCP includes the following steps:
1. Set up the configuration, run the client (with the server already running) and select the data directories
2. Generate labels for the data in the *Uncurated data folder*
3. Visualise the resulting labels with the viewer and correct them wherever necessary - once done, move the image to the *Curated data folder*. Repeat this step until a few images are placed into the *Curated data folder*. Depending on the qualitative evaluation of the label generation you might want to include fewer or more images, i.e. if the resulting masks require few edits, then a few images will most likely be sufficient, whereas if many edits are required it is likely that more images are needed in the *Curated data folder*. You can always start with a small number and adjust later
4. Train the model with the images in the *Curated data folder*
5. Repeat steps 2-4 until you are satisfied with the masks generated for the remaining images in the *Uncurated data folder*. Every time the model is trained in step 4, the masks generated in step 2 should be of higher quality, until the model need not be trained any more
<img src="https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/main/src/client/readme_figs/dcp_pipeline.png" width="200" height="200">


## Want to know more?
Visit our [documentation](https://readthedocs.org/projects/data-centric-platform) for more information.
60 changes: 2 additions & 58 deletions src/server/README.md
@@ -21,61 +21,5 @@ python dcp_server/main.py
```
Once the server is running, you can verify it is working by visiting http://localhost:7010/ in your web browser.

## Customization (for developers)

All service configurations are set in the _config.cfg_ file, which must follow the [formal JSON format](https://www.json.org/json-en.html).

The config file must contain five main parts. All the ```marked``` arguments are mandatory:

- ``` setup ```
- ```segmentation ``` - segmentation type from the segmentationclasses.py. Currently, only **GeneralSegmentation** is available (MitoProjectSegmentation and GFPProjectSegmentation are stale).
- ```accepted_types``` - types of images currently accepted for the analysis
- ```seg_name_string``` - suffix string for mask files (all segmentations of an image should contain this string - it is used to save and search for segmentations of the images)
- ```service```
- ```model_to_use``` - name of the model class from the models.py you want to use. Currently, available models are:
- **CustomCellposeModel**: Inherits [CellposeModel](https://cellpose.readthedocs.io/en/latest/api.html#cellposemodel) class
- **CellposePatchCNN**: Includes a segmentor and a classifier. Currently the segmentor can only be ```CustomCellposeModel```, and the classifier ```CellClassifierFCNN```. The model runs the segmentor first, then the classifier on patches of the resulting objects to classify them.
- ```save_model_path``` - name for the trained model which will be saved after calling the (re)train from service - is saved under ```bentoml/models```
- ```runner_name``` - name of the runner for the bentoml service
- ```service_name``` - name for the bentoml service
- ```port``` - on which port to start the service
- ```model``` - configuration for the model instantiation. Here, pass any arguments you need or want to change. Take care that the argument names match those of the original model class's _init()_ function!
- ```segmentor```: model configuration for the segmentor. Currently takes the arguments used in the init of CellposeModel, see [here](https://cellpose.readthedocs.io/en/latest/api.html#cellposemodel).
- ```classifier```: model configuration for classifier, see _init()_ of ```CellClassifierFCNN```
- ```train``` - configuration for the model training. Take care that the argument names match those of the original model's _train()_ function!
- ```segmentor```: If using Cellpose, the _train()_ function arguments can be found [here](https://cellpose.readthedocs.io/en/latest/api.html#id7). Here, pass any arguments you need or want to change, or leave it empty ({}) to use the default arguments.
- ```classifier```: train configuration for the classifier, see _train()_ of ```CellClassifierFCNN```
- ```eval``` - configuration for the model evaluation. Take care that the argument names match those of the original model's _eval()_ function!
- ```segmentor```: If using Cellpose, the _eval()_ function arguments can be found [here](https://cellpose.readthedocs.io/en/latest/api.html#id3). Here, pass any arguments you need or want to change, or leave it empty ({}) to use the default arguments.
- ```classifier```: eval configuration for the classifier, see _eval()_ of ```CellClassifierFCNN```.
- ```mask_channel_axis```: If a multi-class instance segmentation model has been used, the masks returned by the model have two channels: one for the instance segmentation results and one indicating each object's class. This variable indicates at which dimension the channel axis is stored. Currently it should be kept at 0, as this is the only way the masks can be visualised correctly by napari in the client.
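Putting the parts above together, a minimal _config.cfg_ might look like the sketch below. All values are placeholders for illustration (in particular the ```accepted_types```, path and name strings are assumed, not taken from the actual file), and empty ```{}``` sections mean the default arguments are used; the shipped config files remain the authoritative reference:
```
{
    "setup": {
        "segmentation": "GeneralSegmentation",
        "accepted_types": [".jpg", ".png", ".tiff"],
        "seg_name_string": "_seg"
    },
    "service": {
        "model_to_use": "CellposePatchCNN",
        "save_model_path": "mymodel",
        "runner_name": "cellpose_runner",
        "service_name": "dcp_service",
        "port": 7010
    },
    "model": {
        "segmentor": {},
        "classifier": {}
    },
    "train": {
        "segmentor": {},
        "classifier": {}
    },
    "eval": {
        "segmentor": {},
        "classifier": {},
        "mask_channel_axis": 0
    }
}
```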

To make this easier, we provide two config files: ```config.cfg``` is set up for a panoptic segmentation task, while ```config_instance.cfg``` is set up for instance segmentation. Make sure to rename the config you wish to use to ```config.cfg```. The default is panoptic segmentation.

## Models
The following models are currently integrated into DCP:
* Cellpose --> for instance segmentation tasks
* CellposePatchCNN --> for panoptic segmentation tasks: includes the Cellpose model for instance segmentation followed by a patch-wise CNN model on the predicted instances to obtain class labels

## Running with Docker [DO NOT USE UNTIL ISSUE IS SOLVED]

### Docker --> Currently doesn't work for generate labels?

#### Docker-Compose
```
docker compose up
```
#### Docker Non-Interactively
```
docker build -t dcp-server .
docker run -p 7010:7010 -it dcp-server
```

#### Docker Interactively
```
docker build -t dcp-server .
docker run -it dcp-server bash
bentoml serve service:svc --reload --port=7010
```


## Want to know more?
Visit our [documentation](https://readthedocs.org/projects/data-centric-platform) for more information.
