Merge pull request #40 from HelmholtzAI-Consultants-Munich/documentation
Documentation
christinab12 authored Nov 30, 2023
2 parents 617a317 + 0130ba0 commit a9e0426
Showing 8 changed files with 151 additions and 41 deletions.
28 changes: 28 additions & 0 deletions LICENSE
@@ -0,0 +1,28 @@
BSD 3-Clause License

Copyright (c) 2023, Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30 changes: 4 additions & 26 deletions README.md
@@ -1,7 +1,5 @@
# Data Centric Platform
*A data centric platform for multi-class segmentation in microscopy imaging*

![stability-wip](https://img.shields.io/badge/stability-work_in_progress-lightgrey.svg)
![tests](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/actions/workflows/test.yml/badge.svg?event=push)
@@ -11,29 +9,9 @@

## How to use this?

This repo includes a client and server side for using our data centric platform. The client and server communicate via the [bentoml](https://www.bentoml.com/?gclid=Cj0KCQiApKagBhC1ARIsAFc7Mc6iqOLi2OcLtqMbGx1KrFjtLUEZ-bhnqlT2zWREE0x7JImhtNmKlFEaAvSSEALw_wcB) library. The client interacts with the server every time we run model inference or training. For full functionality of the software, the server should be running, either locally or remotely. To install and start the server side, follow the instructions described in [DCP Server Installation & Launch](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/main/src/server/README.md#using-pypi).

To run the client GUI follow the instructions described in [DCP Client Installation & Launch](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/main/src/client/README.md).


This repo includes the ```data/``` directory with some toy data which you can use as the *Uncurated dataset* folder. You can create (empty) folders for the other two directories required in the welcome window and start playing around.
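The two extra folders can simply be created up front from a terminal; a minimal sketch, where the folder names are only examples and any empty directories will do:

```shell
# Use the bundled data/ directory as the uncurated set, and create two
# empty folders for the in-progress and curated sets.
mkdir -p toy_workflow/in_progress toy_workflow/curated
ls toy_workflow
```

Point the welcome window at these paths when it asks for the three directories.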
86 changes: 78 additions & 8 deletions src/client/README.md
@@ -1,17 +1,87 @@
# DCP Client
The client of our data centric platform for microscopy imaging.

![stability-wip](https://img.shields.io/badge/stability-work_in_progress-lightgrey.svg)

## How to use?
### Installation
This installation has been tested in a conda environment with Python 3.9 on a local macOS machine. In your dedicated environment run:
```
pip install -e .
```

### Running the client: a step-by-step guide
1. **Configurations**

Before launching the GUI you will need to set up your client configuration file, _dcp_client/config.cfg_, which must follow the [formal JSON format](https://www.json.org/json-en.html). Here we define how the client will interact with the server. There are currently two options: running the server locally, or connecting to the running instance on the FZJ jusuf-cloud. To connect to a locally running server, set:
```
"server":{
"user": "local",
"host": "local",
"data-path": "None",
"ip": "localhost",
"port": 7010
}
```
To connect to the running service on jusuf-cloud, set:
```
"server":{
"user": "xxxxx",
"host": "xxxxxx",
"data-path": "xxxxx",
"ip": "xxx.xx.xx.xx",
"port": xxxx
}
```
Before continuing, you need to make sure that the DCP server is running, either locally or on the cloud. See [DCP Server Installation & Launch](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/main/src/server/README.md#using-pypi) for instructions on how to launch the server. **Note:** For the cloud connection to succeed, you will need to have contacted the team developing DCP, so they can add your IP to the list of accepted requests.
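As a quick sanity check before launching, the file can be parsed with Python's standard json module (it is plain JSON despite the .cfg extension); a minimal sketch using the local settings shown above:

```python
import json

# The same local-server settings as above, inlined as a JSON string;
# in practice you would read dcp_client/config.cfg instead.
local_cfg = """
{
  "server": {
    "user": "local",
    "host": "local",
    "data-path": "None",
    "ip": "localhost",
    "port": 7010
  }
}
"""

cfg = json.loads(local_cfg)
server = cfg["server"]
print(f"Client will contact the server at {server['ip']}:{server['port']}")
# → Client will contact the server at localhost:7010
```

If `json.loads` raises a `JSONDecodeError` (e.g. a trailing comma or a missing brace), fix the file before starting the client.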


2. **Launching the client**

After setting your config simply run:
```
python dcp_client/main.py
```

3. **Welcome window**

The welcome window should have now popped up.

<img src="https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/documentation/src/client/readme_figs/client_welcome_window.png" width="400" height="200">

Here you will need to select the directories which we will be using throughout the data centric workflow. The following directories need to be defined:

* **Uncurated dataset path:** This folder should initially contain all images of your dataset. They may or may not be accompanied by corresponding segmentations; if they are, each segmentation should have the same filename as its image followed by the ending defined in ```setup/seg_name_string``` of ```server/dcp_server/config.cfg``` (the default suffix is _seg)
* **Curation in progress path (optional):** Images whose segmentation is a work in progress should be moved here. Each image in this folder can have one or multiple segmentations corresponding to it (created by changing the filename of the segmentation in the napari layer list after editing it, see **Viewer**). If you do not want to use an intermediate working directory, you can skip setting this path (it is not required). No other functions affect this directory; it is only used for moving images to and from the uncurated and curated directories
* **Curated dataset path:** This folder should contain images along with their final segmentations. **Only** move images here when their segmentation is complete and finalised; you won't be able to change them after they have been moved here. These images and segmentations are then used for training your model
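The filename convention for pairing images with their segmentations can be illustrated with a short sketch; `find_mask` is a simplified stand-in for the matching logic, assuming the default `_seg` suffix and an exact stem match (the filenames below are made up for illustration):

```python
from pathlib import Path

SEG_SUFFIX = "_seg"  # default value of setup/seg_name_string

def find_mask(image_name, all_files):
    """Return the segmentation file matching image_name, or None.

    Simplified convention: mask name = image stem + SEG_SUFFIX + extension.
    """
    p = Path(image_name)
    candidate = p.stem + SEG_SUFFIX + p.suffix
    return candidate if candidate in all_files else None

files = ["cells_01.tiff", "cells_01_seg.tiff", "cells_02.tiff"]
print(find_mask("cells_01.tiff", files))  # → cells_01_seg.tiff
print(find_mask("cells_02.tiff", files))  # → None
```

Images without a matching mask (like cells_02.tiff above) are exactly the ones **Generate Labels** will produce segmentations for.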

4. **Setting paths**

After setting the paths for these three folders, you can click the **Start** button. If you have set the server configuration to the cloud, you will receive a message notifying you that your data will be uploaded to the cloud. Click **Ok** to continue.

5. **Data Overview**

The main working window will appear next. This gives you an overview of the directories selected in the previous step along with three options:

* **Generate Labels:** Click this button to generate labels for all images in the "Uncurated dataset" directory. This will call the ```segment_image``` service from the server.
* **View image and fix label:** Click this button to launch the viewer. The napari software is used for visualising and editing the images' segmentations. See **Viewer**.
* **Train Model:** Click this button to train your model on the images in the "Curated dataset" directory. This will call the ```train``` service from the server.
![Data overview window](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/documentation/src/client/readme_figs/client_data_overview_window.png)

6. **The viewer**

In DCP, we use [napari](https://napari.org/stable) for viewing our images and masks, and for adding, editing or removing labels. An example of the viewer can be seen below. After adding or removing any objects and editing existing objects wherever necessary, there are two options available:
- Click **Move to Curation in progress folder** if you are not 100% certain about the labels you have created. You can also click on the label in the labels layer and change its name. This will result in several label files being created in the *In progress folder*, which can be examined later on.
- Click **Move to Curated dataset folder** if you are certain that the labels you are now viewing are final and require no more curation. These images and labels will later be used for training the machine learning model, so select this option only if you are certain about the labels. If several labels are displayed (opened from the 'Curation in progress' step), make sure to **click** on the single label in the labels layer list that you wish to move to the *Curated data folder*. The other label files will then be automatically deleted from this folder.

![Napari viewer](https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/documentation/src/client/readme_figs/client_napari_viewer.png)

### Data centric workflow [intended usage summary]
The intended usage of DCP would include the following:
1. Set up the configuration, run the client (with the server already running) and select the data directories
2. Generate labels for the data in the *Uncurated data folder*
3. Visualise the resulting labels with the viewer and correct them wherever necessary; once done, move the image to the *Curated data folder*. Repeat this step until a few images are placed into the *Curated data folder*. Depending on the qualitative evaluation of the label generation you might want to include fewer or more images, i.e. if the resulting masks require few edits, then a few images will most likely be sufficient, whereas if many edits are required it is likely that more images are needed in the *Curated data folder*. You can always start with a small number and adjust later
4. Train the model with the images in the *Curated data folder*
5. Repeat steps 2-4 until you are satisfied with the masks generated for the remaining images in the *Uncurated data folder*. Every time the model is trained in step 4, the masks generated in step 2 should be of higher quality, until the model need not be trained any more
<img src="https://github.com/HelmholtzAI-Consultants-Munich/data-centric-platform/blob/documentation/src/client/readme_figs/dcp_pipeline.png" width="200" height="200">
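The loop in steps 2-4 can be sketched as a toy simulation; `data_centric_loop` and its quality numbers are purely illustrative stand-ins for the real client actions, not DCP code:

```python
def data_centric_loop(uncurated, curated, quality=50, target=90, batch_size=2):
    """Toy simulation of steps 2-4: curate a small batch, retrain, repeat."""
    rounds = 0
    while quality < target and uncurated:
        # Steps 2-3: generate labels, curate a few images, move them over.
        batch = uncurated[:batch_size]
        del uncurated[:batch_size]
        curated.extend(batch)
        # Step 4: retraining on more curated data improves mask quality
        # (here modelled as a fixed gain per curated image).
        quality += 10 * len(batch)
        rounds += 1
    return rounds, quality

rounds, quality = data_centric_loop(list(range(10)), [])
print(rounds, quality)  # → 2 90
```

The point of the loop is the stopping condition: you curate only as many images as it takes for generated masks to reach acceptable quality, rather than labelling the whole dataset by hand.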


Binary file added src/client/readme_figs/client_napari_viewer.png
Binary file added src/client/readme_figs/client_welcome_window.png
Binary file added src/client/readme_figs/dcp_pipeline.png
48 changes: 41 additions & 7 deletions src/server/README.md
@@ -1,13 +1,14 @@
# DCP Server

The server of our data centric platform for microscopy imaging.

![stability-wip](https://img.shields.io/badge/stability-work_in_progress-lightgrey.svg)

The client and server communicate via the [bentoml](https://www.bentoml.com/?gclid=Cj0KCQiApKagBhC1ARIsAFc7Mc6iqOLi2OcLtqMbGx1KrFjtLUEZ-bhnqlT2zWREE0x7JImhtNmKlFEaAvSSEALw_wcB) library. The client interacts with the server every time we run model inference or training, so the server should be running before starting the client.

## How to use?

### Installation
In your dedicated environment run:
```
pip install -e .
```

@@ -20,13 +21,44 @@

```
python dcp_server/main.py
```
Once the server is running, you can verify it is working by visiting http://localhost:7010/ in your web browser.

## Customization (for developers)

All service configurations are set in the _config.cfg_ file, which must follow the [formal JSON format](https://www.json.org/json-en.html).

The config file must contain five main parts. All of the arguments listed below are mandatory:

- ```setup```
  - ```segmentation``` - segmentation type from segmentationclasses.py. Currently, only **GeneralSegmentation** is available (MitoProjectSegmentation and GFPProjectSegmentation are stale).
  - ```accepted_types``` - types of images currently accepted for analysis
  - ```seg_name_string``` - end string for masks to run on (all segmentations of an image should contain this string; it is used to save and search for segmentations of the images)
- ```service```
  - ```model_to_use``` - name of the model class from models.py that you want to use. Currently available models are:
    - **CustomCellposeModel**: inherits from the [CellposeModel](https://cellpose.readthedocs.io/en/latest/api.html#cellposemodel) class
    - **CellposePatchCNN**: includes a segmentor and a classifier. Currently the segmentor can only be ```CustomCellposeModel```, and the classifier is ```CellClassifierFCNN```. The model first runs the segmentor and then the classifier on patches of the detected objects to classify them.
  - ```save_model_path``` - name under which the trained model is saved after calling (re)train from the service; it is stored under ```bentoml/models```
  - ```runner_name``` - name of the runner for the bentoml service
  - ```service_name``` - name for the bentoml service
  - ```port``` - the port on which to start the service
- ```model``` - configuration for the model instantiation. Here, pass any arguments you need or want to change. Take care that the argument names match those of the original model class's _init()_ function!
  - ```segmentor```: model configuration for the segmentor. Currently takes the arguments used in the init of CellposeModel, see [here](https://cellpose.readthedocs.io/en/latest/api.html#cellposemodel).
  - ```classifier```: model configuration for the classifier, see _init()_ of ```CellClassifierFCNN```
- ```train``` - configuration for the model training. Take care that the argument names match those of the original model's _train()_ function!
  - ```segmentor```: if using cellpose, the _train()_ function arguments can be found [here](https://cellpose.readthedocs.io/en/latest/api.html#id7). Pass any arguments you need or want to change, or leave empty {} to use the default arguments.
  - ```classifier```: train configuration for the classifier, see _train()_ of ```CellClassifierFCNN```
- ```eval``` - configuration for the model evaluation. Take care that the argument names match those of the original model's _eval()_ function!
  - ```segmentor```: if using cellpose, the _eval()_ function arguments can be found [here](https://cellpose.readthedocs.io/en/latest/api.html#id3). Pass any arguments you need or want to change, or leave empty {} to use the default arguments.
  - ```classifier```: eval configuration for the classifier, see _eval()_ of ```CellClassifierFCNN```.
  - ```mask_channel_axis```: if a multi-class instance segmentation model has been used, the masks returned by the model will have two channels, one holding the instance segmentation result and one indicating each object's class. This variable indicates the dimension in which the channel axis is stored. It should currently be kept at 0, as this is the only way the masks can be visualised correctly by napari in the client.
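Putting the five parts together, a config.cfg might look like the sketch below; the keys mirror the list above, but the values (model type, epochs, paths) are illustrative assumptions rather than shipped defaults:

```
{
    "setup": {
        "segmentation": "GeneralSegmentation",
        "accepted_types": [".jpg", ".png", ".tif", ".tiff"],
        "seg_name_string": "_seg"
    },
    "service": {
        "model_to_use": "CustomCellposeModel",
        "save_model_path": "mytrainedmodel",
        "runner_name": "cellpose_runner",
        "service_name": "data-centric-platform",
        "port": 7010
    },
    "model": {
        "segmentor": {"model_type": "cyto"},
        "classifier": {}
    },
    "train": {
        "segmentor": {"n_epochs": 10},
        "classifier": {}
    },
    "eval": {
        "segmentor": {},
        "classifier": {},
        "mask_channel_axis": 0
    }
}
```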


## Running with Docker [DO NOT USE UNTIL ISSUE IS SOLVED]

### Docker (currently does not work for generate labels)

#### Docker-Compose
```
docker compose up
```

#### Docker Non-Interactively
```
docker build -t dcp-server .
```

@@ -39,3 +71,5 @@

```
docker build -t dcp-server .
docker run -it dcp-server bash
bentoml serve service:svc --reload --port=7010
```

