[Slack](https://nfcore.slack.com/channels/deepmodeloptim) [Twitter](https://twitter.com/nf_core) [Mastodon](https://mstdn.science/@nf_core) [YouTube](https://www.youtube.com/c/nf-core)

## 📌 **Quick intro**: check out this 👉🏻 [video](https://www.youtube.com/watch?v=dC5p_tXQpEs&list=PLPZ8WHdZGxmVKQga4KE15YVt95i-QXVvE&index=25)!

## Introduction

**nf-core/deepmodeloptim** is a bioinformatics end-to-end pipeline designed to facilitate the testing and development of deep learning models for genomics.

Deep learning model development in natural science is an empirical and costly process. Despite the existence of generic tools for tuning hyperparameters and training models, the connection between these procedures and the impact of the data is often overlooked, or at least not easily automated. Indeed, researchers must define a pre-processing pipeline and an architecture, find the best parameters for said architecture, and iterate over this process, often manually.

Leveraging the power of Nextflow (polyglotism, container integration, cloud scalability), this pipeline helps users to 1) automate model testing, 2) gain useful insights into the learning behaviour of the model, and hence 3) accelerate its development.

## Pipeline summary

The pipeline takes as input:

- A dataset
- A configuration file describing the data pre-processing steps to be performed
- A user-defined PyTorch model
- A configuration file describing the range of parameters for the PyTorch model

It then transforms the data according to all possible pre-processing steps, finds the best architecture parameters for each transformed dataset, performs sanity checks on the models, and trains a minimal deep learning version for each dataset/architecture combination.

These experiments are then compiled into an intuitive report, making it easier for scientists to pick the best design choices to be sent to large-scale training.
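As a rough illustration of the experiment grid described above: every data pre-processing variant is crossed with every architecture parameter set, and each combination becomes one training experiment. The variant and parameter names below are purely hypothetical examples, not the pipeline's actual configuration schema:

```python
from itertools import product

# Hypothetical pre-processing variants and architecture parameter sets;
# these names are illustrative only, not the pipeline's real options.
preprocessing_variants = ["reverse_complement", "noise_injection", "none"]
architecture_params = [
    {"hidden_size": 64, "dropout": 0.1},
    {"hidden_size": 128, "dropout": 0.3},
]

# Every (variant, parameter set) pair becomes one training experiment.
experiments = [
    {"preprocessing": prep, "params": params}
    for prep, params in product(preprocessing_variants, architecture_params)
]

print(len(experiments))  # 3 variants x 2 parameter sets = 6 experiments
```

In the actual pipeline this expansion, along with the training runs and sanity checks, is orchestrated by Nextflow across containers rather than in a single Python script.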
Update the author list
-->

nf-core/deepmodeloptim was originally written by Mathys Grapotte ([@mathysgrapotte](https://github.com/mathysgrapotte)).

We would like to thank all the contributors for their extensive assistance in the development of this pipeline, including (but not limited to):

- Alessio Vignoli ([@alessiovignoli](https://github.com/alessiovignoli))
- Suzanne Jin ([@suzannejin](https://github.com/suzannejin))
- Luisa Santus ([@luisas](https://github.com/luisas))
- Jose Espinosa ([@JoseEspinosa](https://github.com/JoseEspinosa))
- Evan Floden ([@evanfloden](https://github.com/evanfloden))
- Igor Trujnara ([@itrujnara](https://github.com/itrujnara))

Special thanks for the artistic work on the logo to Maxime ([@maxulysse](https://github.com/maxulysse)), Suzanne ([@suzannejin](https://github.com/suzannejin)), Mathys ([@mathysgrapotte](https://github.com/mathysgrapotte)) and, not surprisingly, ChatGPT.

<!-- TODO nf-core: If applicable, make list of people who have also contributed -->