Merge branch 'master' into plot_black
smilesun authored Oct 11, 2024
2 parents bb91107 + 8ea9aa2 commit 382296b
Showing 396 changed files with 2,202 additions and 22,957 deletions.
14 changes: 10 additions & 4 deletions .github/workflows/ci.yml
@@ -32,7 +32,7 @@ jobs:
- name: test if api works
run: poetry run python examples/api/jigen_dann_transformer.py
- name: Generate coverage report
run: rm -r zoutput && poetry run pytest --cov=domainlab tests/ --cov-report=xml
run: rm -rf zoutput && poetry run pytest --maxfail=1 -vvv --tb=short --cov=domainlab tests/ --cov-report=xml
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v1
with:
@@ -42,8 +42,14 @@ jobs:
env:
CODECOV_TOKEN: 02ecb3ac-b7ce-4ea4-90a2-961c3d1a1030
- name: check if readme yaml works
run: rm -r zoutput && python main_out.py -c ./examples/conf/vlcs_diva_mldg_dial.yaml
run: rm -rf zoutput && python main_out.py -c ./examples/conf/vlcs_diva_mldg_dial.yaml
- name: test if examples in markdown works
run: bash -x -v ci_run_examples.sh
run: bash -x -v scripts/ci_run_examples.sh
- name: test if benchmark works
run: pip install snakemake==7.32.0 && pip install pulp==2.7.0 && sed -i '1s/^/#!\/bin\/bash -x -v\n/' run_benchmark_standalone.sh && bash -x -v run_benchmark_standalone.sh examples/benchmark/demo_shared_hyper_grid.yaml && cat zoutput/benchmarks/mnist_benchmark_grid/hyperparameters.csv && cat zoutput/benchmarks/mnist_benchmark_grid/results.csv
run: |
pip install snakemake==7.32.0 && pip install pulp==2.7.0
echo "insert a shebang line (#!/bin/bash -x -v) at the beginning of the bash script"
sed -i '1s/^/#!\/bin\/bash -x -v\n/' run_benchmark_standalone.sh
bash -x -v run_benchmark_standalone.sh examples/benchmark/demo_shared_hyper_grid.yaml
cat zoutput/benchmarks/mnist_benchmark_grid*/hyperparameters.csv
cat zoutput/benchmarks/mnist_benchmark_grid*/results.csv
12 changes: 9 additions & 3 deletions .gitignore
@@ -1,7 +1,13 @@
.ropeproject
./zdpath
./zoutput
/zdpath
/zoutput
tests/__pycache__/
*.pyc
.vscode/
data/pacs
domainlab/zdata/pacs
/data/
/.snakemake/
/dist
/domainlab.egg-info
/runs
/slurm_errors.txt
67 changes: 49 additions & 18 deletions README.md
@@ -8,24 +8,37 @@

## Distribution shifts, domain generalization and DomainLab

Neural networks trained using data from a specific distribution (domain) usually fails to generalize to novel distributions (domains). Domain generalization aims at learning domain invariant features by utilizing data from multiple domains (data sites, corhorts, batches, vendors) so the learned feature can generalize to new unseen domains (distributions).
Neural networks trained using data from a specific distribution (domain) usually fail to generalize to novel distributions (domains). Domain generalization aims at learning domain invariant features by utilizing data from multiple domains (data sites, cohorts, batches, vendors) so the learned feature can be generalized to new unseen domains (distributions).

DomainLab is a software platform with state-of-the-art domain generalization algorithms implemented, designed by maximal decoupling of different software components thus enhances maximal code reuse.
DomainLab is a software platform with state-of-the-art domain generalization algorithms implemented and designed by maximal decoupling of different software components thus enhancing maximal code reuse.

### DomainLab
DomainLab decouples the following concepts or objects:
- task $M$: In DomainLab, a task is a container for datasets from different domains. (e.g. from distribution $D_1$ and $D_2$). Task offer a static protocol to evaluate the generalization performance of a neural network: which dataset(s) is used for training, wich dataset(s) used for testing.
- neural network: a map $\phi$ from the input data to the feature space and a map $\varphi$ from feature space to output $\hat{y}$ (e.g. decision variable).
- task $M$: In DomainLab, a task is a container for datasets from different domains. (e.g. from distribution $D_1$ and $D_2$). The task offers a static protocol to evaluate the generalization performance of a neural network: which dataset(s) is used for training, and which dataset(s) is used for testing.
- neural network: a map $\phi$ from the input data to the feature space and a map $\varphi$ from the feature space to output $\hat{y}$ (e.g. decision variable).
- model: structural risk in the form of $\ell() + \mu R()$ where
- $\ell(Y, \hat{y}=\varphi(\phi(X)))$ is the task specific empirical loss (e.g. cross entropy for classification task).
- $\ell(Y, \hat{y}=\varphi(\phi(X)))$ is the task-specific empirical loss (e.g. cross entropy for classification task).
- $R(\phi(X))$ is the penalty loss to boost domain invariant feature extraction using $\phi$.
- $\mu$ is the corresponding multiplier to each penalty function factor.
- trainer: an object that guides the data flow to model and append further domain invariant losses
like inter-domain feature alignment.
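
As a rough illustration of the structural risk $\ell() + \mu R()$ described in the list above, here is a minimal sketch in plain Python. The function names (`cross_entropy`, `mean_feature_gap`, `structural_risk`) and the toy penalty are assumptions made for this example only; they are not part of the DomainLab API.

```python
import math
from typing import Sequence


def cross_entropy(probs: Sequence[float], label: int) -> float:
    """Task-specific empirical loss ell(Y, y_hat) for a single sample."""
    return -math.log(probs[label])


def mean_feature_gap(feat_d1: Sequence[float], feat_d2: Sequence[float]) -> float:
    """Toy penalty R(phi(X)): squared gap between the mean features of two domains."""
    mean_1 = sum(feat_d1) / len(feat_d1)
    mean_2 = sum(feat_d2) / len(feat_d2)
    return (mean_1 - mean_2) ** 2


def structural_risk(task_loss: float,
                    penalties: Sequence[float],
                    multipliers: Sequence[float]) -> float:
    """Combine the task loss with weighted penalties: ell + sum_k mu_k * R_k."""
    if len(penalties) != len(multipliers):
        raise ValueError("each penalty needs exactly one multiplier")
    return task_loss + sum(mu * r for mu, r in zip(multipliers, penalties))


# Hypothetical numbers: a two-class prediction and 1-d features from two domains.
ell = cross_entropy([0.5, 0.5], label=0)
penalty = mean_feature_gap([1.0, 3.0], [0.0, 2.0])
risk = structural_risk(ell, [penalty], [0.1])
print(risk)
```

Keeping the multiplier $\mu$ outside the penalty function means the same penalty can be reweighted without touching the task loss.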

We offer detailed documentation on how these models and trainers work in our documentation page: https://marrlab.github.io/DomainLab/
We offer detailed documentation on how these models and trainers work on our documentation page: https://marrlab.github.io/DomainLab/

DomainLab makes it possible to combine models with models, trainers with models, and trainers with trainers in a decorator pattern like line of code `Trainer A(Trainer B(Model C(Model D(network E), network E, network F)))` which correspond to $\ell() + \mu_a R_a() + \mu_b R_b + \mu_c R_c() + \mu_d R_d()$, where Model C and Model D share neural network E, but Model C has an extra neural network F. All models share the same neural network for feature extraction, but can have different auxilliary networks for $R()$.
DomainLab makes it possible to combine models with models, trainers with models, and trainers with trainers in a decorator pattern like the line of code below

```
Trainer A(
Trainer B(Model C(
Model D(network E),
network E,
network F
)
)
)
```

which corresponds to $\ell() + \mu_a R_a() + \mu_b R_b() + \mu_c R_c() + \mu_d R_d()$, where Model C and Model D share neural network E, but Model C has an extra neural network F. All models share the same neural network for feature extraction, but can have different auxiliary networks for $R()$.
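
To make the composition above more concrete, the following sketch shows one way nested components could each contribute a weighted penalty to the total risk. The `Component` class, its fields, and the toy penalty functions are hypothetical and only illustrate the decorator idea; they do not reflect DomainLab's actual classes. The shared feature extractor is represented here by a single `feature` value, and auxiliary networks are omitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Component:
    """Hypothetical stand-in for a model or trainer contributing one penalty term."""
    name: str
    mu: float                            # multiplier for this component's penalty
    penalty: Callable[[float], float]    # R(.) evaluated on the shared feature
    inner: Optional["Component"] = None  # the component being decorated, if any

    def weighted_penalties(self, feature: float) -> List[float]:
        """Collect mu * R(feature) from the innermost component outwards."""
        inner_terms = self.inner.weighted_penalties(feature) if self.inner else []
        return inner_terms + [self.mu * self.penalty(feature)]


def total_risk(task_loss: float, outermost: Component, feature: float) -> float:
    """ell + mu_a R_a + mu_b R_b + mu_c R_c + mu_d R_d for the nested chain."""
    return task_loss + sum(outermost.weighted_penalties(feature))


# Trainer A(Trainer B(Model C(Model D))) with made-up penalties and multipliers.
model_d = Component("Model D", mu=1.0, penalty=lambda f: f)
model_c = Component("Model C", mu=2.0, penalty=lambda f: f * f, inner=model_d)
trainer_b = Component("Trainer B", mu=0.5, penalty=abs, inner=model_c)
trainer_a = Component("Trainer A", mu=0.1, penalty=lambda f: 1.0, inner=trainer_b)

terms = trainer_a.weighted_penalties(2.0)
risk = total_risk(task_loss=1.0, outermost=trainer_a, feature=2.0)
print(terms, risk)
```

Each wrapper only knows about the component it decorates, so adding another trainer or model adds one more $\mu R()$ term without changing the existing ones.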

<div style="align: center; text-align:center;">
<figure>
@@ -36,7 +49,7 @@ DomainLab makes it possible to combine models with models, trainers with models,
## Getting started

### Installation
For development version in Github, see [Installation and Dependencies handling](./docs/doc_install.md)
For the development version in Github, see [Installation and Dependencies handling](./docs/doc_install.md)

We also offer a PyPI version here https://pypi.org/project/domainlab/ which one could install via `pip install domainlab` and it is recommended to create a virtual environment for it.

@@ -45,10 +58,28 @@ We offer various ways for the user to specify a scenario to evaluate the general
[Task Specification](./docs/doc_tasks.md)

### Example and usage
#### Available arguments for commandline

#### Either clone this repo and use command line
The following command tells which arguments/hyperparameters/multipliers are available to be set by the user and which model they are associated with

```shell
python main_out.py --help
```

or

```
domainlab --help
```

#### Command line configuration file

`domainlab -c ./examples/conf/vlcs_diva_mldg_dial.yaml` (if you install via pip)

or if you clone the code repository for DomainLab

`python main_out.py -c ./examples/conf/vlcs_diva_mldg_dial.yaml`

where the configuration file below can be downloaded [here](https://raw.githubusercontent.com/marrlab/DomainLab/master/examples/conf/vlcs_diva_mldg_dial.yaml)
```
te_d: caltech # domain name of test domain
@@ -77,7 +108,7 @@ One could simply run
`bash run_benchmark_slurm.sh your_benchmark_configuration.yaml` to launch different experiments with the specified configuration.


For example, the following result (without any augmentation like flip) is for PACS dataset using ResNet.
For example, the following result (without any augmentation like flip) is for the PACS dataset using ResNet. The reader should note that the choice of neural network, whether it is pre-trained, and the kind of preprocessing and augmentation used can lead to very different result distributions. Decoupling these factors is one of the features DomainLab provides.

<div style="align: center; text-align:center;">
<figure>
@@ -89,15 +120,15 @@ For example, the following result (without any augmentation like flip) is for P
</div>


### Temporary citation
### Citation

Source: https://arxiv.org/pdf/2403.14356.pdf

```bibtex
@manual{domainlab,
title={{DomainLab: modular python package for training domain invariant neural networks}},
author={{Xudong Sun, et.al.}},
organization={{Institute of AI for Health}},
year={2023},
url={https://github.com/marrlab/DomainLab},
note={temporary citation for domainlab}
@misc{sun2024domainlab,
title={DomainLab: A modular Python package for domain generalization in deep learning},
author={Sun, Xudong and Feistner, Carla and Gossmann, Alexej and Schwarz, George and Umer, Rao Muhammad and Beer, Lisa and Rockenschaub, Patrick and Shrestha, Rahul Babu and Gruber, Armin and Chen, Nutan and others},
journal={https://arxiv.org/pdf/2403.14356.pdf},
year={2024}
}
```
1 change: 0 additions & 1 deletion data/mixed_codec/caltech/auto/text.txt

This file was deleted.

Binary file removed data/mixed_codec/caltech/auto/train_imgs_150.jpg
Binary file not shown.
Binary file removed data/mixed_codec/caltech/auto/train_imgs_151.jpg
Binary file not shown.
Binary file removed data/mixed_codec/caltech/auto/train_imgs_152.png
Binary file not shown.
Binary file removed data/mixed_codec/caltech/vogel/train_imgs_1.jpg
Binary file not shown.
Binary file removed data/mixed_codec/caltech/vogel/train_imgs_2.jpg
Binary file not shown.
Binary file removed data/mixed_codec/caltech/vogel/train_imgs_3.png
Binary file not shown.
Binary file removed data/mixed_codec/sun/sofa/train_imgs_609.jpg
Binary file not shown.
Binary file removed data/mixed_codec/sun/sofa/train_imgs_612.jpg
Binary file not shown.
Binary file removed data/mixed_codec/sun/vehicle/train_imgs_17.jpg
Binary file not shown.
Binary file removed data/mixed_codec/sun/vehicle/train_imgs_19.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/dog/pic_195.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/dog/pic_304.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/elephant/pic_026.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/giraffe/pic_243.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/guitar/pic_020.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/guitar/pic_182.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/horse/pic_142.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/person/pic_165.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/person/pic_199.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/art_painting/person/pic_497.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/cartoon/dog/pic_112.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/cartoon/dog/pic_137.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/cartoon/dog/pic_219.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/cartoon/elephant/pic_332.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/cartoon/giraffe/pic_377.jpg
Binary file not shown.
Binary file removed data/pacs_mini_10/cartoon/giraffe/pic_382.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/cartoon/horse/pic_064.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/cartoon/house/pic_040.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/cartoon/person/pic_111.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/cartoon/person/pic_180.jpg
Diff not rendered.
7 changes: 0 additions & 7 deletions data/pacs_mini_10/main.sh

This file was deleted.

Binary file removed data/pacs_mini_10/photo/dog/n02103406_1011.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/elephant/n02503517_6232.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/guitar/n02676566_7830.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/horse/105_0223.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/house/pic_046.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/house/pic_110.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/house/pic_146.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/house/pic_218.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/person/253_0221.jpg
Diff not rendered.
Binary file removed data/pacs_mini_10/photo/person/253_0297.jpg
Diff not rendered.
26 changes: 0 additions & 26 deletions data/pacs_mini_10/run_with_bash_copy_files_according2txt.sh

This file was deleted.

Binary file removed data/pacs_mini_10/sketch/dog/5302.png
Diff not rendered.
Binary file removed data/pacs_mini_10/sketch/dog/5317.png
Diff not rendered.
Binary file removed data/pacs_mini_10/sketch/dog/n02103406_3255-6.png
Diff not rendered.
Binary file removed data/pacs_mini_10/sketch/elephant/5981.png
Diff not rendered.
Diff not rendered.
Diff not rendered.
Binary file removed data/pacs_mini_10/sketch/guitar/n02676566_8618-2.png
Diff not rendered.
Binary file removed data/pacs_mini_10/sketch/guitar/n03467517_6423-3.png
Diff not rendered.
Diff not rendered.
Binary file removed data/pacs_mini_10/sketch/house/8873.png
Diff not rendered.
10 changes: 0 additions & 10 deletions data/pacs_split/art_painting_10.txt

This file was deleted.

100 changes: 0 additions & 100 deletions data/pacs_split/art_painting_100.txt

This file was deleted.
