Preprint: https://arxiv.org/abs/2406.01455
See also our sample iOS app, which uses an older version of the proposed model: https://github.com/AlfredsLapkovskis/MultimodalPlantClassifier-iOS
We used Python 3.11.5.
Execute these commands from the project root directory:
Windows:
python -m venv env
env\Scripts\activate
pip install -r requirements.txt
Linux:
python -m venv env
source env/bin/activate
pip install -r requirements.txt
macOS:
python -m venv env
source env/bin/activate
pip install -r requirements.txt
# Optional step to accelerate training on macOS (https://developer.apple.com/metal/tensorflow-plugin/)
pip install -r requirements_macos.txt
We used PlantCLEF2016 (https://www.imageclef.org/lifeclef/2016/plant) as it contains all the data from PlantCLEF2015 in separate folders.
If you want to run dataset/stats.ipynb, you may want to download Pl@ntNet too (https://zenodo.org/records/4726653#.YhNbAOjMJPY).
Make a copy of example_config.json in the project root directory and name it config.json.
Set the following fields in it:
- cache_dir: path to a directory where our code may store files to speed up some operations.
- plant_net_root: path to the root directory of the Pl@ntNet dataset on your computer.
- plant_clef_root: path to the root directory of the PlantCLEF2015 dataset on your computer.
- plant_clef_train: path to the directory with the train split of the PlantCLEF2015 dataset (relative to plant_clef_root).
- plant_clef_test: path to the directory with the test split of the PlantCLEF2015 dataset (relative to plant_clef_root).
- working_dir: path to the directory where our code will store various artefacts, e.g., models and logs.
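As a rough illustration of how the PlantCLEF fields relate to each other, the sketch below loads config.json and resolves the train and test split directories against plant_clef_root. It is not the project's own loader: the function name load_config, the PLANT_CLEF_KEYS tuple, and the *_abs keys are hypothetical, and only the field names come from the list above.

```python
import json
from pathlib import Path

# Field names come from the list above; treating these three as mandatory
# is an assumption made for this sketch.
PLANT_CLEF_KEYS = ("plant_clef_root", "plant_clef_train", "plant_clef_test")


def load_config(path="config.json"):
    """Load the config and resolve the split directories (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)

    missing = [key for key in PLANT_CLEF_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing: {', '.join(missing)}")

    # plant_clef_train and plant_clef_test are relative to plant_clef_root.
    root = Path(config["plant_clef_root"])
    config["plant_clef_train_abs"] = str(root / config["plant_clef_train"])
    config["plant_clef_test_abs"] = str(root / config["plant_clef_test"])
    return config
```

Resolving the relative paths once, right after loading, keeps the rest of the code free of path-joining logic.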
- dataset directory contains everything related to data:
  - stats.ipynb: presents some dataset statistics.
  - preprocessing.ipynb: contains our PlantCLEF2015 dataset preprocessing pipeline.
  - loading.py: contains methods for loading datasets for our ML models.
  - plant_net_meta.py: a model of Pl@ntNet metadata.
  - plant_clef_meta.py: a model of PlantCLEF2015 metadata.
  - data_loading_demo.ipynb: a demonstration of using loading.py.
- unimodal directory contains everything related to our unimodal models:
  - experiment.py: a class containing hyperparameters and settings for unimodal model training.
  - experiments: a directory containing JSON files with experiment hyperparameters and settings. These files are parsed by experiment.py and must be named exp<index>.json.
  - train.py: a script to train our unimodal models. Usage:
    python -m unimodal.train -e <index of experiment>
    Optionally, add -s <save mode> to save the trained models. For the available save modes, see common/save_mode.py.
- multimodal directory contains everything related to our multimodal model:
  - run_mfas.py: a script to run our implementation of the MFAS algorithm, based on the original paper (Perez-Rua et al., 2019) and the authors' source code.
  - experiment.py: a class containing hyperparameters and settings for multimodal model training. We use it once run_mfas.py has found optimal configurations.
  - experiments: a directory containing JSON files with experiment hyperparameters and settings. These files are parsed by experiment.py and must be named exp<index>.json. We use them once run_mfas.py has found optimal configurations.
  - train.py: a script to train our multimodal model. We use it once run_mfas.py has found an optimal configuration. Usage:
    python -m multimodal.train -e <index of experiment>
    Optionally, add -s <save mode> to save the trained models. For the available save modes, see common/save_mode.py.
  - classes: contains all the classes used in our MFAS implementation (including mfas.py, the algorithm itself) and in the multimodal experiments.
- evaluation contains the code used for model evaluation:
  - evaluate_model.py: basic evaluation of the unimodal models, our multimodal model, or our baseline.
  - mcnemar_test.py: McNemar's test for the statistical significance of the difference between the proposed model and the baseline.
  - subsets_of_modalities.py: a script to compare the final model with the unimodal models and the baseline on subsets of modalities.
  - utils: various utilities for the evaluation.
- common contains various helpers and constants used in the project.
- convert contains utilities to convert models to other formats:
  - convert_to_coreml.py: a script to convert models to the Apple CoreML format.
- resources contains various resources produced by our code:
  - models: our trained models. The train directory contains models trained on the training set only, whereas train+validation contains models trained on the merged training and validation sets.
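For readers unfamiliar with McNemar's test, which the evaluation directory uses to compare the proposed model with the baseline, here is a minimal sketch of the exact (binomial) two-sided variant on paired predictions. It is only an illustration: the function name mcnemar_exact is hypothetical, and mcnemar_test.py in this repository may use a different variant (for example, the chi-squared approximation).

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar's test on paired predictions (illustrative).

    Returns (b, c, p_value), where b counts samples only model A classified
    correctly and c counts samples only model B classified correctly.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree on correctness
    # Under the null hypothesis each discordant pair is equally likely to
    # favour either model, so min(b, c) follows Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


y_true = [0, 1, 1, 0, 1, 0]
pred_a = [0, 1, 1, 0, 1, 1]
pred_b = [1, 1, 0, 0, 0, 1]
print(mcnemar_exact(y_true, pred_a, pred_b))  # (3, 0, 0.25)
```

Only the samples on which the two models disagree affect the result, which is why the test suits comparing two classifiers evaluated on the same test set.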
Start with dataset/preprocessing.ipynb to preprocess the data and generate all the necessary files. You can then experiment with unimodal or multimodal architectures, including MFAS.
Execute all the code from the project root directory. Check the source files for the parameters each script expects. For scripts that require modalities, pass Flower Leaf Fruit Stem. If a script also requires paths to the corresponding models, pass them in the same order, for example:
python -m evaluation.evaluate_model \
--mode late_fusion \
--modalities Flower Leaf Fruit Stem \
--paths path/to/Flower/model.keras \
path/to/Leaf/model.keras \
path/to/Fruit/model.keras \
path/to/Stem/model.keras
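The order matters because a script receiving two parallel lists can only match them by position. The sketch below shows one way such pairing could work. It is a hypothetical parser written for illustration, not the actual argument handling in evaluation/evaluate_model.py, and it omits --mode and any other options the real script accepts.

```python
import argparse

# Modality names as used in the command above.
MODALITIES = ("Flower", "Leaf", "Fruit", "Stem")


def parse_model_paths(argv=None):
    """Map each modality to its model path by position (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--modalities", nargs="+", choices=MODALITIES, required=True)
    parser.add_argument("--paths", nargs="+", required=True)
    args = parser.parse_args(argv)

    if len(args.modalities) != len(args.paths):
        parser.error("--modalities and --paths must have the same length")

    # The i-th path is assumed to belong to the i-th modality.
    return dict(zip(args.modalities, args.paths))
```

With this kind of positional pairing, swapping two paths silently assigns a model to the wrong modality, so it is worth double-checking the order before a long evaluation run.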