add hardware requirements and performance
KasperSkytte committed Feb 26, 2024
1 parent f5aaa6d commit fe456d8
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions README.md
@@ -9,6 +9,14 @@ The required data must be in the typical amplicon data format with an abundance
Use the conda `environment.yml` file to create an environment with the required software. To install the required R packages, use the `renv.lock` file to restore the R library using the [`renv`](https://rstudio.github.io/renv/articles/renv.html) package.
For GPU support, ensure you have a version of TensorFlow that matches your NVIDIA drivers and CUDA version.
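
One way to confirm that TensorFlow can actually see the GPU is sketched below. This is a minimal check and not part of the repository; the helper name `visible_gpus` is hypothetical, and the only assumption is that TensorFlow 2.x is installed in the active environment. An empty list usually means the driver, CUDA, and TensorFlow versions do not match, or that a CPU-only build is installed.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Returns None if TensorFlow cannot be imported in this environment.
    (Hypothetical helper for illustration; not part of this repository.)
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # list_physical_devices("GPU") returns one entry per usable GPU.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment.")
    elif not gpus:
        print("TensorFlow is installed but sees no GPU; training will use the CPU.")
    else:
        print("GPUs visible to TensorFlow:", ", ".join(gpus))
```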

### Hardware requirements and performance
The workflow runs fine on a standard laptop (as of 2023). Extra RAM and an NVIDIA GPU can help if you need more speed, but the main bottlenecks are other steps in the implementation, not the model training time itself. Typical processing time is 4-8 hours per dataset under `data/datasets`. Here are some hardware guidelines:

- 4 cores/8 threads
- 16GB RAM, preferably 32GB depending on input data
- 100GB storage space
- (not required) NVIDIA GPU with CUDA support
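
The guidelines above can be compared against the current machine with a short script. This is a rough sketch using only the Python standard library, not something shipped with the repository: the function names are hypothetical, the thresholds are copied from the list, and the RAM lookup relies on `os.sysconf`, which is only available on Unix-like systems (it returns `None` elsewhere).

```python
import os
import shutil

# Guideline values copied from the list above.
MIN_THREADS = 8
MIN_RAM_GB = 16
MIN_DISK_GB = 100


def total_ram_gb():
    """Return total physical RAM in GB (10^9 bytes), or None if unknown."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are missing on some platforms (e.g. Windows).
        return None


def check_hardware(path="."):
    """Compare this machine against the guideline values.

    Returns a dict mapping each resource to (measured, required, ok),
    where ok is None if the value could not be measured.
    """
    def entry(measured, required):
        ok = None if measured is None else measured >= required
        return (measured, required, ok)

    return {
        "threads": entry(os.cpu_count(), MIN_THREADS),
        "ram_gb": entry(total_ram_gb(), MIN_RAM_GB),
        # Free space on the filesystem that holds `path`.
        "free_disk_gb": entry(shutil.disk_usage(path).free / 1e9, MIN_DISK_GB),
    }


if __name__ == "__main__":
    for name, (measured, required, ok) in check_hardware().items():
        status = "unknown" if ok is None else ("ok" if ok else "below guideline")
        print(f"{name}: measured={measured}, guideline={required} ({status})")
```

Falling below a guideline does not necessarily mean the workflow will fail; the values are guidelines, and the RAM actually needed depends on the input data.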

## Usage
Adjust the settings in `config.json` and then run the wrapper script `run.bash`. This first runs `reformat.R` to sort, filter, and format the data and to look up known genus-level functions on [midasfieldguide.org](https://midasfieldguide.org), among other steps, and then runs `main.py`, which starts model training and evaluation.
