Commit 941874c

fixes ganon v1.6.0 (#251)
* docs, fix test sets
* genome_updater v0.6.2, small fixes
1 parent 4472217 commit 941874c

10 files changed, +12 -12 lines changed

.travis.yml

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ before_install:
   - eval "${MATRIX_EVAL}"
   - python3 -m pip install --upgrade pip
   - python3 -m pip install "pandas>=1.1.0"
-  - python3 -m pip install "multitax>=1.2.1"
+  - python3 -m pip install "multitax>=1.3.1"
   - if [ "$BUILD_TYPE" == "Coverage" ]; then
       python3 -m pip install coverage;
     fi

docs/default_databases.md

Lines changed: 2 additions & 2 deletions
@@ -41,8 +41,8 @@ NCBI RefSeq and GenBank repositories are common resources to obtain reference se
 |---|---|---|---|
 | Complete | 1595845 | | <details><summary>cmd</summary>`ganon build --source genbank --organism-group archaea bacteria fungi viral --threads 48 --db-prefix abfv_gb`</details> |
 | One assembly per species | 99505 | 91 - 420 | <details><summary>cmd</summary>`ganon build --source genbank --organism-group archaea bacteria fungi viral --threads 48 --genome-updater "-A 'species:1'" --db-prefix abfv_gb_t1s`</details> |
-| Complete genomes (higher quality) | 92917 | 24 - | <details><summary>cmd</summary>`ganon build --source genbank --organism-group archaea bacteria fungi viral --threads 48 --complete-genomes --db-prefix abfv_gb_cg`</details> |
-| One assembly per species of complete genomes | 34497 | 10 - | <details><summary>cmd</summary>`ganon build --source genbank --organism-group archaea bacteria fungi viral --threads 48 --complete-genomes "-A 'species:1'" --db-prefix abfv_gb_cg_t1s`</details> |
+| Complete genomes (higher quality) | 92917 | 24 - 132 | <details><summary>cmd</summary>`ganon build --source genbank --organism-group archaea bacteria fungi viral --threads 48 --complete-genomes --db-prefix abfv_gb_cg`</details> |
+| One assembly per species of complete genomes | 34497 | 10 - 34 | <details><summary>cmd</summary>`ganon build --source genbank --organism-group archaea bacteria fungi viral --threads 48 --complete-genomes "-A 'species:1'" --db-prefix abfv_gb_cg_t1s`</details> |

 \* Size (GB) is the final size of the database and the approximate amount of RAM necessary to build it (calculated with default parameters). The two values represent databases built with and without the `--hibf` parameter, respectively. The trade-offs between those two modes are explained [here](#hibf).


setup.py

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ def read(filename):
     url="https://www.github.com/pirovc/ganon",
     license='MIT',
     author="Vitor C. Piro",
-    description="ganon classifies short DNA sequences against large sets of genomic reference sequences efficiently",
+    description="ganon classifies DNA sequences against large sets of genomic reference sequences efficiently",
     long_description=read("README.md"),
     package_dir={'': 'src'},
     packages=["ganon"],

src/ganon/config.py

Lines changed: 1 addition & 1 deletion
@@ -397,7 +397,7 @@ def validate(self):
 elif check_file(db_prefix + ".ibf"):
     ibf = True
 else:
-    print_log("File not found: " + prefix + ".ibf/.hibf" )
+    print_log("File not found: " + db_prefix + ".ibf/.hibf" )
     return False

 if check_file(db_prefix + ".tax"):
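
The change above swaps `prefix` for `db_prefix`, the name used in every other line of the hunk. Assuming `prefix` was not defined in that scope, reaching the error branch would have raised a `NameError` instead of logging the message. The following is a minimal sketch of the corrected branch, not ganon's actual code: `find_filter`, and the bodies of `check_file` and `print_log`, are hypothetical stand-ins, and the `.hibf` check preceding the `elif` is an assumption since it lies outside the hunk.

```python
import os


def check_file(path):
    # Hypothetical stand-in: treat a file as valid if it exists and is non-empty.
    return os.path.isfile(path) and os.path.getsize(path) > 0


def print_log(message):
    # Hypothetical stand-in for the project's logging helper.
    print(message)


def find_filter(db_prefix):
    """Return 'hibf' or 'ibf' depending on which filter file exists, else None."""
    if check_file(db_prefix + ".hibf"):  # assumed: this check is outside the hunk
        return "hibf"
    elif check_file(db_prefix + ".ibf"):
        return "ibf"
    else:
        # Uses db_prefix, the only name defined in this scope.
        print_log("File not found: " + db_prefix + ".ibf/.hibf")
        return None
```

Calling `find_filter("abfv_gb")` in a directory without `abfv_gb.ibf` or `abfv_gb.hibf` logs the message and returns `None` rather than failing on an undefined name.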
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+07b534765d6b7d3e4d8bf67f549a5d66 build/releases/latest/ar53_taxonomy.tsv.gz
+70a673d332f60af1cf68e34d09a56816 build/releases/latest/bac120_taxonomy.tsv.gz

tests/ganon/data/build/releases/release207/207.0/MD5SUM

Lines changed: 0 additions & 2 deletions
This file was deleted.

tests/ganon/data/download_test_set_build.sh

Lines changed: 4 additions & 4 deletions
@@ -61,14 +61,14 @@ md5sum "${outfld}pub/taxonomy/new_taxdump/new_taxdump.tar.gz" > "${outfld}pub/ta
 rm "${outfld}new_taxdump.tar.gz" "${outfld}taxidlineage.dmp" "${outfld}rankedlineage.dmp" "${outfld}pub/taxonomy/new_taxdump/taxidlineage.dmp" "${outfld}pub/taxonomy/new_taxdump/rankedlineage.dmp"

 #gtdb
-gtdb_out="${outfld}releases/release207/207.0/"
+gtdb_out="${outfld}releases/latest/"
 mkdir -p "${gtdb_out}"
-gtdb_tax=( "ar53_taxonomy_r207.tsv.gz" "bac120_taxonomy_r207.tsv.gz" )
+gtdb_tax=( "ar53_taxonomy.tsv.gz" "bac120_taxonomy.tsv.gz" )
 for tax in "${gtdb_tax[@]}"; do
-    wget --quiet --show-progress --output-document "${outfld}${tax}" "https://data.gtdb.ecogenomic.org/releases/release207/207.0/${tax}"
+    wget --quiet --show-progress --output-document "${outfld}${tax}" "https://data.gtdb.ecogenomic.org/releases/latest/${tax}"
     join -1 1 -2 1 <(cut -f 1 "${outfld}accessions_taxids.txt" | sort) <(zcat "${outfld}${tax}" | awk 'BEGIN{FS=OFS="\t"}{print $1,$1,$2}' | sed -r 's/^.{3}//' | sort) -t$'\t' -o "2.2,2.3" | gzip > "${gtdb_out}${tax}"
     rm "${outfld}${tax}"
 done

-md5sum ${gtdb_out}*.tsv.gz > "${gtdb_out}MD5SUM"
+md5sum ${gtdb_out}*.tsv.gz > "${gtdb_out}MD5SUM.txt"
 rm ${outfld}accessions_taxids.txt
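
The script writes `md5sum` output to `MD5SUM.txt` next to the filtered taxonomy files. As an illustration of how such a file could be checked, here is a rough sketch in Python; it is not part of the commit or of ganon, and the helper names `md5_of`, `parse_md5sum`, and `verify` are hypothetical. It assumes each line holds a checksum followed by a path, as in the `MD5SUM.txt` entries shown earlier, and that the paths are relative to a base directory you supply.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sum(text):
    """Parse md5sum-style lines ('<checksum> <path>') into {path: checksum}."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        checksum, name = line.split(maxsplit=1)
        # md5sum marks binary mode with a leading '*' on the file name.
        entries[name.strip().lstrip("*")] = checksum.lower()
    return entries


def verify(md5sum_file, base_dir):
    """Return {path: True/False} for each entry listed in the checksum file."""
    expected = parse_md5sum(Path(md5sum_file).read_text())
    return {
        name: md5_of(Path(base_dir) / name) == checksum
        for name, checksum in expected.items()
    }
```

A call such as `verify("MD5SUM.txt", ".")` returns a dictionary mapping each listed path to whether its current checksum matches; which base directory the listed paths are relative to depends on where `md5sum` was run.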
