Unable to build db. #852

Open
abskumar opened this issue Jul 8, 2024 · 9 comments

Comments

@abskumar

abskumar commented Jul 8, 2024

I am getting an error while running:
kraken2-build --standard --db kraken2_db

Error:
rsync: link_stat "/refseq/archaea/assembly_summary.txt" (in genomes) failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1861) [Receiver=3.2.7]
Error downloading assembly summary file for archaea, exiting.

Please help.

@ideshigenomics

code: kraken2-build --standard --threads 12 --db ~/kraken2-master/Kraken2_DB

Error:
rsync: link_stat "/all/GCF/030/643/725/GCF_030643725.1_ASM3064372v1/GCF_030643725.1_ASM3064372v1_genomic.fna.gz" (in genomes) failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1865) [generator=3.2.7]
rsync_from_ncbi.pl: rsync error, exiting: 5888

@jenniferlu717
Collaborator

This happens when NCBI hasn't yet updated their assembly_summary.txt file to remove or change links that no longer work. Unfortunately, you may have to wait a couple of days before the build works again.

You can remove the line containing that link from the assembly_summary.txt file and try the download again, or you can download the libraries manually (or use krakenuniq-download).
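
For anyone trying the first workaround, here is a minimal Python sketch of removing the offending row from a local assembly_summary.txt. The function name `drop_assembly` is made up for illustration, and it matches the accession as a plain substring, which is an assumption about what "the line containing that link" means in practice.

```python
from pathlib import Path


def drop_assembly(summary_path, accession):
    """Remove every data row of an assembly summary file that mentions `accession`.

    Comment lines (starting with '#') are always kept.
    Returns the number of rows removed.
    """
    path = Path(summary_path)
    lines = path.read_text().splitlines(keepends=True)
    # Keep header/comment lines and any row that does not contain the accession.
    kept = [ln for ln in lines if ln.startswith("#") or accession not in ln]
    path.write_text("".join(kept))
    return len(lines) - len(kept)
```

A call might look like `drop_assembly("assembly_summary.txt", "GCF_030643725.1")`, pointed at whichever copy of the file your build actually reads. Keep a backup of the file first, since the edit is done in place.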

@abskumar
Author

abskumar commented Jul 9, 2024

It seems that the code is trying to download "assembly_summary.txt" files from various folders: archaea, viral, bacteria, etc.
However, these files no longer exist on the NCBI site.
I believe NCBI now has a single composite file: https://ftp.ncbi.nlm.nih.gov/genomes/refseq/assembly_summary_refseq.txt
I am not sure how to process this.
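
One possible way to process a composite file is to split it back into per-group rows. The Python sketch below is only an illustration: it assumes the file is tab-separated, that the last `#` line before the data holds the column names, and that there are columns named `assembly_accession`, `ftp_path`, and `group`. Those column names are assumptions to check against the real file, and the sample rows are made up, not real NCBI data.

```python
def rows_for_group(summary_text, group):
    """Yield (assembly_accession, ftp_path) for rows whose `group` column matches."""
    header = None
    for line in summary_text.splitlines():
        if line.startswith("#"):
            # Assumption: the last comment line before the data is the column header.
            header = line.lstrip("# ").split("\t")
            continue
        if not line.strip() or header is None:
            continue
        row = dict(zip(header, line.split("\t")))
        if row.get("group") == group:
            yield row["assembly_accession"], row["ftp_path"]


# Made-up sample in the assumed layout (not real NCBI data).
sample = (
    "##  made-up sample, not real NCBI data\n"
    "#assembly_accession\ttaxid\tftp_path\tgroup\n"
    "GCF_000000001.1\t1\thttps://example.org/a\tarchaea\n"
    "GCF_000000002.1\t2\thttps://example.org/b\tbacteria\n"
)
print(list(rows_for_group(sample, "archaea")))
# prints [('GCF_000000001.1', 'https://example.org/a')]
```

Looking columns up by name rather than by position keeps the sketch from depending on a fixed column order.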

@jenniferlu717
Collaborator

Ah... thanks for letting me know. We will have to update the Kraken download scripts to accommodate this, but it may take some time.

@abskumar
Author

abskumar commented Jul 9, 2024

Thanks.
I would appreciate an update when this is done.

@cement-head

KRAKEN2 is broken:

 ./kraken2-build --standard --threads 24 --db kraken2-std-db
Downloading taxonomy tree data... done.
Untarring taxonomy tree data... done.
rsync_from_ncbi.pl: unexpected FTP path (new server?) for https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/762/265/GCF_000762265.1_ASM76226v1

@josruirod

Also struggling with this error, as in issue https://github.com/DerrickWood/kraken2/issues/847

jenniferlu717 commented:

> This happens when NCBI hasn't yet updated their assembly_summary.txt file to remove or change links that no longer work. Unfortunately, you may have to wait a couple of days before the build works again.
>
> You can remove the line containing that link from the assembly_summary.txt file and try the download again, or you can download the libraries manually (or use krakenuniq-download).

Removing the line containing that genome from assembly_summary.txt worked. I had to modify download_genomic_library.sh, though, so that assembly_summary.txt is not recreated on every run of kraken2-build.
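
The idea behind that change is a "download only if missing" guard, so that a hand-edited summary file survives the next run. The Python sketch below only illustrates that guard: it is not the actual code of download_genomic_library.sh (a shell script), and both `ensure_summary` and the `fetch` callable are hypothetical names.

```python
from pathlib import Path


def ensure_summary(path, fetch):
    """Call `fetch(path)` only when `path` is missing or empty.

    Returns True if a download was triggered, False if the existing
    (possibly hand-edited) file was left alone.
    """
    p = Path(path)
    if p.is_file() and p.stat().st_size > 0:
        return False  # keep the existing file, including any manual edits
    fetch(p)  # hypothetical downloader supplied by the caller
    return True
```

Delete the file by hand when you do want a fresh copy from the server.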

@IrinaVKuznetsova

IrinaVKuznetsova commented Jul 30, 2024

Has the issue been fixed?
Thank you.

@Ivalize

Ivalize commented Aug 16, 2024

Hi,
@IrinaVKuznetsova you could try these two options:
#518

or (downloading and creating your own database)
https://benlangmead.github.io/aws-indexes/k2

I hope it helps you
Regards,

7 participants