ENA Sequence Database Downloader

ENASequence-Downloader is a Python script for downloading FASTQ sequences and their metadata from the European Nucleotide Archive (ENA). It provides a command-line interface and aims for faster downloads than conventional tools such as wget or curl.

License

This script is licensed under the Apache License, Version 2.0. You are free to use, modify, and distribute it, provided that you comply with the terms of the license, which include conditions for redistribution and require proper attribution. For more details, please refer to the LICENSE file or the Apache License 2.0 text at https://www.apache.org/licenses/LICENSE-2.0.

Features

Easy to Use: simplifies downloading FASTQ sequences and metadata from ENA.
Compatibility: optimized for Unix operating systems and remains compatible with Windows.
Fast Downloads: increases download speed compared to traditional tools like wget on Unix systems.
Customizable: can be adjusted to fit specific download requirements.
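As an illustration of where a speed-up over wget could come from, here is a minimal sketch that assumes the transfer is handed to aria2 (listed under Requirements) with several connections per file. It is not taken from the script itself. The helper names `filereport_url`, `fastq_urls`, and `aria2c_command` are hypothetical, the use of the ENA Portal API `filereport` endpoint is an assumption about how file locations might be looked up, and the connection count of 4 is an arbitrary example value.

```python
from urllib.parse import urlencode

# ENA Portal API file report endpoint (assumed lookup route, not confirmed
# to be what the script uses).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession, fields=("run_accession", "fastq_ftp", "fastq_md5")):
    """Build a file report URL that returns one TSV row per run."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def fastq_urls(fastq_ftp_field, scheme="ftp"):
    """Split a semicolon-separated `fastq_ftp` value into full URLs."""
    return [f"{scheme}://{path}" for path in fastq_ftp_field.split(";") if path]


def aria2c_command(url, output_dir, connections=4):
    """Build an aria2c command line for a single file.

    One command per URL: several URLs in one plain aria2c call are treated
    as mirrors of the same file.
    """
    return [
        "aria2c",
        "--dir", str(output_dir),
        "--max-connection-per-server", str(connections),
        "--split", str(connections),
        "--continue=true",
        url,
    ]
```

Each command list can be passed to `subprocess.run` once the URLs are known. Building the command separately from running it keeps the sketch testable without network access.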

Requirements

To install the necessary Python packages, run:

    pip install -r requirements.txt

The script depends on the following:

python >= 3.10
aria2 >= 1.36.0
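
Before a long download it can help to confirm that both requirements are present. The sketch below is one possible check, not part of the script: the function name `missing_requirements` is hypothetical, and it only verifies that an `aria2c` executable is on the PATH, not that its version is at least 1.36.0.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version from the requirements list


def missing_requirements(python_version=None, which=shutil.which):
    """Return a list of problems; an empty list means both checks passed."""
    version = tuple(python_version) if python_version else tuple(sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        found = ".".join(str(part) for part in version)
        problems.append(f"Python >= 3.10 required, found {found}")
    # aria2 installs its command-line client under the name "aria2c".
    if which("aria2c") is None:
        problems.append("aria2c executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

The `python_version` and `which` parameters exist only so the check can be exercised without changing the environment.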

Options

 -if, --ifile DIR_TO_FILE                  Path to a text file (*.txt, *.csv, etc.) listing accession numbers,
                                           one accession per line.
 -il, --ilist STRING                       Accession numbers given as a comma-separated string with no spaces,
                                           for example: accession_1,accession_2,...,accession_N.
 -o, --output DIR                          Path to the output directory.
 -m, --meta_file NAME                      Name of the output metadata `*.csv` file.
 -op, --download_option NUMBER             Download option (default = 0).
                                           0: Download both `fastq.gz` files and metadata.
                                           1: Download only metadata.
                                           2: Download only `fastq.gz` files.
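
To make the option semantics concrete, the following sketch rebuilds the flags above with `argparse` and shows how the two accession inputs could be merged into one list. It is an illustration under assumptions, not the script's actual source: `build_parser` and `collect_accessions` are hypothetical names, and skipping blank lines in the input file is an assumed behaviour.

```python
import argparse
from pathlib import Path


def build_parser():
    """Recreate the documented command-line options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Download FASTQ files and metadata from ENA.")
    parser.add_argument("-if", "--ifile", metavar="DIR_TO_FILE",
                        help="text file with one accession per line")
    parser.add_argument("-il", "--ilist", metavar="STRING",
                        help="comma-separated accessions, no spaces")
    parser.add_argument("-o", "--output", metavar="DIR",
                        help="path to the output directory")
    parser.add_argument("-m", "--meta_file", metavar="NAME",
                        help="name of the output metadata CSV file")
    parser.add_argument("-op", "--download_option", metavar="NUMBER", type=int,
                        choices=(0, 1, 2), default=0,
                        help="0: FASTQ and metadata, 1: metadata only, 2: FASTQ only")
    return parser


def collect_accessions(ifile=None, ilist=None):
    """Merge accessions from a file (one per line) and a comma-separated string."""
    accessions = []
    if ifile:
        lines = Path(ifile).read_text().splitlines()
        # Assumption: blank lines are ignored rather than treated as errors.
        accessions += [line.strip() for line in lines if line.strip()]
    if ilist:
        accessions += [item for item in ilist.split(",") if item]
    return accessions
```

For example, `build_parser().parse_args(["-il", "accession_1,accession_2", "-op", "1"])` yields `download_option == 1`, and passing the resulting `ilist` to `collect_accessions` returns the two accessions as a list.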

How to cite

@misc{ENAsequences_download,
  author = {Vinh Chau and Ton Ngoc Minh Quan},
  title = {ENAsequences-Downloader},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/chauvinhtth13/ENAsequences-Downloader}},
}

Feel free to explore the different versions and choose the one that best fits your needs. Happy coding!
