Zinc Database Downloader & Merger
Because the Zinc15 and Zinc20 database websites limit downloads of large dataset files in .sdf, .smi, and other formats, this tool lets you download any dataset you want based on a list file of Zinc IDs.
- requests
- colorama
- tqdm
Zinc ID list:
To create your Zinc ID list, download the CSV file of your preferred dataset and convert it to a TXT list file, or create the TXT list file manually. The Zinc ID list file must be in .txt format. For the expected layout of the final file, see the list.txt file included in the project files.
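The CSV-to-TXT conversion described above could be sketched roughly as follows. This is only an illustration, not part of the project: the function name `csv_to_id_list`, the column name `zinc_id`, and the one-ID-per-line output layout are all assumptions, so compare the result with the bundled list.txt before relying on it.

```python
import csv


def csv_to_id_list(csv_path, txt_path, column="zinc_id"):
    """Copy the IDs found in one CSV column into a text file, one per line.

    Hypothetical helper: the column name and the output layout are assumed,
    not taken from the project. Returns the number of IDs written.
    """
    seen = set()
    ids = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            zinc_id = (row.get(column) or "").strip()
            # Skip blank cells and duplicates while keeping the original order.
            if zinc_id and zinc_id not in seen:
                seen.add(zinc_id)
                ids.append(zinc_id)
    with open(txt_path, "w", encoding="utf-8") as out:
        for zinc_id in ids:
            out.write(zinc_id + "\n")
    return len(ids)
```

For example, `csv_to_id_list("substances.csv", "list.txt")` would read a hypothetical substances.csv and write list.txt next to it.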
Use the following command to run the script:
python zinc_downloader.py
- Choose between 15 & 20:
15
which refers to the https://zinc15.docking.org database
20
which refers to the https://zinc20.docking.org database
- Choose between SDF, SMI, CSV, XML or JSON:
sdf
which refers to the Structured Data File
smi
which refers to the SMILES File
csv
which refers to the comma-separated values format
xml or json
which refers to the other supported formats
- Default is list.txt
- It's recommended to keep the list file next to the main script to prevent errors
- Choose between yes and no.
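As a rough illustration of how the database and format choices above might map to a download address, here is a minimal sketch. The helper name `build_url` is hypothetical, and the `/substances/<ID>.<format>` path is an assumption about how the sites expose per-molecule files, not something taken from this project; the actual script may build its requests differently.

```python
# Base addresses as listed in the prompts above.
BASE_URLS = {
    "15": "https://zinc15.docking.org",
    "20": "https://zinc20.docking.org",
}
FORMATS = {"sdf", "smi", "csv", "xml", "json"}


def build_url(version, zinc_id, fmt):
    """Return an assumed per-molecule download URL for the given choices."""
    version = str(version).strip()
    fmt = fmt.strip().lower()
    if version not in BASE_URLS:
        raise ValueError(f"database must be 15 or 20, got {version!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Assumed path layout (hypothetical): <base>/substances/<ID>.<format>
    return f"{BASE_URLS[version]}/substances/{zinc_id.strip()}.{fmt}"
```

Validating the two answers before any request is sent means a typo in the database or format prompt fails immediately instead of producing a failed download for every ID in the list.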
When it finishes, the script creates two separate folders in its own folder:
dataset
which contains all downloaded dataset molecules
merged_dataset
which contains one file created by merging the molecules in the dataset folder
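The merge step could look something like the sketch below, assuming it amounts to concatenating the downloaded files into a single file. The function name `merge_dataset` and the output name `merged.<format>` are hypothetical; only the dataset and merged_dataset folder names come from the description above.

```python
from pathlib import Path


def merge_dataset(dataset_dir="dataset", merged_dir="merged_dataset", fmt="sdf"):
    """Concatenate every *.<fmt> file in dataset_dir into one merged file.

    Hypothetical helper. Returns the output path and the number of files merged.
    """
    files = sorted(Path(dataset_dir).glob(f"*.{fmt}"))
    out_dir = Path(merged_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"merged.{fmt}"  # assumed output file name
    with open(out_path, "w", encoding="utf-8") as out:
        for path in files:
            text = path.read_text(encoding="utf-8")
            out.write(text)
            # Make sure each file ends with a newline so records do not run together.
            if text and not text.endswith("\n"):
                out.write("\n")
    return out_path, len(files)
```

Plain concatenation suits SDF, where each record ends with a `$$$$` line, and SMI, where each molecule is one line. For CSV it would repeat the header row, and for XML or JSON the result would not be a valid document, so those formats would need format-aware merging.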
For further support and any questions, you can contact me via: