Here are some notes after trying out USDA's GenoFLU tool to assign lineages (per-segment) and genotypes (per-genome):
This tool uses BLAST to identify North American H5NX genomes in the 2.3.4.4b clade from a curated database. Pre-defined genotypes are cross-referenced with the top segment identifications, and a genotype is assigned.
The tool is slightly inconvenient to use in the context of our pipeline as it requires a single FASTA file per strain (genome), so we'd have to either create these on the fly, run GenoFLU, extract the results, and delete the temporary files, or modify the tool to be more ergonomic for our usage. I couldn't install it via conda, so I used the provided Docker image.
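For the "create these on the fly" option, a rough sketch of the loop might look like the following. It is only an illustration under stated assumptions: the function names `build_strain_fasta` and `genotype_strains` are made up here, the per-segment files are assumed to be named `sequences_<segment>.fasta` with headers that are exactly the strain name, and plain `awk` stands in for `seqkit` so the sketch has no extra dependencies. Collecting GenoFLU's output is left out because I'm not assuming where it writes its results.

```bash
#!/usr/bin/env bash
# Sketch only: build one temporary FASTA per strain, run a genotyping command
# on it, and clean up afterwards. Function names and the directory layout are
# assumptions for illustration, not part of GenoFLU or the pipeline.

SEGMENTS=(pb2 pb1 pa ha np na mp ns)

# Write all segments of one strain to a single FASTA file.
# Assumes per-segment files named sequences_<segment>.fasta whose header
# lines are exactly ">" followed by the strain name.
build_strain_fasta() {
  local strain="$1" seq_dir="$2" out="$3" s
  : > "$out"
  for s in "${SEGMENTS[@]}"; do
    [ -f "${seq_dir}/sequences_${s}.fasta" ] || continue
    # Copy the matching record and append "/<segment>" to its header.
    awk -v id="$strain" -v seg="$s" '
      /^>/ { keep = (substr($0, 2) == id); if (keep) print ">" id "/" seg; next }
      keep { print }
    ' "${seq_dir}/sequences_${s}.fasta" >> "$out"
  done
}

# Run a command once per strain listed in a file (one strain name per line).
# The command receives the temporary FASTA path as its last argument.
genotype_strains() {
  local strain_list="$1" seq_dir="$2" tmp strain
  shift 2
  tmp="$(mktemp -d)"
  while IFS= read -r strain; do
    [ -n "$strain" ] || continue
    build_strain_fasta "$strain" "$seq_dir" "${tmp}/data.fasta"
    # stdin is redirected so the command cannot swallow the strain list
    "$@" "${tmp}/data.fasta" < /dev/null
  done < "$strain_list"
  rm -rf "$tmp"
}
```

Assuming `genoflu.py` were on the `PATH`, this could be called as `genotype_strains strains.txt ../../data/gisaid genoflu.py -f`, where `strains.txt` is a hypothetical file of strain names. Running through the Docker image instead would also require mounting the temporary directory into the container.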
Example usage
mkdir results/genoflu
cd results/genoflu
# create a FASTA file for a specific strain
echo 'A/muteswan/Austria/23169070001/2023' > id.txt
SEGMENTS=("pb2" "pb1" "pa" "ha" "np" "na" "mp" "ns")
echo > data.fasta
for s in "${SEGMENTS[@]}"; do
  seqkit grep -nf id.txt ../../data/gisaid/sequences_${s}.fasta | seqkit replace -p '$' -r "/${s}" >> data.fasta
done

# run GenoFLU
docker container run --rm -it --mount type=bind,src=.,target=/avian-flu \
quay.io/biocontainers/genoflu:1.03--hdfd78af_0 \
bash -c "cd avian-flu/results/genoflu && genoflu.py -f data.fasta"
Sample results:
A/muteswan/Austria/23169070001/2023
Genotype: Not assigned: Only 4 segments >98% match found of total 8 segments in input file
Lineages: PB1:ea3, HA:ea3, NP:ea6, MP:ea3

A/carrioncrow/Hokkaido/B081/2024/HA
Genotype: A3
Lineages: PB2:ea3, PB1:ea3, PA:ea3, HA:ea3, NP:ea3, NA:ea3, MP:ea3, NS:ea3

A/Dairycattle/Kansas/5/202 (NCBI)
Genotype: B3.13
Lineages: PB2:am2.2, PB1:am4, PA:ea1, HA:ea1, NP:am8, NA:ea1, MP:ea1, NS:am1.1
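If results like these were to be fed back into the pipeline, one option is to flatten them into a table. The sketch below assumes each record is a strain-name line followed by a `Genotype:` line and a `Lineages:` line; that layout is how the results are summarised in this comment and is not guaranteed to match GenoFLU's own output files. The function name `results_to_tsv` and the column names are hypothetical.

```bash
# Sketch: turn "strain / Genotype: / Lineages:" records into a TSV table.
# The three-line record layout is an assumption for illustration; it is not
# guaranteed to match GenoFLU's own output files.
results_to_tsv() {
  awk '
    BEGIN { OFS = "\t"; print "strain", "genotype", "n_segments", "lineages" }
    # Strip the 10-character "Genotype: " / "Lineages: " prefixes.
    /^Genotype: / { genotype = substr($0, 11); next }
    /^Lineages: / {
      lineages = substr($0, 11)
      # Count how many segments were assigned a lineage.
      n = split(lineages, parts, /, */)
      print strain, genotype, n, lineages
      next
    }
    # Any other non-empty line is taken to be a strain name.
    NF { strain = $0 }
  ' "$@"
}
```

Called as `results_to_tsv results.txt` (a hypothetical file holding records like the ones above), this prints one row per strain. The `n_segments` column makes partial matches, such as the 4-of-8 case in the first record, easy to filter.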