ORF calling works very differently for eukaryotes than for bacteria. Instead of using eukaryotic proteins as references, run prodigal with prokaryotic and metagenomic settings on eukaryotic reference genomes (not necessary for viral genomes) and use the resulting predictions as references. This is more likely to mimic what happens to eukaryotic contigs in metagenome analysis pipelines.
randomly cut the reference genomes into chunks of ~5 kb, and also cut at stretches of "N"s (discard chunks that end up smaller than 200 bp)
run prodigal, dereplicate the predicted proteins (95% identity? or 90% identity?) to reduce database size. always keep the largest representative --> protein diamond db: eukaryotic-refprotein-db
extract all remaining chunks without any predicted CDS (non-coding reference chunks), dereplicate (95% or 90% identity?). always keep the largest representative --> nucleotide blastn-db: eukaryotic-noncoding-chunks-db
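
The chunking step and the selection of non-coding chunks could be sketched roughly as below. This is only a sketch under stated assumptions, not a settled implementation: the function names, the `jitter` parameter (chunk lengths drawn uniformly from 4-6 kb as one reading of "randomly cut into ~5kb"), the fixed seed, and the chunk naming scheme are all hypothetical. The second helper assumes prodigal's usual protein FASTA headers, where each protein is named after its input sequence plus a `_<gene number>` suffix. Dereplication and database building are left to external tools and are not shown.

```python
import random
import re


def chunk_sequence(seq, mean_len=5000, jitter=1000, min_len=200, seed=0):
    """Cut one sequence into chunks of roughly mean_len bp.

    Assumptions (not from the issue): chunk lengths are drawn uniformly
    from [mean_len - jitter, mean_len + jitter], and any run of one or
    more "N" characters counts as a stretch to cut at.
    """
    rng = random.Random(seed)
    chunks = []
    # Split at every stretch of Ns so that no chunk contains an N.
    for segment in re.split(r"[Nn]+", seq):
        pos = 0
        while pos < len(segment):
            size = rng.randint(mean_len - jitter, mean_len + jitter)
            chunk = segment[pos:pos + size]
            pos += size
            # Discard chunks that end up smaller than min_len.
            if len(chunk) >= min_len:
                chunks.append(chunk)
    return chunks


def noncoding_chunk_ids(chunk_ids, protein_headers):
    """Return the chunk IDs that have no predicted CDS.

    Assumes each protein header starts with "<chunk_id>_<gene number>",
    optionally preceded by ">" and followed by whitespace-separated fields.
    """
    coding = set()
    for header in protein_headers:
        name = header.lstrip(">").split()[0]
        coding.add(name.rsplit("_", 1)[0])
    return [cid for cid in chunk_ids if cid not in coding]
```

The non-coding chunks returned by the second helper would then go on to dereplication before being built into the nucleotide database.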