
Primer Adapter Considerations

shandley edited this page Jun 16, 2021 · 1 revision

Primers and Adapter Considerations

Rigorous primer and adapter removal is critical for virome analysis. Diligent removal of primers and adapters helps prevent false-positive taxonomic assignment to reference database sequences that may not themselves have been properly quality controlled to remove common adapters and primers. Assembly may also suffer when k-mer graphs are built around common primer/adapter sequences instead of biological sequence. Primer and adapter removal is relatively fast, so there is no reason not to be rigorous in contaminant removal.

Hecatomb Default Primers and Adapters

Primers and adapters (referred to jointly as non-biological contamination) used in the default hecatomb workflow are based on the RdA/B protocol used in the Handley lab. This protocol has proven effective at generating libraries representative of all 7 Baltimore classifications of viruses (single- and double-stranded RNA and DNA viral genomes). There are two main sources of non-biological contaminants when using this protocol.

- PrimerB: This is the primer used for Round B amplification of DNA and cDNA. We use a set of 24 primers, all 16 bases in length, which have been diversity balanced to assist in Illumina phasing.

- Adapters: Used to attach our fragments of interest to the Illumina flow cell (the glass slide). Required for any Illumina-based sequencing. The default hecatomb workflow uses the adapters from the [NEBNext DNA library kits](https://www.neb.com/nebnext-ultra-ii-fns-dna/nebnext-ultra-ii-for-dna-library-prep?gclid=EAIaIQobChMIg7GQyfeK8QIVf21vBB3FjAcbEAAYASABEgLclvD_BwE) instead of the more traditionally used TruSeq library adapters. We have found these kits to be much more cost effective than the TruSeq kits.

What to do if you use different primers and adapters?

What if I just have simple metagenomic sequencing without all those crazy primers and adapter-primer chimeras to deal with?

If you do not follow the RdA/B protocol and just have metagenomic sequencing, you should not have the myriad of non-biological contaminants that arise during the multiple steps of RdA/B amplification and ligation (e.g. primer-adapter chimeras, rogue primers and adapters, etc.). Probably the easiest thing to do in this case is to QC your sequences prior to running the hecatomb workflow. There are plenty of tools to do this. We recommend PRINSEQ, as it has ample options to suit almost any QC need and provides detailed sequence statistics. Alternatively, tools such as Trimmomatic will probably work just fine.

Your custom QC'd sequences should process through the full hecatomb workflow without issue (provided you aren't doing anything too strange). The only real sacrifice is that you will run through several steps designed to remove the default primerB and NEBNext sequences, which are unnecessary for your sequences. However, these steps are fast (just a few seconds per step on most systems) and should not modify your data. You can always check the output logs to see whether these steps unexpectedly removed or modified your sequences, but it is unlikely. Alternatively, you can customize the hecatomb Snakefile and 00_preprocessing.smk to skip these steps.

What to do if I use the RdA/B protocol but use different primers or adapters?

All you will need to do is replace the current default files (primerB, etc.) with files containing your own sequences. You can keep the same names (i.e. your amplification primers should be in a file called primerB.fa). There are several files you will need to consider:

- primerB.fa: This is the main primer file.
- rc_primerB_ad6.fa: These are the reverse complements of the main primers (primerB.fa), with 6 bases of the reverse complement of the adapter sequence attached.
- nebnext_adapters.fa: Basic NEBNext adapter file. You can easily swap this out for your lab's own adapters.

So really the only work you may have to do, other than replacing and possibly renaming your adapter and primer files, is to create the rc_primerB_ad6.fa file. Then everything should work splendidly.
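One way to generate the rc_primerB_ad6.fa records from your own primers is sketched below in Python. This is a hedged illustration, not the script used to build the default hecatomb files: the function names are hypothetical, and the page does not say which 6 adapter bases are used or which end of the primer they are attached to, so both are exposed as parameters and the defaults are assumptions. Compare the output against the default rc_primerB_ad6.fa before relying on it.

```python
# Sketch only. Which 6 adapter bases are used and where they are attached
# are ASSUMPTIONS; check against the default rc_primerB_ad6.fa.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(path):
    """Read a FASTA file into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:], []
            else:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def build_rc_primer_records(primers, adapter, n_adapter_bases=6,
                            adapter_on_3prime=True):
    """Reverse-complement each primer and attach adapter bases.

    The attached bases come from the reverse complement of `adapter`,
    taken from the end that sits next to the primer junction.
    Both choices are assumptions made for this sketch.
    """
    adapter_rc = reverse_complement(adapter)
    records = []
    for name, seq in primers:
        primer_rc = reverse_complement(seq)
        if adapter_on_3prime:
            combined = primer_rc + adapter_rc[:n_adapter_bases]
        else:
            combined = adapter_rc[-n_adapter_bases:] + primer_rc
        records.append((f"rc_{name}_ad{n_adapter_bases}", combined))
    return records


def write_fasta(records, path):
    """Write (name, sequence) tuples to a FASTA file, one line per sequence."""
    with open(path, "w") as handle:
        for name, seq in records:
            handle.write(f">{name}\n{seq}\n")


# Example usage (file names from this page; the adapter string is yours):
# primers = read_fasta("primerB.fa")
# write_fasta(build_rc_primer_records(primers, "YOUR_ADAPTER_SEQUENCE"),
#             "rc_primerB_ad6.fa")
```

Keeping the orientation and base count as parameters makes it easy to regenerate the file once you have confirmed the layout of the default records.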

The only other possible consideration is primer length. For example, in the rule remove_5prime_primer in 00_preprocessing.smk we set k=16, which is the length of our primerB sequences. It might be worth going through and customizing some of these settings if your primers are a different length.
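Before editing those settings, it can help to confirm how long your primers actually are. The Python sketch below tallies sequence lengths in a primer FASTA file and reports whether they are uniform. The function names are hypothetical, and the sketch deliberately makes no recommendation about which k to use for mixed-length primers, since that depends on how the preprocessing rules use k.

```python
from collections import Counter


def fasta_sequences(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def primer_length_summary(lines):
    """Return {length: number of primers with that length}."""
    return dict(Counter(len(seq) for _, seq in fasta_sequences(lines)))


def uniform_primer_length(lines):
    """Return the shared primer length, or None if lengths differ or no primers."""
    lengths = primer_length_summary(lines)
    if len(lengths) == 1:
        return next(iter(lengths))
    return None


# Example usage with the primer file named on this page:
# with open("primerB.fa") as handle:
#     print(primer_length_summary(handle))
```

If the summary shows a single length other than 16, that is the value to compare against the k=16 setting mentioned above.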