This program assigns base sequences to the scaffold and staple strands of a cadnano design, even if the cadnano file contains multiple separate scaffolds.
The program can be run using the following command:
python3 seq_designer.py <cadnano json file> <scaffold file>
For example:
python3 seq_designer.py json_files/test_virtual.json scaffold_files/M13mp18
The program requires two arguments:
- cadnano .json file
- scaffold sequence file - this sequence is assigned to the longest scaffold strand in the cadnano .json file. The sequences of all other scaffold strands are pseudorandomly generated.
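The assignment rule described above can be sketched roughly as follows. This is only an illustration, not the code in seq_designer.py: the function name, the fixed seed, the uniform choice over A/C/G/T, and the decision to use the first bases of the scaffold file for the longest strand are all assumptions, and the input string is a short stand-in rather than a real scaffold file.

```python
import random

def assign_scaffold_sequences(lengths, scaffold_seq, seed=0):
    """Return one sequence per scaffold strand (hypothetical helper).

    lengths      -- length of each scaffold strand in the design
    scaffold_seq -- sequence read from the scaffold file
    """
    if not lengths:
        return []
    rng = random.Random(seed)  # assumption: seeded for reproducible output
    # Index of the longest scaffold strand (first one wins on a tie).
    longest = max(range(len(lengths)), key=lengths.__getitem__)
    if lengths[longest] > len(scaffold_seq):
        raise ValueError("scaffold sequence is shorter than the longest scaffold")
    sequences = []
    for i, n in enumerate(lengths):
        if i == longest:
            # Assumption: the strand takes the first n bases of the file.
            sequences.append(scaffold_seq[:n])
        else:
            # All other scaffolds get a pseudorandom sequence.
            sequences.append("".join(rng.choice("ACGT") for _ in range(n)))
    return sequences

# Two scaffolds of length 9 and 11; the second is the longest.
seqs = assign_scaffold_sequences([9, 11], "AATGCTACTACTATT")
print(seqs)
```

Here the 11-base scaffold receives the first 11 bases of the supplied sequence, and the 9-base scaffold receives a pseudorandom sequence.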
The program generates three output files:
- scaffolds.txt - contains the sequence of each scaffold strand, together with its start location, end location, and length.
- staples.txt - contains the sequence of each staple strand, together with its start location, end location, and length.
- visualized_sequence.txt - contains a plain-text visualization of the scaffold and staple sequences, laid out like the visual representation in cadnano. This can be useful for checking the final results.
Here is an example of the outputs for a small cadnano file (json_files/small_twobreak.json with the M13mp18 scaffold):
scaffolds.txt:
Start,End,Sequence,Length
1[6],0[5],GTGATGATT,9
0[6],1[7],AATGCTACTAC,11
staples.txt:
Start,End,Sequence,Length
1[2],1[11],ATCACGTAGT,10
0[11],0[2],AGCATTAATC,10
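The comma-separated listings above can be read back with a few lines of Python. This is a minimal sketch under some assumptions: the files look exactly like the listings (a header row followed by one strand per row), and a location such as 1[6] is taken to mean helix 1, base index 6. The helper names are hypothetical and not part of seq_designer.py.

```python
import csv
import io
import re

# Assumption: "1[6]" means helix 1, base index 6.
POSITION = re.compile(r"^(\d+)\[(\d+)\]$")

def parse_position(text):
    """Split a location like '1[6]' into (helix, base)."""
    match = POSITION.match(text)
    if match is None:
        raise ValueError(f"unexpected position format: {text!r}")
    return int(match.group(1)), int(match.group(2))

def read_strands(handle):
    """Read rows of Start,End,Sequence,Length and check the Length column."""
    strands = []
    for row in csv.DictReader(handle):
        sequence = row["Sequence"]
        if len(sequence) != int(row["Length"]):
            raise ValueError(f"length mismatch for {sequence}")
        start = parse_position(row["Start"])
        end = parse_position(row["End"])
        strands.append((start, end, sequence))
    return strands

# The scaffold rows from the example, inlined so the sketch runs on its own.
sample = io.StringIO(
    "Start,End,Sequence,Length\n"
    "1[6],0[5],GTGATGATT,9\n"
    "0[6],1[7],AATGCTACTAC,11\n"
)
strands = read_strands(sample)
print(strands)
```

To read a real output file, pass an open file object instead of the inlined sample, for example open("staples.txt", newline="").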
visualized_sequence.txt:
Scaffold 0 |--GATTAATGCT---------|
Staple 0   |--CTAATTACGA---------|
Staple 1   |--ATCACGTAGT---------|
Scaffold 1 |--TAGTGCATCA---------|
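One way to use the visualization for checking results is to confirm that each staple row is the base-by-base complement of the scaffold row on the same helix. The sketch below does this for the rows shown above. It assumes that "-" marks an empty position and that paired rows are drawn column by column in the same direction; the function name is hypothetical.

```python
# Map each base to its Watson-Crick partner; "-" (empty position) stays "-".
COMPLEMENT = str.maketrans("ACGT-", "TGCA-")

def is_complementary(scaffold_row, staple_row):
    """True if every column of staple_row pairs with scaffold_row."""
    return (
        len(scaffold_row) == len(staple_row)
        and scaffold_row.translate(COMPLEMENT) == staple_row
    )

# Row contents between the "|" delimiters, copied from the example.
pairs = [
    ("--GATTAATGCT---------", "--CTAATTACGA---------"),  # Scaffold 0 / Staple 0
    ("--TAGTGCATCA---------", "--ATCACGTAGT---------"),  # Scaffold 1 / Staple 1
]
results = [is_complementary(scaffold, staple) for scaffold, staple in pairs]
print(results)  # [True, True]
```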