Commit

Merge branch 'master' into addlusplus
rnmitchell committed Oct 31, 2023
2 parents 04953e6 + 632174e commit 1954c19
Showing 16 changed files with 10,010 additions and 51 deletions.
18 changes: 7 additions & 11 deletions README.md
@@ -222,11 +222,11 @@ One additional argument can be provided with ```lusstr config --snps```:
___
## Running the lusSTR SNP workflow

The lusSTR SNP workflow consists of three steps:
The lusSTR SNP workflow consists of two steps:
(1) ```format```: formatting input and calling alleles if using STRait Razor data
(2) ```convert```: applying analytical threshold; converting data to correct format for input into EuroForMix;
(2) ```convert```: applying analytical threshold and converting data to correct format for input into EuroForMix

Any or all steps can be run. In order to run all three steps, the following command can be used:
Either the first step, format, or both steps can be run. In order to run both steps, the following command can be used:
```
lusstr snps all
```
@@ -238,16 +238,12 @@ The default working directory is the current directory.
lusstr snps all -w lusstr_files/
```

Individual steps can also be run
Individual format step can also be run
```
lusstr snps format -w lusstr_files/
```

```
lusstr snps convert -w lusstr_files/
```

**In order to run the ```convert``` step, the appropriately formatted ```.csv``` file containing the sequences normally created in the ```format``` step must be present in the working directory. See the below ```Usage``` section for specific information about that file (required columns, etc.).**
**In order to run the ```all``` step, the appropriately formatted ```.csv``` file containing the sequences normally created in the ```format``` step must be present in the working directory. See the below ```Usage``` section for specific information about that file (required columns, etc.).**

----

@@ -256,7 +252,7 @@ lusstr snps convert -w lusstr_files/

### Formatting input for SNP data

If inputting data from either the UAS Sample Details Report/Phenotype Report/Sample Report or STRait Razor output, the user must first invoke the ```format``` step to extract necessary information and format for the ```convert``` step.
If inputting data from either the UAS Sample Details Report/Phenotype Report/Sample Report or STRait Razor output, the user must first invoke the ```format``` step to extract necessary information and format for the final ```convert``` step.

The ```format``` command removes unnecessary rows/columns and outputs a table in CSV format containing the following columns:
* Sample ID
@@ -271,7 +267,7 @@ The ```format``` command removes unnecessary rows/columns and outputs a table in

### Converting to appropriately formatted files for use in EuroForMix

This step will convert the table generated in the ```format``` step into the correct format for use in EuroForMix. An analytical threshold can be applied (this is especially useful for data analyzed using STRait Razor) in this step.
This final step will convert the table generated in the ```format``` step into the correct format for use in EuroForMix. An analytical threshold can be applied (this is especially useful for data analyzed using STRait Razor) in this step.

If any samples are to be used as references, their IDs can be provided in the config file to create a separate file appropriately formatted for use as reference profiles in EFM. Any samples not specified as references are assumed to be evidence samples and will be formatted as such.
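The reference/evidence split described above can be sketched as follows. This is only an illustration under stated assumptions, not lusSTR's actual implementation: the function name `split_by_reference` is hypothetical, and the `Sample Name` column is borrowed from the header of the test CSV files in this commit.

```python
def split_by_reference(rows, reference_ids):
    """Partition rows into (references, evidence) by sample ID.

    Hypothetical helper: any sample whose ID is not listed as a
    reference is treated as an evidence sample.
    """
    reference_ids = set(reference_ids)
    references = [r for r in rows if r["Sample Name"] in reference_ids]
    evidence = [r for r in rows if r["Sample Name"] not in reference_ids]
    return references, evidence


# Sample IDs and marker taken from the test data in this commit.
rows = [
    {"Sample Name": "Kin_pos_reference", "Marker": "rs312262906"},
    {"Sample Name": "Kin_pos_1ng", "Marker": "rs312262906"},
]
references, evidence = split_by_reference(rows, ["Kin_pos_reference"])
```

With no reference IDs supplied, every row falls into the evidence list, which matches the README's statement that unspecified samples are assumed to be evidence.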

2 changes: 1 addition & 1 deletion lusSTR/data/snp_data.json
@@ -623,7 +623,7 @@
"ReverseCompNeeded": "Yes",
"Coord": 50
},
"rs312262906_N29insA": {
"rs312262906": {
"Type": "p",
"Alleles": ["C", "insA"],
"ReverseCompNeeded": "No",
Binary file not shown.
@@ -1911,7 +1911,6 @@ Kin_pos_1ng_set1 rs543502 3 4 42.0 33.0 75.0
Kin_pos_1ng_set1 rs6671673 4 2 45.0 30.0 75.0
Kin_pos_1ng_set1 rs7176165 4 75.0 75.0
Kin_pos_1ng_set1 rs7706034 1 75.0 75.0
Kin_pos_1ng_set1 N29insA 2 76.0 76.0
Kin_pos_1ng_set1 rs1009930 1 76.0 76.0
Kin_pos_1ng_set1 rs10167782 4 76.0 76.0
Kin_pos_1ng_set1 rs10196560 2 76.0 76.0
@@ -1949,6 +1948,7 @@ Kin_pos_1ng_set1 rs2723696 1 76.0 76.0
Kin_pos_1ng_set1 rs2816999 4 2 40.0 36.0 76.0
Kin_pos_1ng_set1 rs295340 1 76.0 76.0
Kin_pos_1ng_set1 rs3018845 2 4 44.0 32.0 76.0
Kin_pos_1ng_set1 rs312262906 2 76.0 76.0
Kin_pos_1ng_set1 rs340828 3 1 46.0 30.0 76.0
Kin_pos_1ng_set1 rs369005 2 4 39.0 37.0 76.0
Kin_pos_1ng_set1 rs3923451 4 76.0 76.0
2 changes: 1 addition & 1 deletion lusSTR/tests/data/kinsnps/evidence.csv
@@ -1,5 +1,4 @@
Sample Name Marker Allele 1 Allele 2 Height 1 Height 2
Kin_pos_1ng N29insA 2 76.0
Kin_pos_1ng rs1000022 2 4 176.0 142.0
Kin_pos_1ng rs1000137 4 2 66.0 36.0
Kin_pos_1ng rs10002268 4 365.0
@@ -4521,6 +4520,7 @@ Kin_pos_1ng rs3117915 1 3 85.0 117.0
Kin_pos_1ng rs3118520 3 554.0
Kin_pos_1ng rs312154 4 3 69.0 51.0
Kin_pos_1ng rs312185 1 2 152.0 146.0
Kin_pos_1ng rs312262906 2 76.0
Kin_pos_1ng rs312272 2 32.0
Kin_pos_1ng rs3124028 4 96.0
Kin_pos_1ng rs3124041 3 221.0
4 changes: 2 additions & 2 deletions lusSTR/tests/data/kinsnps/multiplerefs.csv
@@ -1,5 +1,4 @@
Sample Name Marker Allele 1 Allele 2
Kin_pos_reference N29insA 2 2
Kin_pos_reference rs1000022 2 4
Kin_pos_reference rs1000137 4 2
Kin_pos_reference rs10002268 4 4
@@ -4521,6 +4520,7 @@ Kin_pos_reference rs3117915 1 3
Kin_pos_reference rs3118520 3 3
Kin_pos_reference rs312154 4 3
Kin_pos_reference rs312185 1 2
Kin_pos_reference rs312262906 2 2
Kin_pos_reference rs312272 2 2
Kin_pos_reference rs3124028 4 4
Kin_pos_reference rs3124041 3 3
@@ -9235,7 +9235,6 @@ Kin_pos_reference rs999717 1 2
Kin_pos_reference rs999813 4 3
Kin_pos_reference rs9999446 1 3
Kin_pos_reference rs9999662 1 1
Kin_pos_1ng N29insA 2 2
Kin_pos_1ng rs1000022 2 4
Kin_pos_1ng rs1000137 4 2
Kin_pos_1ng rs10002268 4 4
@@ -13757,6 +13756,7 @@ Kin_pos_1ng rs3117915 1 3
Kin_pos_1ng rs3118520 3 3
Kin_pos_1ng rs312154 4 3
Kin_pos_1ng rs312185 1 2
Kin_pos_1ng rs312262906 2 2
Kin_pos_1ng rs312272 2 2
Kin_pos_1ng rs3124028 4 4
Kin_pos_1ng rs3124041 3 3
2 changes: 1 addition & 1 deletion lusSTR/tests/data/kinsnps/reference.csv
@@ -1,5 +1,4 @@
Sample Name Marker Allele 1 Allele 2
Kin_pos_reference N29insA 2 2
Kin_pos_reference rs1000022 2 4
Kin_pos_reference rs1000137 4 2
Kin_pos_reference rs10002268 4 4
@@ -4521,6 +4520,7 @@ Kin_pos_reference rs3117915 1 3
Kin_pos_reference rs3118520 3 3
Kin_pos_reference rs312154 4 3
Kin_pos_reference rs312185 1 2
Kin_pos_reference rs312262906 2 2
Kin_pos_reference rs312272 2 2
Kin_pos_reference rs3124028 4 4
Kin_pos_reference rs3124041 3 3
4 changes: 2 additions & 2 deletions lusSTR/tests/data/kinsnps/snps_kin_all.txt
@@ -10225,6 +10225,8 @@ Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs3124028 0 G G Kintelligence Contains untyped allele
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312272 32 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312272 10 T T Kintelligence Contains untyped allele
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312262906 76 C C Phenotype
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312262906 0 A A Phenotype Contains untyped allele
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312185 152 A A Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312185 146 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312154 69 T T Kintelligence
@@ -20075,5 +20077,3 @@ Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs100
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs1000137 36 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs1000022 176 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs1000022 142 T T Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method N29insA 76 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method N29insA 0 A A Kintelligence Contains untyped allele
2 changes: 1 addition & 1 deletion lusSTR/tests/data/kinsnps/snps_kin_filtered.txt
@@ -6692,6 +6692,7 @@ Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs3124041 221 G G Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs3124028 96 T T Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312272 32 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312262906 76 C C Phenotype
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312185 152 A A Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312185 146 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs312154 69 T T Kintelligence
@@ -13115,4 +13116,3 @@ Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs100
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs1000137 36 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs1000022 176 C C Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method rs1000022 142 T T Kintelligence
Kin_pos_reference Kintelligence Test Verogen Kintelligence Analysis Method N29insA 76 C C Kintelligence