update readme
aaronleopold committed Mar 31, 2021
1 parent aea9c18 commit 6962c15
Showing 4 changed files with 17 additions and 14 deletions.
15 changes: 14 additions & 1 deletion README.md
@@ -10,7 +10,20 @@ $ pip3 install --user -r requirements.txt

 ## Scripts

-There is a selection of scripts available in the [`scripts`](https://github.com/FLMNH-MGCL/digitization/tree/main/scripts) directory. They all have unique CLI structures, so be sure to run whichever is needed with the `--help` flag to get started.
+There is a selection of scripts available in the [`scripts`](https://github.com/FLMNH-MGCL/digitization/tree/main/scripts) directory. They all have unique CLI structures, so be sure to run whichever is needed with the `--help` flag to get started. The table below provides a brief overview of each script:
+
+| Script | Description |
+| -------------------- | ---------------------------------------------------------------------------------------------------------- |
+| `dynaiello.py` | A version of the Aiello script with fewer column restrictions. Copy and rename entries based on a CSV file. |
+| `gene_copy.py` | Removes divergent consensus sequences (IBA pipeline) from .fas/.fasta files |
+| `gene_parser.py` | Parses .fa/.fasta files to extract accession numbers and gene names |
+| `mgcl_tracker.py` | Tracks the used catalog numbers in the filesystem against a range/csv of numbers |
+| `protein_combine.py` | Combines separated protein/nucleotide files into one combined file |
+| `relocate.py` | _(deprecated)_ Relocates 'troublesome' images based on the log output of other scripts |
+| `suspect_numbers.py` | Aggregates 'suspect' catalog numbers in a filesystem |
+| `unique_values.py` | Outputs all the unique values in the columns of a CSV or XLSX file |
+| `wls.py` | _(deprecated)_ Generates CSV of specimen at current working directory |
+| `wrangler.py` | Assigns BOMBID numbers to collection specimen |

 ## Digitization Program

3 changes: 0 additions & 3 deletions scripts/gene_parser.py
@@ -209,9 +209,6 @@ def collect_gene_data(self):
             if not header_line:
                 break

-            # gene_line = f.readline()
-
-            # gene = tuple((header_line.strip(), gene_line.strip()))
             gene = GeneParser.parse_gene_header(header_line.strip())
             self.genes.append(gene)

10 changes: 2 additions & 8 deletions scripts/unique_values.py
@@ -106,7 +106,7 @@ def verify_grouping(self):
         for col in self.group_by:
             try:
                 self.raw_data[col]
-            except:
+            except Exception:
                 error_message(
                     "{} does not exist in the provided input file".format(col)
                 )
@@ -115,7 +115,7 @@ def verify_grouping(self):
         for col in self.group_for:
             try:
                 self.raw_data[col]
-            except:
+            except Exception:
                 error_message(
                     "{} does not exist in the provided input file".format(col)
                 )
@@ -134,7 +134,6 @@ def write_out(self):

     def write_groups(self):
         logfile = Uniquer.generate_logname("UNIQUE_VALUES", self.destination)
-        # merged = None
         for heading, frame in self.unique_frames:
             with open(logfile, "a") as f:
                 f.write(heading + "\n")
@@ -149,11 +148,6 @@ def write_groups(self):
         # print(merged)

     def run(self):
-
-        # print("\nParsing CSV...\n")
-        # self.raw_csv_data = pd.read_csv(
-        #     self.csv_path, header=0, encoding="ISO-8859-1", low_memory=False)
-
         if self.group_by:
             if self.group_for is not None:
                 for col in self.group_for:
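
A note on the `except:` to `except Exception:` change in `verify_grouping`: a bare `except:` catches every `BaseException`, including `KeyboardInterrupt` and `SystemExit`, whereas `except Exception:` lets those propagate. The sketch below illustrates the difference under stated assumptions. `column_exists`, `InterruptingTable`, and the sample `raw_data` dict are hypothetical stand-ins invented for illustration, not code from `unique_values.py`.

```python
def column_exists(table, col):
    """Return True if `col` can be looked up in `table` (hypothetical helper)."""
    try:
        table[col]
    except Exception:
        # Catches KeyError and other ordinary errors, but lets
        # KeyboardInterrupt and SystemExit propagate to the caller.
        return False
    return True


class InterruptingTable:
    """Illustrative stand-in for a lookup interrupted by Ctrl-C."""

    def __getitem__(self, key):
        raise KeyboardInterrupt


# Made-up sample data standing in for the parsed input file (assumption).
raw_data = {"genus": ["Papilio"], "species": ["glaucus"]}

print(column_exists(raw_data, "genus"))   # column is present
print(column_exists(raw_data, "family"))  # column is missing

# With `except Exception:`, the interrupt is not swallowed by the helper.
try:
    column_exists(InterruptingTable(), "genus")
    interrupted = False
except KeyboardInterrupt:
    interrupted = True
print(interrupted)
```

If a missing column is the only expected failure, catching `KeyError` specifically would be narrower still. Whether `self.raw_data[col]` raises `KeyError` depends on the type of `raw_data`, which this diff does not show.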
3 changes: 1 addition & 2 deletions scripts/wls.py
@@ -218,6 +218,7 @@ def extract_filter(argument_list, arg_len):
 def main():
     argument_list = sys.argv
     arg_len = len(argument_list)
+
     filter = extract_filter(argument_list, arg_len)

     if filter[0] == "BAD":
@@ -274,8 +275,6 @@ def main():
         # unknown option
         print("Unknown usage.")

-    # input("Press enter to exit...")
-

 if __name__ == "__main__":
     main()
